# Monitoring GenAI Systems
Telemetry shape for LLM apps (input/output/cost/latency/tool calls per turn), what to alert on, and how the headline product metric drives everything below it.
## Interview talking points
- What's the headline metric? Acceptance rate (clicks/carts/buys ÷ shown). Everything else is an input.
- What do you alert on? Acceptance-rate drop, p95 latency cliff, judge-vs-recall divergence, cost-per-request spike.
- Trace store schema. turn_id, session_id, input_mode, retrieved_chunks, tool_calls, model, tokens (in/out/cache R/W), cost, latency, accepted.
- OTel for GenAI. The OpenTelemetry GenAI semantic conventions are still marked experimental — call out the gap between what the spec defines and what instrumentation libraries actually emit stably.
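The per-turn trace record above can be sketched as a dataclass, with the headline metric computed as an aggregate over it. Field names follow the schema bullet; the class name and helper are illustrative, not from any particular library.

```python
from dataclasses import dataclass, field

@dataclass
class TurnTrace:
    """One record per conversational turn, mirroring the trace store schema."""
    turn_id: str
    session_id: str
    input_mode: str                # e.g. "text", "voice", "image"
    retrieved_chunks: list = field(default_factory=list)
    tool_calls: list = field(default_factory=list)
    model: str = ""
    tokens_in: int = 0
    tokens_out: int = 0
    tokens_cache_read: int = 0     # prompt-cache reads
    tokens_cache_write: int = 0    # prompt-cache writes
    cost_usd: float = 0.0
    latency_ms: float = 0.0
    accepted: bool = False         # did the user click/cart/buy what was shown?

def acceptance_rate(traces: list) -> float:
    """Headline metric: accepted turns ÷ shown turns."""
    return sum(t.accepted for t in traces) / len(traces) if traces else 0.0
```

Keeping `accepted` on the same record as tokens and latency is what lets every downstream metric (cost per accepted turn, latency vs. acceptance) be a simple group-by instead of a cross-store join.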
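The four alert conditions in the talking points can be expressed as a single windowed check against a trailing baseline. This is a minimal sketch: the threshold multipliers (20% acceptance drop, 1.5× latency/cost, 0.2 judge-vs-recall gap) and the dict keys are assumptions chosen for illustration, not recommended values.

```python
import math

def p95(values: list) -> float:
    """Nearest-rank 95th percentile of a latency sample."""
    s = sorted(values)
    return s[max(0, math.ceil(0.95 * len(s)) - 1)]

def fired_alerts(window: dict, baseline: dict) -> list:
    """Compare the current window's aggregates to a trailing baseline.

    Both dicts hold pre-aggregated metrics; thresholds are illustrative.
    """
    fired = []
    if window["acceptance_rate"] < 0.8 * baseline["acceptance_rate"]:
        fired.append("acceptance-rate drop")
    if window["p95_latency_ms"] > 1.5 * baseline["p95_latency_ms"]:
        fired.append("p95 latency cliff")
    if window["cost_per_request"] > 1.5 * baseline["cost_per_request"]:
        fired.append("cost-per-request spike")
    # Offline judge score and retrieval recall should move together;
    # a large gap suggests one of the two measurements has drifted.
    if abs(window["judge_score"] - window["recall_at_k"]) > 0.2:
        fired.append("judge-vs-recall divergence")
    return fired
```

Alerting on ratios against a baseline, rather than absolute values, keeps the rules stable as traffic mix and model pricing change.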
## Files in this folder
| File | Title |
|---|---|
| README.md | Monitoring GenAI Systems — AWS AIP-C01 Task 4.3 |
Back to the home page.