# Monitoring GenAI Systems
Telemetry shape for LLM apps (input/output/cost/latency/tool calls per turn), what to alert on, and how the headline product metric drives everything below it.
## Interview talking points
- What's the headline metric? Acceptance rate (clicks/carts/buys ÷ shown). Everything else is an input.
- What do you alert on? Acceptance-rate drop, p95 latency cliff, judge-vs-recall divergence, cost-per-request spike.
- Trace store schema. turn_id, session_id, input_mode, retrieved_chunks, tool_calls, model, tokens (in/out/cache R/W), cost, latency, accepted.
- OTel for GenAI. The OpenTelemetry GenAI semantic conventions are still marked experimental — call out the gap between what the spec defines and what instrumentation libraries actually emit stably.
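The per-turn trace record above can be sketched as a dataclass, with the headline metric computed as an aggregate over it. Field names follow the schema bullet; the class name and helper are illustrative, not from any particular library.

```python
from dataclasses import dataclass, field

@dataclass
class TurnTrace:
    """One record per conversational turn, mirroring the trace store schema."""
    turn_id: str
    session_id: str
    input_mode: str                # e.g. "text", "voice", "image"
    retrieved_chunks: list = field(default_factory=list)
    tool_calls: list = field(default_factory=list)
    model: str = ""
    tokens_in: int = 0
    tokens_out: int = 0
    tokens_cache_read: int = 0     # prompt-cache reads
    tokens_cache_write: int = 0    # prompt-cache writes
    cost_usd: float = 0.0
    latency_ms: float = 0.0
    accepted: bool = False         # did the user click/cart/buy what was shown?

def acceptance_rate(traces: list) -> float:
    """Headline metric: accepted turns ÷ shown turns."""
    return sum(t.accepted for t in traces) / len(traces) if traces else 0.0
```

Keeping `accepted` on the same record as tokens and latency is what lets every downstream metric (cost per accepted turn, latency vs. acceptance) be a simple group-by instead of a cross-store join.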
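The four alert conditions in the talking points can be expressed as a single windowed check against a trailing baseline. This is a minimal sketch: the threshold multipliers (20% acceptance drop, 1.5× latency/cost, 0.2 judge-vs-recall gap) and the dict keys are assumptions chosen for illustration, not recommended values.

```python
import math

def p95(values: list) -> float:
    """Nearest-rank 95th percentile of a latency sample."""
    s = sorted(values)
    return s[max(0, math.ceil(0.95 * len(s)) - 1)]

def fired_alerts(window: dict, baseline: dict) -> list:
    """Compare the current window's aggregates to a trailing baseline.

    Both dicts hold pre-aggregated metrics; thresholds are illustrative.
    """
    fired = []
    if window["acceptance_rate"] < 0.8 * baseline["acceptance_rate"]:
        fired.append("acceptance-rate drop")
    if window["p95_latency_ms"] > 1.5 * baseline["p95_latency_ms"]:
        fired.append("p95 latency cliff")
    if window["cost_per_request"] > 1.5 * baseline["cost_per_request"]:
        fired.append("cost-per-request spike")
    # Offline judge score and retrieval recall should move together;
    # a large gap suggests one of the two measurements has drifted.
    if abs(window["judge_score"] - window["recall_at_k"]) > 0.2:
        fired.append("judge-vs-recall divergence")
    return fired
```

Alerting on ratios against a baseline, rather than absolute values, keeps the rules stable as traffic mix and model pricing change.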
## Files in this folder
| File | Title |
|---|---|
| README.md | Monitoring GenAI Systems — AWS AIP-C01 Task 4.3 |
Back to the home page.