13. Metrics - Measuring Success
Metrics Framework
graph TD
    subgraph "Business Metrics"
        A[Conversion Rate]
        B[Add-to-Cart Rate]
        C[Revenue per Chat Session]
        D[Support Cost Reduction]
    end
    subgraph "User Experience Metrics"
        E[Customer Satisfaction]
        F[Resolution Rate]
        G[Escalation Rate]
        H[Session Duration]
    end
    subgraph "AI Quality Metrics"
        I[Intent Accuracy]
        J[Hallucination Rate]
        K[Recommendation CTR]
        L[RAG Relevance Score]
    end
    subgraph "Operational Metrics"
        M[P99 Latency]
        N[Error Rate]
        O[Availability]
        P[LLM Token Cost]
    end
Detailed Metric Definitions
Business Metrics
| Metric | Definition | Target | How Measured |
|---|---|---|---|
| Chatbot Engagement Rate | Percent of store visitors who open the chatbot | > 5% | Frontend event tracking |
| Conversion Rate (Chat Users) | Percent of chat sessions that result in a purchase | > 8% | Join chat sessions with order events within 24 hours |
| Add-to-Cart Rate | Percent of product recommendations where the user clicks "Add to Cart" | > 15% | Click tracking on product cards |
| Revenue per Chat Session | Average revenue generated per chat session | > $5.00 | Order revenue attributed to a chat session within 24 hours of the chat |
| Support Cost Deflection | Percent of support queries resolved without a human agent | > 70% | Inverse of escalation rate (100% minus escalation rate) |
| Average Order Value Lift | Difference in AOV between chat users and non-chat users | > +$3.00 | A/B test with control group |

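
The 24-hour join behind Conversion Rate and Revenue per Chat Session can be sketched as follows. This is a minimal illustration, not the production pipeline: the record fields (`session_id`, `customer_id`, `started_at`, `placed_at`, `amount`) are hypothetical, and crediting each order to the most recent qualifying session is an assumed last-touch rule that the table does not specify.

```python
from datetime import datetime, timedelta

# "Within 24 hours" comes from the table; everything else here is assumed.
ATTRIBUTION_WINDOW = timedelta(hours=24)


def attribute_orders(sessions, orders, window=ATTRIBUTION_WINDOW):
    """Return {session_id: [order amounts]} for orders placed within `window`.

    sessions: dicts with hypothetical keys session_id, customer_id, started_at
    orders:   dicts with hypothetical keys customer_id, placed_at, amount
    Each order is credited to at most one session: the latest session by the
    same customer that started no more than `window` before the order.
    """
    attributed = {s["session_id"]: [] for s in sessions}
    for order in orders:
        candidates = [
            s for s in sessions
            if s["customer_id"] == order["customer_id"]
            and timedelta(0) <= order["placed_at"] - s["started_at"] <= window
        ]
        if candidates:
            latest = max(candidates, key=lambda s: s["started_at"])
            attributed[latest["session_id"]].append(order["amount"])
    return attributed


def business_metrics(sessions, orders):
    """Compute conversion rate and revenue per chat session."""
    attributed = attribute_orders(sessions, orders)
    total = len(attributed)
    if total == 0:
        return {"conversion_rate": 0.0, "revenue_per_session": 0.0}
    converted = sum(1 for amounts in attributed.values() if amounts)
    revenue = sum(sum(amounts) for amounts in attributed.values())
    return {
        "conversion_rate": converted / total,
        "revenue_per_session": revenue / total,
    }
```

Crediting each order to a single session avoids double-counting revenue when a customer opens several chats before buying. A first-touch or split-credit rule would be an equally valid choice and would change only the `latest = ...` line.
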
User Experience Metrics
| Metric | Definition | Target | How Measured |
|---|---|---|---|
| CSAT | Average user satisfaction rating on a 5-point scale | > 4.2 / 5.0 | Post-session survey shown to a small sample |
| Thumbs Up Rate | Percent of responses receiving thumbs up | > 60% | In-chat feedback buttons |
| Resolution Rate | Percent of sessions where the user's issue was fully resolved | > 75% | No escalation plus positive feedback or purchase |
| Escalation Rate | Percent of sessions escalated to a human agent | < 15% | Escalation event count / total sessions |
| Repeat Usage Rate | Percent of users who use the chatbot again within 30 days | > 30% | Session tracking by customer ID |
| Abandonment Rate | Percent of sessions where the user leaves mid-conversation | < 25% | Session ends without resolution or purchase |

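
The "How Measured" rules for Resolution, Escalation, and Abandonment Rate can be read as a per-session classification. The sketch below is one possible reading, not a confirmed implementation: the `Session` fields are hypothetical, and it assumes an escalated session counts only as escalated (never as abandoned), which the table leaves open.

```python
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical per-session flags derived from event logs.
    escalated: bool   # handed off to a human agent
    thumbs_up: bool   # at least one positive feedback event
    purchased: bool   # an order was attributed to this session


def classify(session: Session) -> str:
    """Assign a session to exactly one outcome bucket."""
    if session.escalated:
        return "escalated"
    # Resolution rule from the table: no escalation plus
    # positive feedback or a purchase.
    if session.thumbs_up or session.purchased:
        return "resolved"
    # Ended without escalation, resolution, or purchase.
    return "abandoned"


def ux_rates(sessions):
    """Return the share of sessions in each outcome bucket."""
    counts = {"escalated": 0, "resolved": 0, "abandoned": 0}
    for s in sessions:
        counts[classify(s)] += 1
    total = len(sessions)
    return {k: (v / total if total else 0.0) for k, v in counts.items()}
```

Because the buckets are mutually exclusive under this assumption, the three rates sum to 100%, which gives a quick sanity check on a dashboard.
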
AI Quality Metrics
| Metric | Definition | Target | How Measured |
|---|---|---|---|
| Intent Classification Accuracy | Percent of messages correctly classified | > 90% | Weekly evaluation against a human-labeled test set |
| Hallucination Rate | Percent of responses containing factually incorrect information | < 2% | Automated checks plus weekly human audit |
| RAG Relevance (Recall@3) | Percent of queries where the correct document is in the top 3 retrieved chunks | > 80% | Evaluation against curated query-document pairs |
| Recommendation Click-Through Rate | Percent of recommended products that users click | > 25% | Click tracking |
| Incorrect Response Rate | Percent of responses flagged as wrong by users | < 5% | Feedback analysis |
| Out-of-Scope Rate | Percent of messages the chatbot cannot handle | < 10% | Unknown intent or fallback triggered |

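
Recall@3 as defined in the table can be computed directly from the curated query-document pairs. The sketch below assumes a hypothetical `retrieve` callable that maps a query to a ranked list of document IDs. The real retriever interface is not described in this section.

```python
def recall_at_k(eval_pairs, retrieve, k=3):
    """Fraction of queries whose expected document is in the top-k results.

    eval_pairs: list of (query, expected_doc_id) tuples
    retrieve:   hypothetical callable, query -> ranked list of doc IDs
    """
    if not eval_pairs:
        return 0.0
    hits = sum(
        1 for query, expected in eval_pairs
        if expected in retrieve(query)[:k]
    )
    return hits / len(eval_pairs)


# Usage with a stubbed retriever (illustrative data only).
if __name__ == "__main__":
    fake_index = {
        "return policy": ["faq-12", "faq-03", "faq-40"],
        "shipping time": ["faq-07", "faq-12", "faq-19"],
    }
    pairs = [("return policy", "faq-12"), ("shipping time", "faq-19")]
    score = recall_at_k(pairs, lambda q: fake_index.get(q, []))
    print(f"Recall@3 = {score:.0%}")
```
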
Operational Metrics
| Metric | Definition | Target | How Measured |
|---|---|---|---|
| P50 Latency (first token) | Median time to first token | < 800ms | CloudWatch metrics |
| P99 Latency (first token) | 99th percentile time to first token | < 1.5s | CloudWatch metrics |
| P99 Latency (full response) | 99th percentile time to complete response | < 3s | CloudWatch metrics |
| Error Rate | Percent of requests resulting in error | < 0.5% | Error count / total requests |
| Availability | Uptime percentage | >= 99.9% | Health check monitoring |
| LLM Token Cost per Session | Average cost of LLM calls per chat session | < $0.03 | Bedrock usage metrics |
| Guardrail Block Rate | Percent of LLM responses blocked by guardrails | < 5% | Guardrail event logging |
| Circuit Breaker Trips | Number of times circuit breakers open per day | < 5 per day | Circuit breaker event logging |

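
For offline checks against the latency targets, percentiles can be computed from raw samples. The sketch below uses the nearest-rank definition, which is an assumption: CloudWatch computes its own percentile statistics and may differ slightly on small samples. The function names and the `FIRST_TOKEN_TARGETS_MS` mapping are illustrative.

```python
import math

# First-token latency targets from the table, in milliseconds.
FIRST_TOKEN_TARGETS_MS = {50: 800, 99: 1500}


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("percentile() requires at least one sample")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def latency_report(first_token_ms, targets=FIRST_TOKEN_TARGETS_MS):
    """Compare observed percentiles with their targets."""
    report = {}
    for p, limit_ms in targets.items():
        observed = percentile(first_token_ms, p)
        report[f"p{p}"] = {
            "observed_ms": observed,
            "target_ms": limit_ms,
            "within_target": observed < limit_ms,
        }
    return report
```

P99 is sensitive to a small number of slow requests: with 100 samples, two responses at 2,000 ms are enough to breach the 1.5 s target even if the other 98 finish in 100 ms.
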
Dashboard Layout
MangaAssist - Real-Time Dashboard
---------------------------------
Active Sessions: 12,453
Messages / Second: 2,341
P99 Latency: 1.2s
Error Rate: 0.3%

Intent Distribution (Last Hour)
recommendation: 35%
product_question: 22%
faq: 18%
order_tracking: 12%
chitchat: 8%
escalation: 5%

Conversion Funnel (Today)
Visitors -> Chat Opens -> Products Shown -> Add to Cart -> Purchase
1,000,000 -> 52,000 -> 31,000 -> 8,400 -> 4,200

Quality (Last 24h)
Thumbs Up: 65%
Thumbs Down: 8%
No Feedback: 27%
Hallucinations Detected: 12
Escalations: 2,340
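
The conversion funnel is easier to interpret as step-to-step rates. The sketch below derives them from the sample counts shown in the dashboard mock-up. The `step_rates` helper is illustrative and not part of any existing codebase.

```python
# Sample counts copied from the "Conversion Funnel (Today)" mock-up.
FUNNEL = [
    ("Visitors", 1_000_000),
    ("Chat Opens", 52_000),
    ("Products Shown", 31_000),
    ("Add to Cart", 8_400),
    ("Purchase", 4_200),
]


def step_rates(funnel):
    """Return (label, rate) for each consecutive pair of funnel stages."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(funnel, funnel[1:]):
        rate = count / prev_count if prev_count else 0.0
        rates.append((f"{prev_name} -> {name}", rate))
    return rates


if __name__ == "__main__":
    for label, rate in step_rates(FUNNEL):
        print(f"{label}: {rate:.1%}")
    # Purchases per chat session, comparable to the > 8% conversion target.
    print(f"Chat Opens -> Purchase: {FUNNEL[-1][1] / FUNNEL[1][1]:.1%}")
```

With these sample figures, 5.2% of visitors open the chat and about 8.1% of chat sessions end in a purchase, so both would sit just above the Chatbot Engagement Rate and Conversion Rate targets.
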
Weekly Business Review Metrics
Amazon runs Weekly Business Reviews. MangaAssist should report:
- Conversion rate for chat users vs. non-chat users.
- Revenue attributed to chat-assisted purchases.
- Top user intents to inform catalog and merchandising decisions.
- Escalation root causes to identify automation opportunities.
- Quality trends over time for hallucination rate and CSAT.
- Cost per session for LLM and compute usage.