
13. Metrics - Measuring Success

Metrics Framework

graph TD
    subgraph "Business Metrics"
        A[Conversion Rate]
        B[Add-to-Cart Rate]
        C[Revenue per Chat Session]
        D[Support Cost Reduction]
    end

    subgraph "User Experience Metrics"
        E[Customer Satisfaction]
        F[Resolution Rate]
        G[Escalation Rate]
        H[Session Duration]
    end

    subgraph "AI Quality Metrics"
        I[Intent Accuracy]
        J[Hallucination Rate]
        K[Recommendation CTR]
        L[RAG Relevance Score]
    end

    subgraph "Operational Metrics"
        M[P99 Latency]
        N[Error Rate]
        O[Availability]
        P[LLM Token Cost]
    end

Detailed Metric Definitions

Business Metrics

| Metric | Definition | Target | How Measured |
|---|---|---|---|
| Chatbot Engagement Rate | Percent of store visitors who open the chatbot | > 5% | Frontend event tracking |
| Conversion Rate (Chat Users) | Percent of chat sessions that result in a purchase | > 8% | Join chat sessions with order events within 24 hours |
| Add-to-Cart Rate | Percent of recommended products that users add to the cart | > 15% | Click tracking on product cards |
| Revenue per Chat Session | Average revenue generated per chat session | > $5.00 | Attribution within 24 hours of chat |
| Support Cost Deflection | Percent of support queries resolved without a human agent | > 70% | Complement of the escalation rate (100% minus escalation rate) |
| Average Order Value Lift | Difference in AOV between chat users and non-chat users | > +$3.00 | A/B test with control group |
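
The 24-hour attribution join behind Conversion Rate and Revenue per Chat Session could look roughly like the sketch below. It is an illustration under stated assumptions, not the production pipeline: the `ChatSession` and `Order` record shapes, their field names, the choice to start the window at session start, and the rule that an order is credited to the customer's most recent qualifying session are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

ATTRIBUTION_WINDOW = timedelta(hours=24)  # window taken from the table above


@dataclass(frozen=True)
class ChatSession:  # hypothetical record shape
    session_id: str
    customer_id: str
    started_at: datetime


@dataclass(frozen=True)
class Order:  # hypothetical record shape
    customer_id: str
    placed_at: datetime
    revenue: float


def attribute_orders(sessions, orders):
    """Return {session_id: [revenue, ...]} for orders inside the window.

    Assumption: each order is credited to at most one session, namely the
    latest session by the same customer that started before the order.
    """
    attributed = {s.session_id: [] for s in sessions}
    by_customer = {}
    for s in sessions:
        by_customer.setdefault(s.customer_id, []).append(s)
    for o in orders:
        candidates = [
            s for s in by_customer.get(o.customer_id, [])
            if timedelta(0) <= o.placed_at - s.started_at <= ATTRIBUTION_WINDOW
        ]
        if candidates:
            latest = max(candidates, key=lambda s: s.started_at)
            attributed[latest.session_id].append(o.revenue)
    return attributed


def business_metrics(sessions, orders):
    """Compute conversion rate and revenue per chat session."""
    attributed = attribute_orders(sessions, orders)
    total = len(attributed)
    if total == 0:
        raise ValueError("no chat sessions in the reporting period")
    converted = sum(1 for revenues in attributed.values() if revenues)
    revenue = sum(sum(revenues) for revenues in attributed.values())
    return {
        "conversion_rate": converted / total,
        "revenue_per_session": revenue / total,
    }
```

Crediting each order to a single session avoids double-counting revenue when a customer opens several chats in one day. Whether that is the right attribution rule is a product decision the table does not settle.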

User Experience Metrics

| Metric | Definition | Target | How Measured |
|---|---|---|---|
| CSAT | Average user satisfaction rating on a 5-point scale | > 4.2 / 5.0 | Post-session survey shown to a small sample |
| Thumbs Up Rate | Percent of responses receiving thumbs up | > 60% | In-chat feedback buttons |
| Resolution Rate | Percent of sessions where the user's issue was fully resolved | > 75% | No escalation plus positive feedback or purchase |
| Escalation Rate | Percent of sessions escalated to a human agent | < 15% | Escalation event count / total sessions |
| Repeat Usage Rate | Percent of users who use the chatbot again within 30 days | > 30% | Session tracking by customer ID |
| Abandonment Rate | Percent of sessions where the user leaves mid-conversation | < 25% | Session ends without resolution or purchase |
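
The resolution, escalation, and abandonment rules in the table above can be expressed as a small classifier over per-session flags. The sketch below is one possible reading, not a confirmed specification. The `SessionOutcome` record is hypothetical. "No escalation plus positive feedback or purchase" is interpreted as "not escalated AND (positive feedback OR purchase)". Escalated sessions are assumed not to count as abandoned, because they were handed to a human agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionOutcome:  # hypothetical per-session summary record
    escalated: bool
    positive_feedback: bool
    purchased: bool


def is_resolved(session):
    # Assumed reading: no escalation AND (positive feedback OR purchase).
    return not session.escalated and (
        session.positive_feedback or session.purchased
    )


def ux_rates(sessions):
    """Return resolution, escalation, and abandonment rates as fractions."""
    total = len(sessions)
    if total == 0:
        raise ValueError("no sessions in the reporting period")
    escalated = sum(1 for s in sessions if s.escalated)
    resolved = sum(1 for s in sessions if is_resolved(s))
    # Assumption: every session is exactly one of escalated, resolved,
    # or abandoned, so abandonment is whatever remains.
    abandoned = total - escalated - resolved
    return {
        "resolution_rate": resolved / total,
        "escalation_rate": escalated / total,
        "abandonment_rate": abandoned / total,
    }
```

Under this reading the three rates always sum to 1, which makes it easy to sanity-check a dashboard built on them.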

AI Quality Metrics

| Metric | Definition | Target | How Measured |
|---|---|---|---|
| Intent Classification Accuracy | Percent of messages correctly classified | > 90% | Weekly evaluation against a human-labeled test set |
| Hallucination Rate | Percent of responses containing factually incorrect information | < 2% | Automated checks plus weekly human audit |
| RAG Relevance (Recall@3) | Percent of queries where the correct document is in the top 3 retrieved chunks | > 80% | Evaluation against curated query-document pairs |
| Recommendation Click-Through Rate | Percent of recommended products that users click | > 25% | Click tracking |
| Incorrect Response Rate | Percent of responses flagged as wrong by users | < 5% | Feedback analysis |
| Out-of-Scope Rate | Percent of messages the chatbot cannot handle | < 10% | Unknown intent or fallback triggered |
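
Recall@3 as defined above is the share of evaluation queries whose correct document appears among the first three retrieved results. A minimal sketch follows. The `retrieve` callable and the `(query, correct_doc_id)` pair format are assumptions standing in for the real retriever and the curated evaluation set. The sketch also assumes each retrieved chunk is reported by the ID of the document it came from.

```python
def recall_at_k(eval_pairs, retrieve, k=3):
    """Fraction of queries whose correct document is in the top-k results.

    eval_pairs: iterable of (query, correct_doc_id) tuples.
    retrieve:   callable mapping a query to a ranked list of document IDs
                (hypothetical interface; best match first).
    """
    pairs = list(eval_pairs)
    if not pairs:
        raise ValueError("evaluation set is empty")
    hits = sum(
        1 for query, correct_doc_id in pairs
        if correct_doc_id in retrieve(query)[:k]
    )
    return hits / len(pairs)
```

For example, passing `retrieve=lambda q: index[q]` with a precomputed dictionary of rankings lets the same function score an offline snapshot of retrieval results.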

Operational Metrics

| Metric | Definition | Target | How Measured |
|---|---|---|---|
| P50 Latency (first token) | Median time to first token | < 800ms | CloudWatch metrics |
| P99 Latency (first token) | 99th percentile time to first token | < 1.5s | CloudWatch metrics |
| P99 Latency (full response) | 99th percentile time to complete response | < 3s | CloudWatch metrics |
| Error Rate | Percent of requests resulting in error | < 0.5% | Error count / total requests |
| Availability | Uptime percentage | >= 99.9% | Health check monitoring |
| LLM Token Cost per Session | Average cost of LLM calls per chat session | < $0.03 | Bedrock usage metrics |
| Guardrail Block Rate | Percent of LLM responses blocked by guardrails | < 5% | Guardrail event logging |
| Circuit Breaker Trips | Number of times circuit breakers open per day | < 5 | Circuit breaker event logging |
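
CloudWatch can report percentile statistics directly, so the sketch below is only meant to show what the P50 and P99 first-token targets mean when applied to raw samples. It uses the nearest-rank percentile definition, which is one of several common conventions and may differ slightly from what a monitoring backend computes. The function names and the millisecond sample format are assumptions.

```python
import math

# First-token latency targets from the table above, in milliseconds.
FIRST_TOKEN_TARGETS_MS = {50: 800, 99: 1500}


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def check_first_token_latency(samples_ms):
    """Return {percentile: (observed_ms, meets_target)} for each target."""
    results = {}
    for pct, target_ms in FIRST_TOKEN_TARGETS_MS.items():
        observed = percentile(samples_ms, pct)
        results[pct] = (observed, observed < target_ms)
    return results
```

A tail-heavy workload can pass the P50 target while failing P99, which is why the table tracks both.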

Dashboard Layout

MangaAssist - Real-Time Dashboard
---------------------------------
Active Sessions          12,453
Messages / Second         2,341
P99 Latency               1.2s
Error Rate                0.3%

Intent Distribution (Last Hour)
recommendation           35%
product_question         22%
faq                      18%
order_tracking           12%
chitchat                  8%
escalation                5%

Conversion Funnel (Today)
Visitors -> Chat Opens -> Products Shown -> Add to Cart -> Purchase
1,000,000 -> 52,000 -> 31,000 -> 8,400 -> 4,200

Quality (Last 24h)
Thumbs Up: 65%
Thumbs Down: 8%
No Feedback: 27%
Hallucinations Detected: 12
Escalations: 2,340
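
The funnel counts in the mock dashboard above can be converted into step-to-step rates, which are usually easier to compare against the targets than raw counts. The sketch below uses the illustrative numbers from the mock. The function names and the list-of-tuples layout are assumptions.

```python
# Illustrative counts copied from the mock dashboard above.
FUNNEL = [
    ("Visitors", 1_000_000),
    ("Chat Opens", 52_000),
    ("Products Shown", 31_000),
    ("Add to Cart", 8_400),
    ("Purchase", 4_200),
]


def step_rates(funnel):
    """Return [(label, rate)] for each consecutive pair of funnel stages."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(funnel, funnel[1:]):
        rate = count / prev_count if prev_count else 0.0
        rates.append((f"{prev_name} -> {name}", rate))
    return rates


def stage_to_stage(funnel, start, end):
    """Rate between two named stages, e.g. Chat Opens to Purchase."""
    counts = dict(funnel)
    return counts[end] / counts[start] if counts[start] else 0.0


if __name__ == "__main__":
    for label, rate in step_rates(FUNNEL):
        print(f"{label}: {rate:.1%}")
```

With the mock numbers, Visitors to Chat Opens is 5.2% and Chat Opens to Purchase is about 8.1%, which would sit just above the engagement (> 5%) and chat conversion (> 8%) targets.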

Weekly Business Review Metrics

Amazon runs Weekly Business Reviews (WBRs). For each review, MangaAssist should report:

  1. Conversion rate for chat users vs. non-chat users.
  2. Revenue attributed to chat-assisted purchases.
  3. Top user intents to inform catalog and merchandising decisions.
  4. Escalation root causes to identify automation opportunities.
  5. Quality trend for hallucination rate and CSAT over time.
  6. Cost per session for LLM and compute usage.