15. Trade-offs and Challenges
Key Trade-offs
1. Rule-Based vs. LLM-Based Responses
```mermaid
graph LR
    subgraph "Rule-Based"
        A[Fast<br>< 10ms]
        B[Predictable<br>No hallucination]
        C[Rigid<br>Cannot handle novel queries]
        D[Cheap<br>No LLM cost]
    end
    subgraph "LLM-Based"
        E[Slower<br>1-3 seconds]
        F[Flexible<br>Handles more cases]
        G[Risk<br>Can hallucinate]
        H[Expensive<br>Per-token cost]
    end
```
Decision: Hybrid. Use rules for greetings, simple lookups, and obvious order status questions. Use the LLM for recommendations, complex Q&A, and multi-turn conversations.
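The hybrid decision above can be sketched as a thin router: try the cheap rule table first, and fall through to the LLM for anything unmatched. The patterns and intent names here are illustrative assumptions, not a fixed taxonomy.

```python
import re

# Hypothetical rule table: pattern -> canned intent.
# Anything unmatched falls through to the LLM path.
RULES = [
    (re.compile(r"^(hi|hello|hey)\b", re.I), "greeting"),
    (re.compile(r"\bwhere('s| is) my order\b", re.I), "order_status"),
    (re.compile(r"\btrack(ing)?\b", re.I), "order_status"),
]

def route(message: str) -> str:
    """Return a handler name: a rule intent, or 'llm' for everything else."""
    for pattern, intent in RULES:
        if pattern.search(message):
            return intent
    return "llm"
```

The rule path answers in microseconds at zero token cost; only the residual traffic pays LLM latency and price.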
2. RAG vs. Direct API-Based Answers
| Approach | Pros | Cons | Best For |
|---|---|---|---|
| RAG | Broad knowledge, flexible | Retrieval can fail, slower | FAQ, policies, editorial content |
| Direct API | Precise, fast, no hallucination | Rigid, needs per-intent coding | Order status, prices, inventory |
Decision: Use both. RAG for knowledge-heavy questions, direct API for structured data lookups.
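One minimal way to encode the "use both" decision is a static intent-to-source map; the intent names are assumptions carried over from the routing sketch, and unrecognized intents default to RAG so the LLM stays grounded in retrieved text.

```python
# Structured lookups hit APIs directly; knowledge questions go through RAG.
DIRECT_API_INTENTS = {"order_status", "price_check", "inventory"}
RAG_INTENTS = {"faq", "policy", "editorial"}

def answer_source(intent: str) -> str:
    """Pick the data path for a classified intent."""
    if intent in DIRECT_API_INTENTS:
        return "direct_api"
    # Default to RAG so unknown questions are still grounded in retrieval.
    return "rag"
```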
3. Latency vs. Quality
```mermaid
graph TD
    A[User asks question] --> B{Simple or Complex?}
    B -->|Simple| C[Fast Path<br>Rule + Template<br>< 200ms]
    B -->|Complex| D{How complex?}
    D -->|Medium| E[RAG + LLM<br>1-2 seconds]
    D -->|High| F[Multi-step reasoning<br>2-3 seconds]
```
Decision: Target under 1 second to first token. Use a smaller model for simple tasks, reserve the best model for truly complex requests, and never let a single request exceed 5 seconds.
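A small tier table makes the budgets above enforceable in one place. The model names and millisecond budgets are placeholder assumptions; the invariant worth keeping is that every tier sits under the 5-second hard cap.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tier:
    model: str       # placeholder model label, not a real model ID
    budget_ms: int   # per-tier latency budget

TIERS = {
    "simple": Tier("rules+template", 200),
    "medium": Tier("small-llm", 2000),
    "high": Tier("best-llm", 3000),
}
HARD_CAP_MS = 5000

def pick_tier(complexity: str) -> Tier:
    """Map a complexity label to a tier; unknown labels take the middle path."""
    tier = TIERS.get(complexity, TIERS["medium"])
    assert tier.budget_ms <= HARD_CAP_MS, "every tier must respect the hard cap"
    return tier
```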
4. Personalization vs. Privacy
Decision: Prefer privacy-first personalization.
- Authenticated users can use purchase history and browsing data.
- Guest users use current session data only.
- Never display full addresses, payment info, or sensitive data.
- Keep all personalization data encrypted and access-controlled.
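The privacy rules above reduce to an allowlist: only explicitly safe fields ever reach the model, and guests get session data alone. The field names are illustrative assumptions.

```python
# Hypothetical allowlist of profile fields safe to pass to the model.
SAFE_FIELDS = {"first_name", "genre_preferences", "recent_series"}

def personalization_context(user: dict, authenticated: bool) -> dict:
    """Build the context sent to the model, dropping anything sensitive."""
    if not authenticated:
        # Guests: current session data only, no stored profile.
        return {"session": user.get("session", {})}
    # Authenticated users: allowlisted fields only -- addresses and
    # payment info can never leak because they are never selected.
    return {k: v for k, v in user.items() if k in SAFE_FIELDS}
```

Allowlisting beats blocklisting here: a new sensitive field added to the profile is excluded by default rather than leaked by default.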
5. Cost vs. Accuracy
Decision: Optimize for the sweet spot.
- Use Sonnet-class generation for quality.
- Retrieve the top 3 RAG chunks with reranking.
- Keep 10 turns of memory.
- Cache product data but never cache prices.
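These knobs can live in one settings block so cost tuning touches a single file. The TTL helper sketches one way to enforce "cache product data but never cache prices"; field names and the one-hour TTL are assumptions.

```python
# Cost/accuracy settings mirroring the bullets above.
SETTINGS = {
    "rag_top_k": 3,      # retrieve top 3 chunks, then rerank
    "rerank": True,
    "memory_turns": 10,  # conversation turns kept in context
}

# Hypothetical set of product fields that are safe to cache.
CACHEABLE_FIELDS = {"title", "description", "cover_url"}

def cache_ttl_seconds(field: str) -> int:
    """TTL for a product field; 0 means always fetch live."""
    if field == "price":
        return 0  # prices are never cached
    return 3600 if field in CACHEABLE_FIELDS else 0
```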
6. Automation vs. Human Support
```mermaid
pie title "Target Resolution Mix (V2)"
    "Fully Automated" : 70
    "Automated + Confirmation" : 15
    "Escalated to Human" : 15
```
Decision:
- Fully automated: discovery, recommendations, FAQ, product Q&A, order tracking.
- Automated with confirmation: return initiation.
- Always human: billing disputes, fraud reports, legal issues, deeply frustrated users.
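The escalation policy can be sketched as a pure function over the intent plus a frustration signal. The `frustration_score` (a 0-1 sentiment estimate) and its 0.8 threshold are assumptions for illustration.

```python
ALWAYS_HUMAN = {"billing_dispute", "fraud_report", "legal"}
NEEDS_CONFIRMATION = {"return_initiation"}

def resolution_path(intent: str, frustration_score: float) -> str:
    """Map an intent to a resolution path per the target mix above.

    frustration_score: hypothetical 0-1 sentiment signal; high values
    force a human handoff regardless of intent.
    """
    if intent in ALWAYS_HUMAN or frustration_score > 0.8:
        return "human"
    if intent in NEEDS_CONFIRMATION:
        return "automated_with_confirmation"
    return "automated"
```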
Key Challenges
Challenge 1: Manga Domain Knowledge
Problem: General-purpose models may not know about specific editions, translations, or niche titles available on Amazon.
Solution: Use RAG indexed with Amazon's actual product descriptions and editorial content.
Challenge 2: Multi-Format Complexity
Problem: One manga series can have many formats, editions, and bundled products.
Solution: Build a series resolver that groups ASINs by series and presents clear comparisons.
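A minimal series resolver is a group-by over catalog records, sorted so formats compare cleanly. The `series`, `format`, and `volume` field names are assumptions about the catalog schema.

```python
from collections import defaultdict

def resolve_series(products: list[dict]) -> dict[str, list[dict]]:
    """Group catalog items by series so one answer can compare
    formats/editions side by side. Field names are assumed."""
    grouped = defaultdict(list)
    for p in products:
        grouped[p["series"]].append(p)
    # Stable ordering: format first, then volume number.
    for items in grouped.values():
        items.sort(key=lambda p: (p["format"], p.get("volume", 0)))
    return dict(grouped)
```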
Challenge 3: Hallucinated Recommendations
Problem: The LLM might recommend manga that does not exist on Amazon.
Solution: Only feed the LLM real ASINs from the recommendation engine and validate product references after generation.
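The post-generation validation step can be a simple scan: find ASIN-shaped tokens in the reply and flag any not supplied by the recommendation engine. The regex approximates the two common ASIN shapes (B0-prefixed and ISBN-10) and is an assumption, not Amazon's official format spec.

```python
import re

# Rough ASIN shape: "B0" + 8 alphanumerics, or a 10-digit ISBN-10.
ASIN_RE = re.compile(r"\b(B0[A-Z0-9]{8}|\d{9}[\dX])\b")

def validate_recommendations(text: str, allowed_asins: set[str]) -> list[str]:
    """Return ASIN-like tokens in the reply that were NOT supplied by the
    recommendation engine -- any hit means the model invented a product."""
    return [a for a in ASIN_RE.findall(text) if a not in allowed_asins]
```

A non-empty return triggers regeneration or removal of the offending product mention before the reply ships.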
Challenge 4: Cold Start for New Users
Problem: New users with no browsing or purchase history get generic recommendations.
Solution: Use popularity-based recommendations, ask one clarifying question, and fall back to genre-based suggestions.
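The cold-start fallback chain is short enough to write out directly: use the genre from the clarifying question when available, otherwise popularity. Field and parameter names are illustrative.

```python
def cold_start_recs(profile: dict, popular: list[str],
                    by_genre: dict[str, list[str]]) -> list[str]:
    """Fallback chain for users with no history:
    stated genre (from one clarifying question) -> popularity."""
    genre = profile.get("stated_genre")
    if genre and genre in by_genre:
        return by_genre[genre]
    return popular
```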
Challenge 5: Real-Time Data Consistency
Problem: Price or availability may change between chat and add-to-cart.
Solution: Always fetch prices in real time, warn that prices may change, and re-check critical data at action time.
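The "re-check at action time" rule amounts to one guard before the cart write: fetch the live price and refuse to commit silently if it drifted from what the chat quoted. `fetch_price` is an injected placeholder for the real pricing call.

```python
def add_to_cart(asin: str, quoted_price: float, fetch_price) -> dict:
    """Re-check the live price before committing; surface any drift so
    the UI can warn the user instead of silently charging more."""
    live = fetch_price(asin)  # hypothetical live pricing lookup
    if live != quoted_price:
        return {"status": "price_changed", "quoted": quoted_price, "live": live}
    return {"status": "added", "price": live}
```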
Challenge 6: Prompt Injection
Problem: Malicious users may try to manipulate the chatbot with crafted messages.
Solution: Use input scanning, prompt isolation, explicit system rules, and output guardrails.
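The input-scanning layer can start as a naive pattern screen; these patterns are illustrative only, and a real deployment would layer this with prompt isolation and output guardrails as listed above, since regex alone is easy to evade.

```python
import re

# Illustrative injection heuristics -- a first filter, not a defense.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"system prompt", re.I),
]

def scan_input(message: str) -> bool:
    """Return True if the message looks like an injection attempt."""
    return any(p.search(message) for p in INJECTION_PATTERNS)
```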
Challenge 7: Measuring True Impact
Problem: It can be hard to know whether the chatbot caused the purchase.
Solution: Use an A/B test, compare conversion and AOV between groups, and keep a small holdout group after launch.
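Holdout assignment should be deterministic so a returning user always lands in the same group; hashing the user ID gives that for free. The 5% default holdout size is an assumption.

```python
import hashlib

def ab_bucket(user_id: str, holdout_pct: int = 5) -> str:
    """Deterministic group assignment: a stable hash of the user ID
    decides chatbot vs. holdout, so repeat visits stay consistent."""
    h = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "holdout" if h < holdout_pct else "chatbot"
```

Conversion and AOV are then compared between the two buckets, with the holdout retained after launch to keep measuring lift.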