Database Tradeoffs — MangaAssist Chatbot

This folder contains scenario-based deep dives into the database choices made in the MangaAssist architecture, and what concretely happens if you swap each one for a different technology.

Database Map

| Role | Current Choice | Where Used |
|---|---|---|
| Conversation Memory | DynamoDB | Stores each session's turns, summaries, and metadata |
| Vector Store (RAG) | OpenSearch Serverless | Embedding search for FAQ, policy, product knowledge |
| Response Cache | ElastiCache Redis | Product details, recommendations, promotions, reviews |
| Analytics Warehouse | Redshift (via Kinesis) | Event aggregation, latency trends, intent distributions |

Why These Four Trade-off Clusters Matter

Cluster 1 — Conversation Memory

Every message the user sends requires reading the last N turns. This is a pure key-value access pattern with a time-bounded write load. The wrong choice here creates either hot-partition failures at scale or overprovisioned infrastructure that costs 10× more than necessary.
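
As a rough illustration of that access pattern, the sketch below builds the arguments for a single DynamoDB Query that returns the newest N turns of one session. The table name and the key schema (partition key `session_id`, numeric sort key `turn_ts`) are assumptions made for this example, not the actual MangaAssist schema; the parameter names themselves are the standard ones for the low-level Query operation.

```python
from typing import Any


def last_n_turns_query(session_id: str, n: int,
                       table_name: str = "mangaassist-conversations") -> dict[str, Any]:
    """Build keyword arguments for a low-level DynamoDB Query call.

    Hypothetical schema (not taken from the real system):
    partition key `session_id` (string), sort key `turn_ts` (number).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return {
        "TableName": table_name,
        # All turns of one session live under a single partition key.
        "KeyConditionExpression": "session_id = :sid",
        "ExpressionAttributeValues": {":sid": {"S": session_id}},
        # Descending sort-key order, so the newest turns come back first.
        "ScanIndexForward": False,
        # Stop after N items instead of reading the whole session.
        "Limit": n,
    }
```

With boto3 installed and credentials configured, this would be passed as `boto3.client("dynamodb").query(**last_n_turns_query("sess-123", 10))`. The items arrive newest first, so reverse them before building a prompt in chronological order.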

Cluster 2 — Vector Store (RAG)

This is the core of what makes the chatbot answer questions accurately instead of hallucinating. The choice affects recall quality, index freshness latency, cost per query, and operational complexity. Getting this wrong either makes retrieval slow (user waits too long) or imprecise (LLM gets bad chunks and hallucinates).
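
One way to make "recall quality" concrete is to compare a vector store's results against an exact brute-force search over the same embeddings. The sketch below is a minimal pure-Python baseline under that assumption; the chunk ids and two-dimensional vectors are made up for illustration, and none of this reflects how OpenSearch Serverless ranks results internally.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, chunks, k):
    """Exact top-k chunk ids by cosine similarity; chunks maps id -> embedding."""
    ranked = sorted(chunks, key=lambda cid: cosine(query, chunks[cid]), reverse=True)
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result also returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Hypothetical toy data: three chunks with 2-d embeddings.
chunks = {
    "faq-returns": [1.0, 0.0],
    "policy-shipping": [0.0, 1.0],
    "product-vol1": [0.9, 0.1],
}
exact = top_k([1.0, 0.0], chunks, k=2)
```

Here `exact` is the ground truth for the query; feeding a candidate store's top-k ids into `recall_at_k` alongside it gives one number to compare across alternatives.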

Cluster 3 — Response Cache

The cache absorbs the difference between a "sub-100ms product detail fetch" and a "450ms catalog service fetch." The wrong choice either wastes Redis cluster spend on rarely hit keys or causes cache thrashing during a flash sale, which sends the missed reads to the product catalog service and can bring it down.
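
The latency gap can be turned into a quick back-of-the-envelope estimate. The sketch below assumes a hit costs 100 ms (the upper bound implied by "sub-100ms") and a miss costs 450 ms, and it ignores the extra cache lookup paid on a miss; the hit rates are illustrative, not measured values.

```python
def expected_latency_ms(hit_rate: float,
                        hit_ms: float = 100.0,
                        miss_ms: float = 450.0) -> float:
    """Average fetch latency for a given cache hit rate.

    hit_ms and miss_ms default to the two figures quoted in the text;
    treat both as rough assumptions rather than measurements.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms


# Compare a healthy cache with one that is thrashing.
for rate in (0.95, 0.80, 0.30):
    print(f"hit rate {rate:.0%}: ~{expected_latency_ms(rate):.0f} ms average")
```

Under these assumptions, an 80% hit rate averages about 170 ms per fetch, and every point of hit rate lost during thrashing adds roughly 3.5 ms while also shifting that share of traffic onto the catalog service.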

Cluster 4 — Analytics Warehouse

This is the slowest-moving choice: it affects how fast the team can detect a regression (LLM giving wrong manga recommendations), how much it costs to run retrospective queries, and whether the data science team can iterate on the golden dataset for offline testing.

How to Read Each Scenario File

Each file follows this structure per alternative:

What Changes: the concrete swap you're making
Best Case: when this alternative wins
Failure Scenario: a concrete production incident that this choice causes
Grilling Questions: interview-level follow-ups that force you to think harder
Decision Heuristic: the one-sentence rule for when to pick this option

Files in This Folder

| File | Covers |
|---|---|
| `01-conversation-memory.md` | DynamoDB vs Redis vs PostgreSQL vs MongoDB vs Aurora |
| `02-vector-store.md` | OpenSearch vs Pinecone vs pgvector vs Weaviate vs in-memory FAISS |
| `03-cache-layer.md` | Redis vs Memcached vs DynamoDB DAX vs in-process cache vs no cache |
| `04-analytics-warehouse.md` | Redshift vs Athena vs ClickHouse vs DynamoDB Streams vs real-time OpenSearch |