
Database Tradeoffs — MangaAssist Chatbot

This folder contains scenario-based deep dives into the database choices made in the MangaAssist architecture, and what concretely happens if you swap each one for a different technology.


Database Map

| Role | Current Choice | Where Used |
|------|----------------|------------|
| Conversation Memory | DynamoDB | Stores each session's turns, summaries, and metadata |
| Vector Store (RAG) | OpenSearch Serverless | Embedding search for FAQ, policy, product knowledge |
| Response Cache | ElastiCache Redis | Product details, recommendations, promotions, reviews |
| Analytics Warehouse | Redshift (via Kinesis) | Event aggregation, latency trends, intent distributions |

Why These Four Tradeoff Clusters Matter

Cluster 1 — Conversation Memory

Every message the user sends requires reading the last N turns. This is a pure key-value access pattern with a time-bounded write load. The wrong choice here creates either hot-partition failures at scale or overprovisioned infrastructure that costs 10× more than necessary.
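
The "last N turns" read can be sketched as a single DynamoDB Query in descending sort-key order. This is a minimal sketch, not the project's actual schema: the table name, the `session_id` partition key, and the presence of a time-ordered sort key are all assumptions made for illustration. The function only builds the request parameters, so it runs without AWS credentials.

```python
from typing import Any, Dict

# Hypothetical names: the source does not specify the table or key schema.
TABLE_NAME = "mangaassist-conversations"
PARTITION_KEY = "session_id"


def build_last_turns_query(session_id: str, n: int) -> Dict[str, Any]:
    """Build keyword arguments for a DynamoDB Query returning the newest n turns."""
    if n <= 0:
        raise ValueError("n must be positive")
    return {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "#pk = :sid",
        "ExpressionAttributeNames": {"#pk": PARTITION_KEY},
        "ExpressionAttributeValues": {":sid": {"S": session_id}},
        # Descending sort-key order, so the newest turns come back first
        # (assumes the sort key is time-ordered).
        "ScanIndexForward": False,
        "Limit": n,
    }


query = build_last_turns_query("sess-123", 5)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").query(**query)
```

Because every read is scoped to one partition key and bounded by `Limit`, the cost of a read does not grow with the total number of sessions, which is the property the key-value framing above relies on.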

Cluster 2 — Vector Store (RAG)

This is the core of what makes the chatbot answer questions accurately instead of hallucinating. The choice affects recall quality, index freshness latency, cost per query, and operational complexity. Getting this wrong either makes retrieval slow (user waits too long) or imprecise (LLM gets bad chunks and hallucinates).
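
To make "recall quality" concrete, the sketch below runs an exact (brute-force) cosine top-k search over a toy set of chunk embeddings and scores the result against a hand-labelled relevant set. The chunk IDs, the three-dimensional vectors, and the relevance labels are invented for illustration; a real vector store would use an approximate index, and the gap between its results and an exact search like this one is what recall measures.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], chunks: Dict[str, List[float]], k: int) -> List[Tuple[str, float]]:
    """Exact search: score every chunk, keep the k most similar."""
    scored = [(chunk_id, cosine(query, vec)) for chunk_id, vec in chunks.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def recall_at_k(retrieved_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """Fraction of the relevant chunks that appear in the retrieved list."""
    if not relevant_ids:
        return 0.0
    return len(set(retrieved_ids) & relevant_ids) / len(relevant_ids)


# Toy embeddings (hypothetical data, not from the MangaAssist index).
chunks = {
    "faq-returns": [1.0, 0.0, 0.0],
    "policy-shipping": [0.0, 1.0, 0.0],
    "product-volume-1": [0.7, 0.7, 0.0],
}
hits = top_k([1.0, 0.1, 0.0], chunks, k=2)
hit_ids = [chunk_id for chunk_id, _ in hits]
recall = recall_at_k(hit_ids, {"faq-returns", "policy-shipping"})
```

Here the query retrieves `faq-returns` and `product-volume-1`, so one of the two labelled-relevant chunks is missed and recall at 2 is 0.5. That missed chunk is the "bad chunks" failure described above, just at toy scale.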

Cluster 3 — Response Cache

The cache absorbs the difference between "sub-100ms product detail fetch" and "450ms catalog service fetch." The wrong choice either wastes Redis cluster cost on rarely-hit keys or causes cache thrashing during a flash sale, which pushes enough traffic back onto the product catalog to bring it down.
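
The read path this cluster is about can be sketched as a cache-aside lookup with a TTL. This is an illustrative in-memory stand-in, not the ElastiCache integration itself: the `TTLCache` class, the `fetch_from_catalog` loader, the key format, and the 60-second TTL are all assumptions. The clock is injectable so expiry can be exercised without sleeping.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLCache:
    """Cache-aside helper: return a fresh cached value, else load and store it."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, value)

    def get_or_load(self, key: str, loader: Callable[[str], str]) -> str:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: the slow backend is not called
        value = loader(key)  # miss or expired: go to the backend
        self._store[key] = (now + self._ttl, value)
        return value


calls: List[str] = []


def fetch_from_catalog(product_id: str) -> str:
    """Hypothetical stand-in for the slower catalog service call."""
    calls.append(product_id)
    return f"details for {product_id}"


fake_now = [0.0]
cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])
first = cache.get_or_load("manga-42", fetch_from_catalog)   # miss, loads
second = cache.get_or_load("manga-42", fetch_from_catalog)  # hit
fake_now[0] = 61.0
third = cache.get_or_load("manga-42", fetch_from_catalog)   # expired, reloads
```

Three reads reach the catalog only twice. Note that this sketch does nothing to stop many concurrent misses on the same key from all calling the loader at once, which is the flash-sale risk described above.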

Cluster 4 — Analytics Warehouse

This is the slowest-moving choice: it affects how fast the team can detect a regression (LLM giving wrong manga recommendations), how much it costs to run retrospective queries, and whether the data science team can iterate on the golden dataset for offline testing.
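
As a rough illustration of the queries this warehouse has to serve (the intent distributions and latency trends listed in the Database Map), the sketch below computes both over a handful of in-memory events. The event fields `intent` and `latency_ms` and the sample values are assumptions; in the real system this would be a SQL aggregation over the warehouse, not Python.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def intent_distribution(events: Sequence[dict]) -> Dict[str, float]:
    """Share of events per intent, as fractions that sum to 1."""
    counts = Counter(event["intent"] for event in events)
    total = sum(counts.values())
    return {intent: count / total for intent, count in counts.items()}


def percentile_nearest_rank(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical sample events, invented for this example.
events: List[dict] = [
    {"intent": "recommendation", "latency_ms": 820},
    {"intent": "recommendation", "latency_ms": 640},
    {"intent": "faq", "latency_ms": 210},
    {"intent": "order_status", "latency_ms": 300},
]

distribution = intent_distribution(events)
p95_latency = percentile_nearest_rank([e["latency_ms"] for e in events], 95)
```

Running the same two aggregations per day and comparing against a baseline is one simple way a shift in intent mix or a latency regression would show up. How quickly that comparison can be run is what the warehouse choice determines.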


How to Read Each Scenario File

Each file follows this structure per alternative:

  1. What Changes — the concrete swap you're making
  2. Best Case — when this alternative wins
  3. Failure Scenario — a concrete production incident that this choice causes
  4. Grilling Questions — interview-level follow-up to force you to think harder
  5. Decision Heuristic — the one-sentence rule for when to pick this option

Files in This Folder

| File | Covers |
|------|--------|
| 01-conversation-memory.md | DynamoDB vs Redis vs PostgreSQL vs MongoDB vs Aurora |
| 02-vector-store.md | OpenSearch vs Pinecone vs pgvector vs Weaviate vs in-memory FAISS |
| 03-cache-layer.md | Redis vs Memcached vs DynamoDB DAX vs in-process cache vs no cache |
| 04-analytics-warehouse.md | Redshift vs Athena vs ClickHouse vs DynamoDB Streams vs real-time OpenSearch |