DynamoDB Deep Dive - Basics, Harder Questions, Pitfalls, and Scale Notes
This document covers DynamoDB from multiple viewpoints: a developer who needs working code, an architect who needs capacity reasoning, an SRE who needs operational guidance, and an interview candidate who needs crisp, structured answers.
DynamoDB Basics
| Concept | What It Means | Why It Matters |
|---|---|---|
| Table | A collection of items | Similar to a top-level container |
| Item | One record in the table | Similar to a row, but schema-flexible |
| Attribute | A field inside an item | Similar to a column value |
| Partition Key | The value that decides physical data placement | Good key choice determines scale |
| Sort Key | Secondary part of the primary key used for ordering within a partition | Critical for timelines and range reads |
| Primary Key | Partition key alone, or partition key plus sort key | Defines uniqueness and access pattern |
| GSI | Global Secondary Index | Lets you query the same data from another angle |
| LSI | Local Secondary Index | Alternate sort key on the same partition key; must be defined at table creation |
| TTL | Time to Live expiry attribute | Useful for short-lived data, but deletion is asynchronous |
| Streams | Change log of item mutations | Useful for async workflows and event-driven processing |
| Conditional Write | Write only if a condition is true | Prevents duplicate or conflicting updates |
| Transaction | Multi-item atomic read/write operation | Stronger guarantee, higher cost, more limits |
| Strongly Consistent Read | Read the latest committed value from the base table | Lower staleness, higher cost and lower throughput |
| Eventually Consistent Read | Read a value that may lag briefly | Cheaper and common for high-scale reads |
| DAX | DynamoDB Accelerator | Read cache for microsecond access on hot keys |
How DynamoDB Is Used in MangaAssist
The project uses a single table for conversation memory.
Core Data Model
```
PK = SESSION#<session_id>
SK = META | TURN#<timestamp> | SUMMARY#<window_id>
```
This gives one partition per session and an ordered timeline inside that session.
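For concreteness, here is an illustrative sketch (hypothetical values) of what one session partition holds. Within a partition, DynamoDB sorts items lexicographically by SK, so `META`, `SUMMARY#`, and `TURN#` items group predictably, and zero-padded timestamps keep turns chronological:

```python
# Illustrative items for one session partition (values are hypothetical)
session_partition = [
    {"PK": "SESSION#abc123", "SK": "META", "turn_count": 4},
    {"PK": "SESSION#abc123", "SK": "SUMMARY#0001", "summary_text": "User asked about shipping."},
    {"PK": "SESSION#abc123", "SK": "TURN#00000001718000000000", "role": "user"},
    {"PK": "SESSION#abc123", "SK": "TURN#00000001718000005000", "role": "assistant"},
]

# DynamoDB orders items within a partition lexicographically by SK:
# "META" < "SUMMARY#..." < "TURN#...", and the zero-padded timestamp
# keeps TURN items in chronological order.
sks = sorted(item["SK"] for item in session_partition)
```

This ordering is what makes a single reverse query return the newest turns first, then the summary, then the metadata.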
Main Access Patterns in This Project
| Access Pattern | Operation | Key Pattern | Why It Exists |
|---|---|---|---|
| Create session | PutItem | `PK=SESSION#id, SK=META` | Bootstrap chat state |
| Load recent history | Query | `PK=SESSION#id`, reverse sort | Build prompt context quickly |
| Append new turn | PutItem | `SK=TURN#timestamp` | Persist user and assistant messages |
| Update session metadata | UpdateItem | `SK=META` | Track turn count, page context, intent |
| Store summary | PutItem | `SK=SUMMARY#window_id` | Compress older turns |
| Resume by customer | Query on GSI | `GSI1PK=customer_id` | Authenticated reconnect |
| Build human handoff | Query | Session partition | Load summary plus recent turns |
Why the Model Is Per-Turn, Not One Big Transcript Item
This project intentionally avoids storing the full conversation in one item because:
- DynamoDB has a 400 KB item limit
- Each new message would rewrite a larger object
- Concurrent updates are harder
- Retry logic is less clean
- Fetching the latest few turns becomes less efficient
Code Examples (Boto3 / Python)
Developer's perspective: Seeing real code makes abstract DynamoDB concepts concrete. These are patterns you would actually write and defend in a code review.
Create a New Session
```python
import boto3
import time

dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("manga-assist-sessions")

def create_session(session_id: str, customer_id: str | None, page_context: dict) -> None:
    now_ms = int(time.time() * 1000)
    ttl = int(time.time()) + 86400  # 24-hour expiry
    item = {
        "PK": f"SESSION#{session_id}",
        "SK": "META",
        "session_id": session_id,
        "turn_count": 0,
        "page_context": page_context,
        "created_at": now_ms,
        "updated_at": now_ms,
        "ttl": ttl,
    }
    if customer_id is not None:
        # GSI keys only for authenticated users. Writing a NULL-typed value
        # to a GSI key attribute would be rejected, so omit the attributes
        # entirely for anonymous sessions (this also keeps the GSI sparse).
        item["customer_id"] = customer_id
        item["GSI1PK"] = customer_id
        item["GSI1SK"] = now_ms
    table.put_item(
        Item=item,
        # Prevent overwriting a session that already exists (e.g., Lambda retry)
        ConditionExpression="attribute_not_exists(PK)",
    )
```
Note the `ConditionExpression`. Without it, a Lambda retry after a transient network failure would silently reset `turn_count` to 0.
Append a Turn (Idempotent Write)
```python
import gzip
import time

def append_turn(
    session_id: str,
    turn_index: int,
    role: str,
    content: str,
    intent: str,
    response_id: str,
    token_count: int,
    ts: int,
) -> None:
    # ts (epoch milliseconds) must be generated ONCE by the caller and
    # reused on retries. A retry must target the same PK + SK, otherwise
    # a regenerated timestamp creates a new item and the conditional
    # check below never fires.
    compressed = gzip.compress(content.encode("utf-8"))
    table.put_item(
        Item={
            "PK": f"SESSION#{session_id}",
            "SK": f"TURN#{ts:020d}",
            "session_id": session_id,
            "turn_index": turn_index,
            "role": role,
            "content_compressed": compressed,
            "intent": intent,
            "response_id": response_id,
            "token_count": token_count,
            "ttl": int(time.time()) + 86400,
        },
        # Guard: if this response_id was already written, do not duplicate
        ConditionExpression="attribute_not_exists(response_id)",
    )
```
`response_id` is the idempotency key. If the caller retries with the same `response_id` against the same item key, the conditional check fails with a `ConditionalCheckFailedException`, which the caller suppresses, and the duplicate is prevented. Note this only works if the retry targets the same PK and SK, so the turn timestamp must not be regenerated on retry.
Load Context (Latest Turns + Summary)
```python
from boto3.dynamodb.conditions import Key
import gzip

def load_context(session_id: str, max_turns: int = 10) -> dict:
    pk = f"SESSION#{session_id}"
    # One reverse query covers the common case. Within the partition, SKs
    # sort as META < SUMMARY# < TURN#, so ScanIndexForward=False returns
    # TURN items newest-first, then SUMMARY items, then META.
    response = table.query(
        KeyConditionExpression=Key("PK").eq(pk),
        ScanIndexForward=False,  # newest first
        Limit=max_turns + 5,  # buffer for META and SUMMARY items
    )
    meta = None
    turns = []
    latest_summary = None
    for item in response["Items"]:
        sk = item["SK"]
        if sk == "META":
            meta = item
        elif sk.startswith("TURN#"):
            if len(turns) < max_turns:
                text = gzip.decompress(item["content_compressed"].value).decode("utf-8")
                turns.append({"role": item["role"], "content": text, "index": item["turn_index"]})
        elif sk.startswith("SUMMARY#") and latest_summary is None:
            latest_summary = item.get("summary_text")
    # Long sessions have more TURN items than the Limit, so META and the
    # latest SUMMARY can fall outside the first page. Fetch them explicitly.
    if meta is None:
        meta = table.get_item(Key={"PK": pk, "SK": "META"}).get("Item")
    if latest_summary is None:
        summaries = table.query(
            KeyConditionExpression=Key("PK").eq(pk) & Key("SK").begins_with("SUMMARY#"),
            ScanIndexForward=False,
            Limit=1,
        )["Items"]
        if summaries:
            latest_summary = summaries[0].get("summary_text")
    # Turns came back newest-first; reverse for chronological prompt assembly
    turns.reverse()
    return {"meta": meta, "turns": turns, "summary": latest_summary}
```
Common mistakes: forgetting `ScanIndexForward=False` (the query then returns the oldest turns instead of the newest) and omitting `Limit` (the query then reads the entire session). Always set a `Limit` and handle `LastEvaluatedKey` for pagination defensively.
Increment Turn Count on META
```python
def increment_turn_count(session_id: str) -> None:
    table.update_item(
        Key={"PK": f"SESSION#{session_id}", "SK": "META"},
        UpdateExpression="SET turn_count = turn_count + :one, updated_at = :now",
        ExpressionAttributeValues={":one": 1, ":now": int(time.time() * 1000)},
    )
```
Retrieve Recent Sessions by Customer (via GSI)
```python
def get_recent_sessions(customer_id: str, limit: int = 5) -> list:
    response = table.query(
        IndexName="GSI1-customer-sessions",
        KeyConditionExpression=Key("GSI1PK").eq(customer_id),
        ScanIndexForward=False,  # most recent first
        Limit=limit,
        ProjectionExpression="session_id, updated_at, turn_count",
    )
    return response["Items"]
```
Handle Pagination Correctly
```python
def load_all_turns(session_id: str) -> list:
    """Used for human handoff or full context export. Not for normal chat."""
    pk = f"SESSION#{session_id}"
    turns = []
    last_key = None
    while True:
        kwargs = {
            "KeyConditionExpression": Key("PK").eq(pk) & Key("SK").begins_with("TURN#"),
            "ScanIndexForward": True,
        }
        if last_key:
            kwargs["ExclusiveStartKey"] = last_key
        response = table.query(**kwargs)
        turns.extend(response["Items"])
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            break
    return turns
```
Always handle `LastEvaluatedKey`. Failing to do so means you silently return partial data once a session holds more than 1 MB of turns, which a long enough conversation will eventually produce.
Different Ways to Use DynamoDB
DynamoDB is not only a key-value store. It supports multiple modeling styles depending on the access pattern.
1. Simple Key-Value Store
Use when:
- You look up one record by one key
Examples:
- Session token lookup
- Feature flag by key
- Cached prompt version by ID
2. Time-Ordered Timeline Store
Use when:
- You append events and read them in order
Examples:
- Chat turns
- Audit logs by entity
- User activity stream
This is the main pattern used in MangaAssist conversation memory.
3. Single-Table Multi-Entity Design
Use when:
- Related entities need to be queried together efficiently
Examples:
- `META`, `TURN`, `SUMMARY`, and handoff state inside one table
- Order plus shipment events in one access-oriented model
4. Materialized View with GSIs
Use when:
- The same data must be queried by multiple access patterns
Examples:
- Find sessions by `customer_id`
- Find active jobs by status
- Find latest events by tenant
5. Event-Driven System with Streams
Use when:
- A write should trigger asynchronous processing
Examples:
- Trigger summarization after turn count crosses a threshold
- Push analytics events
- Start moderation or audit workflows
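The first example above can be sketched as a Lambda handler consuming DynamoDB Streams records. The threshold and field names are assumptions for illustration; the record shapes follow the DynamoDB Streams event format (typed attribute values):

```python
SUMMARY_THRESHOLD = 20  # hypothetical: summarize every 20 turns

def handler(event, context=None):
    """Sketch of a Streams-triggered summarization trigger. Watches for
    MODIFY events on META items and collects sessions whose turn_count
    just crossed a summarization boundary."""
    sessions_to_summarize = []
    for record in event.get("Records", []):
        if record.get("eventName") != "MODIFY":
            continue  # only metadata updates matter here
        new_image = record["dynamodb"].get("NewImage", {})
        if new_image.get("SK", {}).get("S") != "META":
            continue  # ignore TURN and SUMMARY writes
        # Stream images use the low-level typed format: {"N": "20"}
        turn_count = int(new_image.get("turn_count", {}).get("N", "0"))
        if turn_count > 0 and turn_count % SUMMARY_THRESHOLD == 0:
            sessions_to_summarize.append(new_image["PK"]["S"])
    # In a real deployment this would enqueue summarization jobs;
    # returning the list keeps the sketch self-contained.
    return sessions_to_summarize
```

The important design point is that the handler filters on `SK == "META"` so the high-volume TURN writes do not trigger work.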
6. Global Multi-Region State
Use when:
- The same table must exist across regions
Examples:
- Active-active session state
- Disaster recovery with faster failover
Be careful: write conflicts need explicit thinking.
7. Cache-Accelerated Read Path
Use when:
- Key patterns are hot and repeated
Examples:
- DAX for hot conversation reads
- ElastiCache in front of downstream services
In this project, DAX is a possible optimization, not the source of truth.
Medium and Hard DynamoDB Questions
Interview candidate's perspective: The goal is not just to answer the question but to show depth by explaining the tradeoff, not just the conclusion. Good answers follow a structure: what the problem is → what options exist → what you chose and why.
Q1. Why did we choose separate TURN items instead of one session document?
Short answer: Chat history grows every turn. Storing it as one document creates a write amplification problem, violates the 400 KB item limit over time, and makes retry safety much harder.
Deeper answer for interviews: This is fundamentally the difference between an event-sourcing pattern (append small items, reconstruct state from events) and a document pattern (rewrite the full document every change). DynamoDB's item size limit and pricing model both push you toward append-heavy small writes. The per-turn model also gives you precise granularity for retrieval — you can load just the last 10 turns without reading older history at all.
Q2. What is the risk of using session_id as the partition key?
Short: Hot partition risk if one session produces disproportionate traffic.
Deeper: In practice, a single chat session generates modest, bursty traffic over a short window, not sustained high throughput. The real hot-key risk in this system is not one session, but rather a shared key used across many requests (like a shared test account or a bot hitting the same session). Mitigate by monitoring ThrottledRequests per partition and enforcing session isolation. DynamoDB's adaptive capacity automatically reroutes hot partitions within a table, so moderate hotness is handled without manual intervention.
Q3. When would you use strong consistency in DynamoDB?
Short: Only when stale reads can cause user-visible bugs.
Deeper: In this project, most context reads are eventually consistent. The risk of stale data is low because each session has one active writer (the orchestrator instance handling that turn). Strong consistency makes sense for: (1) read-after-write flows where a Lambda writes metadata and then immediately reads it back within the same request, (2) any uniqueness check before a write where a racing write is catastrophic. Using strong reads on GSIs is not possible — GSIs are always eventually consistent, so designs that require strong consistency must query the base table.
Q4. Why are GSIs powerful but expensive?
Short: Every write to an indexed attribute triggers an additional write to the index. Two GSIs projecting the same attribute roughly double the WCU consumed by writes to it.
Deeper: GSI cost has three components: writes (every change to an indexed attribute writes to the GSI), reads (querying the GSI consumes RCU from the index's own capacity), and storage (the projected attributes are duplicated). If you project all attributes (ALL), storage doubles. A GSI should exist only when there is a real, frequently executed access pattern that requires it. Adding one speculatively "in case we need it" is a common and expensive mistake.
Q5. Why is TTL not enough for strict compliance deletion?
Short: TTL deletion is asynchronous. An item can remain readable for up to 48 hours after the TTL timestamp.
Deeper: For GDPR right-to-delete or CCPA deletion requests, the SLA is usually 30 days from the request — but the problem is proving deletion happened. TTL does not give you a deletion timestamp you can log. The correct approach is: (1) perform an explicit DeleteItem for all items belonging to the customer, (2) write a deletion receipt to an audit log, (3) let TTL clean up any remnants. TTL is lifecycle hygiene, not the SLA mechanism.
Q6. How would you prevent duplicate turn writes during retries?
Short: Use an idempotency key (response_id) and guard with attribute_not_exists(response_id).
Deeper: The risk is that a Lambda retries a write after a transient timeout. The first write may have succeeded, and the second write would create a duplicate turn. DynamoDB's conditional writes solve this cleanly: write the item with ConditionExpression="attribute_not_exists(response_id)". If the first write succeeded, the second write fails with a ConditionalCheckFailedException, which you suppress. If the first write truly failed, the second write succeeds. This pattern is the DynamoDB equivalent of database upsert idempotency.
Q7. When do you use a transaction instead of conditional writes?
Short: Use transactions when multiple items must succeed or fail atomically. Use conditional writes when one item needs optimistic concurrency.
Deeper: In this project, most writes are independent per-turn items. A conditional write on the TURN item is sufficient. Transactions become necessary if you need to: (1) simultaneously create the META item and write the first TURN item as an atomic unit, (2) atomically update two sessions during a merge or delegate scenario. Transactions cost 2x per item (they use 2 read or write units per item), have a 100-item limit per transaction, and add latency. Use them precisely, not by default.
Q8. What happens if you add too many GSIs later?
Short: Write cost scales linearly with the number of GSIs that index a written attribute. Storage cost grows. Table complexity increases.
Deeper: GSI writes are not free even if no one is reading from the GSI yet. A table with 5 GSIs where an attribute is in 3 of them will incur roughly 3x–4x WCU per write to that attribute versus no GSI. Backfilling a new GSI on a large table can also run for hours and consume significant capacity. The lesson: design your access patterns first, then add only the indexes those patterns require. Avoid adding convenience indexes "in case we want that someday."
Q9. How do Global Tables resolve conflicts?
Short: Last-writer-wins at the item level, using DynamoDB's internal wall-clock timestamps.
Deeper: If two regions write the same item concurrently, DynamoDB does not merge them — the later write (by timestamp) overwrites the earlier one. This is safe as long as your application defines clear write ownership per region (e.g., session is pinned to the region where it was created). If you allow writes from two regions to the same session simultaneously, you can get silently lost turns. Safer patterns: (1) regional session pinning (session_id encodes the originating region), (2) conditional writes using a version attribute to detect conflicts early. Multi-region chat memory is an advanced and tricky problem.
Q10. Why is Scan usually a smell in DynamoDB?
Short: It reads every partition regardless of the query predicate, making it unpredictably expensive at scale.
Deeper: At 100 million items, a Scan reads all 100 million items even if only 10 match. In provisioned mode, this consumes your entire read capacity and throttles real users. In on-demand mode, the cost is proportional to all items scanned. Scan also has no performance isolation — it runs on the same infrastructure as your critical reads. The only safe uses of Scan are: (1) one-time operational scripts on small test tables, (2) full table export via ExportTableToPointInTime (which runs on snapshots, not live traffic). If your application code contains Scan, treat it as a bug.
Q11. Why can filter expressions be misleading?
Short: Filtering happens after reading. You pay for all items read, not just items returned.
Deeper: A Query that reads 100 items, then applies a filter that keeps 5, still charges for 100 RCU. Developers often add filters thinking they are optimizing cost and are surprised when cost does not drop. The solution is to model the access pattern into the key structure (partition key, sort key, or GSI) so the query naturally returns only what you need. Reserve filter expressions for small post-processing (e.g., removing items where a soft-delete flag is set), not as a substitute for proper key modeling.
Q12. What is the biggest mindset shift from relational databases to DynamoDB?
Short: You model for access patterns first, not for data normalization first.
Deeper: In SQL, you start by modeling entities and relationships, then trust the query planner to traverse them. In DynamoDB, the query planner cannot join across partitions — there is no join. You must shape your data so that every query you will ever run is either a direct key lookup or a range scan within a single partition. This means you sometimes duplicate data (denormalization) or embed differently oriented copies of the same data. The initial design burden is higher, but read performance is far more predictable and scalable as a result.
Schema Evolution Strategy
Architect's perspective: DynamoDB is schema-flexible, but changing your access patterns later is harder than in SQL. Know the playbook before you need it.
Adding a New Attribute to Existing Items
DynamoDB is schema-flexible. New attributes can be written to new items immediately. For existing items, you have two options:
- Lazy migration: Write the new attribute on the next update to each item. Accept that old items lack it. Read code handles the missing field gracefully.
- Backfill: Scan the table (off-peak hours, dedicated capacity), add the attribute, and write it back. Use a feature flag to switch behavior after the backfill completes.
For this project, lazy migration is preferred for non-critical attributes (e.g., adding intent to older turns). Backfills are reserved for attributes that new business logic depends on.
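The "read code handles the missing field gracefully" part of lazy migration is small but easy to forget. A minimal sketch (the helper name and default are illustrative):

```python
def turn_intent(item: dict, default: str = "unknown") -> str:
    """Lazy-migration read path: TURN items written before the `intent`
    attribute existed simply lack it, so the reader supplies a default
    instead of raising KeyError."""
    return item.get("intent", default)
```

Every consumer of the attribute goes through a helper like this until the backfill (if any) completes, after which the default path becomes dead code.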
Adding a New GSI
New GSIs can be added to an existing table at any time. DynamoDB will backfill the index automatically. Key considerations:
- The index is eventually consistent with the live table during backfill
- Do not read from the new GSI in production until backfill completes (check `IndexStatus` in the AWS Console or CLI)
- Backfill duration depends on table size; for large tables, alert on `IndexStatus` transitions, not on the command completion
Changing a Partition Key (The Hard Case)
You cannot change the partition key of an existing table. Options:
- Dual-write: Create the new table, write all new data to both tables, backfill historical data, then cut over reads.
- Shadow table: Run both tables in parallel for a period, then decommission the old one.
- Export/import: Use `ExportTableToPointInTime` to get a snapshot, transform it, and import it into the new table.
For this project, the SESSION#<id> key structure is deliberately designed to be stable. If the system ever needed to re-key (e.g., by customer_id instead of session_id), it would require a dual-write period and a coordinated cutover — a significant operational effort.
Testing DynamoDB-Dependent Code
Developer's perspective: Tests that hit live AWS tables are slow, expensive, and flaky. Know the alternatives.
Option 1: DynamoDB Local (for unit and integration tests)
Amazon provides DynamoDB Local — a JAR file that runs an in-memory DynamoDB instance locally.
```shell
# Start DynamoDB Local via Docker
docker run -p 8000:8000 amazon/dynamodb-local
```
In your test setup, point the boto3 client at http://localhost:8000 instead of the real endpoint:
```python
dynamodb = boto3.resource(
    "dynamodb",
    region_name="us-east-1",
    endpoint_url="http://localhost:8000",
    aws_access_key_id="test",
    aws_secret_access_key="test",
)
```
Pros: Fast, free, isolated, reproducible. Cons: Does not emulate all behaviors exactly (adaptive capacity, GSI eventual consistency timing, Streams).
Option 2: moto (Python mock)
moto is a Python library that intercepts boto3 calls and mocks DynamoDB in-process:
```python
import boto3
from moto import mock_aws

@mock_aws
def test_create_session():
    # moto intercepts all boto3 calls within this context
    dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
    table = dynamodb.create_table(
        TableName="manga-assist-sessions",
        KeySchema=[
            {"AttributeName": "PK", "KeyType": "HASH"},
            {"AttributeName": "SK", "KeyType": "RANGE"},
        ],
        AttributeDefinitions=[
            {"AttributeName": "PK", "AttributeType": "S"},
            {"AttributeName": "SK", "AttributeType": "S"},
        ],
        BillingMode="PAY_PER_REQUEST",
    )
    table.wait_until_exists()
    create_session("sess_123", "cust_456", {"page": "home"})
    item = table.get_item(Key={"PK": "SESSION#sess_123", "SK": "META"})["Item"]
    assert item["customer_id"] == "cust_456"
```
Pros: No external process needed, very fast, works in CI without AWS credentials. Cons: moto coverage is generally good but not 100% of DynamoDB's API surface.
Option 3: Dedicated staging table in AWS
For integration tests that must run against real DynamoDB (to catch behaviors moto cannot simulate):
- Create a separate table with the prefix `manga-assist-sessions-stg`
- Use a separate IAM role with access only to the staging table
- Clean up test data after each test run using the `session_id` prefix and batch deletes
- Never share staging table state across parallel test runs; use unique `session_id` prefixes per test run
Capacity Planning Formulas
Architect's perspective: These are the formulas you use to size the table before go-live and to interpret CloudWatch metrics after go-live.
WCU Formula
$$\text{WCU} = \lceil \text{item size in KB} \rceil \times \text{writes per second}$$

For this project at peak:
- Turn writes: item size ≈ 1 KB → 1 WCU per write
- 33,000 turn writes/sec → 33,000 WCU
- On-demand billing: $1.25 per million write units (us-east-1)
- At 33,000 writes/sec for 3,600 seconds/hour: 33,000 × 3,600 ≈ 119M write units/hour → ~$148/hour peak
RCU Formula
$$\text{RCU}_{\text{eventual}} = \left\lceil \frac{\text{item size in KB}}{4} \right\rceil \times \text{reads per second} \times 0.5$$

For this project:
- Loading 10 turns per request, each ~1 KB → 10 RCU at strong consistency, 5 RCU at eventual
- 33,000 context loads/sec → 165,000 RCU (eventually consistent)
- On-demand billing: $0.25 per million read units (us-east-1)
- At 165,000 RCU/sec for 3,600 seconds: ≈594M read units/hour → ~$149/hour peak
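The arithmetic above can be checked with a small helper. Prices are the on-demand us-east-1 figures quoted above; this is a back-of-envelope tool, not a billing calculator, and `ops_per_sec` counts item-level operations (e.g., 33,000 loads × 10 items = 330,000 item-reads/sec):

```python
import math

WCU_PRICE_PER_MILLION = 1.25  # on-demand write units, us-east-1
RCU_PRICE_PER_MILLION = 0.25  # on-demand read units, us-east-1

def hourly_on_demand_cost(ops_per_sec: float, item_kb: float, *, write: bool,
                          eventually_consistent: bool = True) -> float:
    """Hourly on-demand cost in USD for a steady stream of item operations."""
    if write:
        units = math.ceil(item_kb)  # 1 WCU per 1 KB written
        price = WCU_PRICE_PER_MILLION
    else:
        units = math.ceil(item_kb / 4)  # 1 RCU per 4 KB, strongly consistent
        if eventually_consistent:
            units *= 0.5  # eventual consistency is half price
        price = RCU_PRICE_PER_MILLION
    return ops_per_sec * units * 3600 * price / 1_000_000

# 33,000 turn writes/sec at ~1 KB        -> ~$148.50/hour
# 330,000 eventual item-reads/sec at 1 KB -> ~$148.50/hour
```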
Provisioned Capacity Sizing With Buffer
If switching from on-demand to provisioned for cost savings at steady-state:
$$\text{Provisioned WCU} = \text{peak WPS} \times 1.2 \quad (20\%\ \text{buffer})$$
$$\text{Provisioned RCU} = \text{peak RPS} \times 1.2$$
Set auto-scaling target utilization at 70% so headroom exists before throttling.
Data Volume Formula
$$\text{Storage (GB)} = \frac{\text{concurrent sessions} \times \text{avg session size (KB)}}{1024 \times 1024}$$
At 50,000 concurrent sessions × 2 KB average: ≈ 0.1 GB active data. Trivial. Even at 10x growth, storage is not the DynamoDB cost driver — throughput is.
Tricky Things to Be Careful About in DynamoDB
| Tricky Area | Why It Is Tricky | What to Do |
|---|---|---|
| Hot partitions | Uneven traffic on one partition key can throttle one slice of the table | Choose high-cardinality keys; shard only if a real hot-key pattern appears |
| 400 KB item limit | Large documents fail hard as they grow | Split data into smaller items and summarize old content |
| TTL behavior | Expiry is not immediate deletion | Enforce expiry in application logic too |
| GSIs | Extra indexes multiply write cost | Add indexes only for real reads you need |
| Strong consistency | More expensive and not available on GSIs | Use it only where the user-visible correctness gain matters |
| Pagination | Queries return up to 1 MB per page | Always handle LastEvaluatedKey correctly |
| Filter expressions | They do not reduce read cost | Prefer better keys over post-filtering |
| Transactions | Stronger guarantees, but more cost and limits | Use for real atomicity needs only |
| Global Tables | Conflict resolution is subtle | Prefer clear regional write ownership |
| DAX | Very fast, but cached reads can hide update timing assumptions | Understand cache invalidation and consistency tradeoffs |
| Backfills and migrations | Re-keying data is harder than in relational systems | Plan migrations early and keep models access-driven |
| Sparse indexes | Items missing the indexed attribute do not appear in the GSI | Be explicit about which entities are meant to appear in each index |
Challenges of DynamoDB at Scale in This Project
1. Read latency spikes on conversation memory
Problem:
- Memory loads sit on the chat critical path
- Tail latency hurts the first-token budget
Why it matters here:
- The chatbot target is a useful response in under about 3 seconds
- Even a 100 to 200 ms memory spike is visible when combined with LLM and retrieval latency
Mitigation:
- Keep reads small and access-pattern driven
- Query only the latest turns plus summary
- Consider DAX or a small hot read cache for repeated context loads
2. Long conversations growing without bound
Problem:
- More turns means more data and more prompt tokens
Why it matters here:
- The assistant must remember enough context but not blow the token budget
Mitigation:
- Summarize old windows
- Keep only recent turns in full form
- Store structured metadata instead of relying only on raw text
3. Retry safety during streaming and partial failures
Problem:
- A response can be generated, partially streamed, retried, and accidentally double-written
Why it matters here:
- Duplicate memory corrupts later context and confuses handoff
Mitigation:
- Use idempotency keys
- Guard writes with condition expressions
- Separate response delivery from persistence retry paths
4. Customer lookup via GSI becoming noisier at scale
Problem:
- Querying by `customer_id` is useful, but GSIs add cost and can become a hot path if overused
Why it matters here:
- Resume and support flows need it, but every turn does not
Mitigation:
- Keep the GSI narrow and purpose-specific
- Use it for reconnect or escalation, not normal per-message reads
5. Burst traffic during major events
Problem:
- Session creation and message volume can spike sharply
Why it matters here:
- This project expects large concurrency swings
Mitigation:
- Use on-demand or carefully managed auto-scaling
- Load test around burst patterns, not only average traffic
- Watch throttles, adaptive capacity behavior, and p99 latency
6. TTL lag versus privacy expectations
Problem:
- Users may assume expired means immediately gone
Why it matters here:
- This project is privacy-sensitive and session-scoped by design
Mitigation:
- Treat expired items as invalid in application logic immediately
- Use explicit deletes for stricter flows
7. Multi-region write conflicts
Problem:
- If two regions write the same session, history ordering can become messy
Why it matters here:
- Chat sessions are sequence-sensitive
Mitigation:
- Prefer single active writer per session
- If multi-region is needed, define regional ownership or session pinning
Challenges of DynamoDB at Scale in General
Access-pattern rigidity
DynamoDB rewards teams that know their reads and writes upfront. If the product keeps changing query patterns every month, schema evolution is harder than in SQL systems.
Denormalization pressure
You often duplicate data to satisfy access patterns. That improves performance but increases coordination and write complexity.
Secondary index cost explosion
Teams often add GSIs casually, then discover write cost and storage cost jumped much faster than expected.
Data migration difficulty
Changing partition keys or reshaping major entities usually means backfills, dual writes, or shadow tables.
Operational blind spots
A table can look healthy in average metrics while one hot key is degrading real users. DynamoDB requires careful partition-aware monitoring.
Consistency misunderstandings
Teams new to DynamoDB often assume reads are immediately current everywhere. That assumption breaks with GSIs, Global Tables, and cached layers.
TTL misunderstanding
TTL is great for lifecycle hygiene but not a precise deletion SLA.
Practical Guidance for This Project
For MangaAssist, the safest DynamoDB rules are:
- Model around session access patterns, not around generic "chat transcript" storage
- Keep turns as separate items
- Keep GSIs minimal
- Use summaries to cap growth
- Make writes idempotent
- Treat TTL as eventual cleanup, not immediate deletion
- Keep DynamoDB as the durable source of truth even if DAX or Redis is added later
Quick Reference: Common Interview Questions and Crisp Answers
| Question | Best Opening Line |
|---|---|
| Why DynamoDB over PostgreSQL for this? | "Chat memory has no joins and extremely high concurrency — DynamoDB fits that shape better than a relational system." |
| What is a hot partition? | "When one partition key gets disproportionate traffic and approaches the per-partition throughput ceiling." |
| Why per-turn items instead of one document? | "The document grows every message, rewrites amplify, and retries become unsafe. Per-turn items are small, append-only, and independently idempotent." |
| How do you prevent duplicate writes? | "Idempotency key in the item plus a ConditionExpression: attribute_not_exists prevents re-insertion." |
| What breaks if DynamoDB is slow? | "Context assembly for the prompt is delayed, which inflates first-token latency. We mitigate with graceful degradation to zero-turn context." |
| How does TTL work exactly? | "You set a unix epoch attribute. DynamoDB deletes the item asynchronously after that time, but not immediately — always enforce expiry in app logic too." |
| When would you add DAX? | "When profiling shows DynamoDB reads are the bottleneck AND the hot reads are deterministically repeatable, not unique per session." |
| Why is Scan dangerous? | "It reads every item in the table regardless of predicate; it costs proportional to table size, not result size." |
One-Line Summary
DynamoDB works very well for this project because chat memory is a short-lived, ordered, high-scale session-state problem, but it still requires careful thinking around hot keys, TTL semantics, indexing cost, retries, multi-region behavior — and it rewards teams that learn to think in access patterns before they think in data models.