API Design and Testing — MangaAssist
This folder documents the full API surface of MangaAssist: what I built, how I tested it, how I hardened it for Amazon-scale traffic, how the design maps to the HLD/LLD, and how each major decision was forced or shaped by competing stakeholder perspectives (Backend, SRE, Security, ML, Product, Frontend, Cost, Legal).
Everything is written as a first-person narrative, with interview Q&A and multi-round grilling chains for interview prep.
Reading Order
The files build on each other. Read in order if you're new; jump to a specific file if you have a focused question.
| # | File | Read this when you want… | Length |
|---|---|---|---|
| 1 | 00-hld-lld-architecture.md | The architectural foundation — HLD (5 layers, AWS topology, NFRs, capacity, failure domains), LLD (sequence diagrams, state machines, schemas, observability), and the Decision Lens (10 baseline decisions through 8 stakeholder perspectives) | Long, dense |
| 2 | 01-api-types-overview.md | The 6 API surfaces — WebSocket streaming, external REST, internal downstream, ML inference, vector search, guardrails. Each section has LLD detail + a Design Lens sub-section showing which stakeholder forced which choice. Closes with Cross-API design tensions | Medium |
| 3 | 02-api-testing-strategy.md | How every layer was tested — unit (~400), integration (~80), contract (~50), E2E (~30 scenarios), LLM quality eval (500-prompt golden set), security (~60). Includes multi-round grilling on judge reliability and contract-test enforcement | Medium |
| 4 | 03-scale-testing-scenarios.md | The 7 real scale scenarios I hit — Prime Day spike, Bedrock throttling, SageMaker cold start, DynamoDB thundering herd, WebSocket ceiling, cache stampede, multi-model P99 spike. Each: exact problem, diagnosis, fix, measured result | Medium |
| 5 | 04-interview-qa-deep-dive.md | 21 deep interview questions with Easy → Medium → Hard → Killer Follow-up escalation. Architecture, testing, scale, cross-cutting concerns, scenario-based "what would you do?" questions | Long |
| 6 | 04-offline-testing-quality-strategies.md | Deep-dive on offline testing: 5 pillars (golden dataset, intent classification, RAG validation, hallucination, adversarial), offline-online correlation analysis, 5 multi-round interview grilling chains | Long, dense |
| 7 | 05-grilling-sessions.md | 6 grilling rounds with escalating follow-ups designed to expose shallow answers — streaming vs. guardrails tension, golden-dataset fragility, DynamoDB hotspot debugging, cost justification, fraudulent listings, mobile reconnection. Plus a 10-question lightning round | Medium |
| 8 | 06-decision-records-perspectives.md | Operational ADRs — the decisions forced by incidents, not designed up-front. 8 ADRs (price hallucination, DAX, shadow mode, inline guardrails, multi-region, summarizer threshold, trust score, hallucination as canary) each viewed through the 8-lens framework with explicit Forces / Decision / Alternatives / Consequences / Revisit triggers | Long |
Three Lenses Across the Folder
If you're using this for a specific purpose, here's the cross-cutting view:
Lens A: Architecture (HLD/LLD focus)
- Start: 00-hld-lld-architecture.md (Parts 1–2)
- Then: 01-api-types-overview.md for API-level LLD detail
- Cross-reference: 03-scale-testing-scenarios.md for how the design held up under load
Lens B: Testing & Quality
- Start: 02-api-testing-strategy.md
- Deep: 04-offline-testing-quality-strategies.md
- Hardest follow-ups: 04-interview-qa-deep-dive.md Sections 2 & 3, plus 05-grilling-sessions.md Round 2
Lens C: Decisions & Trade-offs (the multi-stakeholder view)
- Foundational decisions: 00-hld-lld-architecture.md Part 3 (D-1 through D-10)
- Operational decisions: 06-decision-records-perspectives.md (ADR-001 through ADR-008)
- Tested in interview format: 04-interview-qa-deep-dive.md and 05-grilling-sessions.md
Why This Matters
MangaAssist is not a single REST API. It is an orchestration system (a code sketch follows this list) that:
- Accepts traffic over WebSocket (streaming) and REST (fallback)
- Fans out to 9+ downstream internal services in parallel
- Invokes 4 ML models in sequence per request
- Runs a 6-stage guardrails pipeline on every LLM response
- Operates across 2 AWS regions in active-active configuration
- Crosses ownership boundaries with 6+ teams (Catalog, Orders, Returns, Promotions, Shipping, Reviews, Trust & Safety)
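To make that list concrete, here is a minimal sketch of the request path. Every service name, model identifier, and function signature below is illustrative, not the real MangaAssist code (the downstream list shows only a subset of the 9+ services); it exists to show the control flow: parallel fan-out, a sequential model chain, and a guardrails pass before anything is returned.

```python
import asyncio

# Hypothetical names for illustration only; not the real service registry.
DOWNSTREAM_SERVICES = ["catalog", "orders", "returns", "promotions",
                       "shipping", "reviews", "trust-safety"]
ML_MODEL_CHAIN = ["intent-classifier", "retriever-ranker", "llm", "summarizer"]
GUARDRAIL_STAGES = ["pii", "price-claims", "toxicity", "grounding",
                    "policy", "format"]

async def call_service(name: str, query: str) -> dict:
    """Stand-in for an internal REST call to one downstream team."""
    await asyncio.sleep(0)  # placeholder for network I/O
    return {"service": name, "data": f"context for {query!r}"}

async def run_model(name: str, payload: dict) -> dict:
    """Stand-in for one hop in the sequential ML model chain."""
    await asyncio.sleep(0)  # placeholder for inference latency
    return {**payload, "last_model": name}

async def handle_request(query: str) -> dict:
    # 1. Fan out to downstream services in parallel. One failing call
    #    must not sink the whole request, hence return_exceptions=True.
    results = await asyncio.gather(
        *(call_service(s, query) for s in DOWNSTREAM_SERVICES),
        return_exceptions=True,
    )
    context = [r for r in results if not isinstance(r, Exception)]

    # 2. Invoke the ML models in sequence; each consumes the prior output.
    payload: dict = {"query": query, "context": context}
    for model in ML_MODEL_CHAIN:
        payload = await run_model(model, payload)

    # 3. Stand-in for the 6-stage guardrails pipeline: each real stage
    #    would inspect or redact the response; here we only record the pass.
    for stage in GUARDRAIL_STAGES:
        payload.setdefault("guardrails_passed", []).append(stage)
    return payload

if __name__ == "__main__":
    print(asyncio.run(handle_request("where is my One Piece vol. 3 order?")))
```

The real system streams tokens over WebSocket rather than returning one dict, but the shape is the point: fan-out happens concurrently, the model chain is strictly sequential, and guardrails sit between the LLM and the user.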
Testing a system like this requires multiple strategies working together — a unit test cannot validate that streaming responses arrive correctly; a load test cannot catch a broken JSON contract with the Orders team. Designing it requires aligning multiple stakeholders — the SRE's priority is not the same as the ML engineer's, and the Legal lens often forces architectural choices that no other lens would.
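As a concrete illustration of that second point, a minimal consumer-side contract check might look like the sketch below. The Orders response shape and field names are invented for the example; the real suite uses the contract tests described in 02-api-testing-strategy.md.

```python
# Hypothetical consumer-side contract for the Orders response.
# Field names and types are illustrative, not the real contract.
ORDERS_CONTRACT = {
    "order_id": str,
    "status": str,
    "items": list,
    "estimated_delivery": str,
}

def check_orders_contract(response: dict) -> list[str]:
    """Return a list of contract violations (empty means the shape holds)."""
    violations = []
    for field, expected_type in ORDERS_CONTRACT.items():
        if field not in response:
            violations.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            violations.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(response[field]).__name__}"
            )
    return violations

def test_orders_response_matches_contract():
    # In the real suite this sample would come from a recorded or
    # provider-verified fixture, not a hard-coded dict.
    sample = {
        "order_id": "111-2223334",
        "status": "shipped",
        "items": [{"asin": "B000MANGA1", "qty": 1}],
        "estimated_delivery": "2024-06-01",
    }
    assert check_orders_contract(sample) == []
```

A load test hammering the same endpoint would pass even if a field silently changed type; only a shape assertion like this catches the break before the Orders team's change reaches production.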
This folder captures both: the engineering depth (what was built and how) and the social depth (how the design cleared a room of people with competing priorities).
Conventions Used in This Folder
- First-person narrative: "I built…" — these are my decisions and my reasoning, written for both interviews and engineering reference.
- Easy → Medium → Hard → Killer Follow-up: Interview Q&A scales in depth. Don't stop at Easy.
- Design Lens: A table or sub-section that names which stakeholder lens forced which choice, with one-line positions per lens.
- Decision Records (ADR-NNN): Operational decisions with Context / Forces / Decision / Alternatives / Consequences / Revisit-triggers.
- HLD vs. LLD: HLD is what and where; LLD is how with concrete schemas, sequence diagrams, state machines, indexes.
- Cross-references: Heavy use of links between files — designed to be read non-linearly.