Scenario 1 — Multi-Stage Docker Builds For The MangaAssist Orchestrator
User Story
As a platform engineer on MangaAssist, I wanted the orchestrator and supporting services to run in lean, reproducible containers on ECS Fargate so that deployments stayed fast, secure, and operationally simple across the manga recommendation and chat pipeline.
What We Actually Did
- Used Docker multi-stage builds for all production services.
- Pushed images to ECR with vulnerability scanning enabled.
- Ran steady baseline traffic (WebSocket sessions, streaming chat, per-user recommendation calls) on ECS Fargate.
- Used Lambda for sudden overflow — traffic spikes when a popular manga title trended — instead of making the container platform absorb every burst alone.
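The multi-stage pattern described above can be sketched as follows. This is a minimal illustration, not the actual MangaAssist Dockerfile: the base images, module path (`src.orchestrator`), and non-root user are assumptions made for the example.

```dockerfile
# Build stage: compilers, dev headers, and pip caches stay here and
# never reach the runtime image.
FROM python:3.12-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Runtime stage: only installed packages and app code are copied over.
FROM python:3.12-slim AS runtime
WORKDIR /app
COPY --from=build /install /usr/local
COPY src/ ./src/
# Running as a non-root user further shrinks the attack surface.
RUN useradd --create-home appuser
USER appuser
ENTRYPOINT ["python", "-m", "src.orchestrator"]
```

Everything in the build stage (pip, wheels, any compilers pulled in by dependencies) is discarded; only the final stage is shipped to ECR, which is what keeps pulls fast during autoscaling events.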
Why This Was The Right Docker Story For MangaAssist
MangaAssist was not a batch API. It had:
- WebSocket connections for streaming manga chat responses
- Per-container L1 caching for recent user preference data
- Streaming token responses to the browser
- A predictable daily baseline of manga browsing traffic with sharp spikes during new chapter drops
That profile made long-lived containers a better steady-state home than Lambda-only, while still letting Lambda absorb sharp spikes without overprovisioning Fargate.
Deep-Dive Questions And Answers
Q1. Why did you use Docker on ECS Fargate instead of EC2 or EKS? Fargate gave us the benefits of containers without cluster management. EKS was explicitly evaluated and ruled out as overkill for our service count. EC2 would have added patching, AMI management, and capacity management without giving us a product advantage on a chatbot workload.
Q2. Why did multi-stage Docker builds matter here? They kept build tooling out of the runtime image. That reduced image size, reduced pull time, reduced attack surface, and made startup faster. In a service that autoscales during chapter-drop traffic spikes, smaller runtime images directly improve deployment and recovery time.
Q3. What belongs in the runtime stage and what does not? Runtime stage: app code, runtime dependencies, entrypoint, and minimal OS packages. It should not contain compilers, test tools, lint tools, source caches, or training-only ML libraries. That separation is the entire point of the multi-stage pattern.
Q4. Why not run the MangaAssist orchestrator only on Lambda if Lambda was already in the design? Because the operating model evolved to hybrid compute. Fargate handled the predictable baseline more efficiently, while Lambda absorbed bursts. That kept steady-state cost and runtime behavior predictable while still absorbing sudden 10x spikes in the minutes after a new chapter dropped.
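The baseline-vs-overflow split can be reduced to a simple decision rule. This is an illustrative sketch only: the names (`FARGATE_CAPACITY`, `route_request`) and threshold are hypothetical, and in practice the routing lives in load-balancer and autoscaling configuration rather than application code.

```python
# Hypothetical in-flight request count the warm Fargate fleet is sized
# to absorb; beyond it, bursts spill to Lambda instead of forcing the
# fleet to be provisioned for the peak.
FARGATE_CAPACITY = 1000

def route_request(in_flight: int) -> str:
    """Send baseline traffic to Fargate; spill bursts to Lambda."""
    return "fargate" if in_flight < FARGATE_CAPACITY else "lambda"

# Steady daily baseline stays on warm containers.
assert route_request(400) == "fargate"
# A 10x chapter-drop spike overflows to short-lived Lambda capacity.
assert route_request(4000) == "lambda"
```

The point of the rule is economic: Fargate is sized for the predictable baseline, and the burst premium is paid only for the minutes a spike actually lasts.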
Q5. What is the best interview answer for how Docker helped the application layer, not just infra? Docker gave us deployment consistency across the orchestrator, observability components, and test environments. It also enabled per-container memory caches for hot manga preferences and stable runtime behavior for streaming chat workloads — things that are less comfortable in a pure function-only architecture.
Optimizations We Can Credibly Claim
- Multi-stage images instead of single-stage builds
- Fargate for steady traffic, Lambda for overflow during chapter drops and trending titles
- ECR as the standard registry with vulnerability scanning
- Per-container L1 cache on the application path, backed by Redis and DAX for shared caching
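The per-container L1 cache in the list above is the piece that only makes sense on long-lived containers. A minimal sketch of the idea, assuming a TTL-based in-process cache; the `L1Cache` class and the loader callback standing in for the Redis/DAX tier are hypothetical, not the actual MangaAssist implementation:

```python
import time

class L1Cache:
    """Per-container in-memory cache with TTL eviction.

    On a miss, falls through to a caller-supplied loader, which here
    stands in for the shared Redis/DAX tier.
    """

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str, loader):
        entry = self._store.get(key)
        if entry is not None and time.monotonic() - entry[0] < self.ttl:
            return entry[1]  # fresh hit served from container memory
        value = loader(key)  # miss: fall through to the shared tier
        self._store[key] = (time.monotonic(), value)
        return value

cache = L1Cache(ttl_seconds=60)
calls = []
def shared_tier(key):  # stand-in for a Redis/DAX lookup
    calls.append(key)
    return {"user": key, "prefs": ["recent titles"]}

cache.get("u1", shared_tier)
cache.get("u1", shared_tier)  # second read hits L1, no shared-tier call
assert calls == ["u1"]        # shared tier was consulted exactly once
```

Because the cache lives in process memory, it evaporates with the container; that is acceptable for hot preference data and is exactly why this pattern does not translate to Lambda's short-lived execution environments.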
Better-Than-Naive Explanation
The naive answer is "we used Docker because everybody uses containers." The stronger answer: we used Docker because it fit the hybrid compute model. Containers handled the predictable manga browsing baseline well, and Lambda protected us from burst spikes on chapter drops. That gave us simpler operations than EKS and more stable runtime behavior than Lambda-only — especially important for WebSocket streaming sessions that can't tolerate cold starts mid-conversation.
Decision Table
| Dimension | Details |
|---|---|
| Why Fargate over EC2 | No cluster/patch/AMI management; operationally simpler for our service count |
| Why Fargate over EKS | Explicit evaluation — EKS is overkill for MangaAssist's service count |
| Why NOT Lambda-only | WebSocket sessions + per-container memory caches + streaming don't fit stateless function model |
| Multi-stage build tradeoff | Slightly more complex Dockerfile vs smaller runtime image, faster pulls, reduced attack surface |
| Hybrid compute tradeoff | Two compute planes to manage vs cost/latency predictability for baseline + burst protection |
| Scale mechanism | Fargate holds steady baseline; Lambda absorbs chapter-drop spikes |
| Cache hierarchy | Per-container L1 → Redis/DAX → DynamoDB |
| Key outcome | Steady-state cost predictability + ability to handle 10x traffic spikes without Fargate overprovisioning |
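The cache hierarchy row in the table above implies a read-through lookup order. A hedged sketch of that path, with each tier modeled as a plain dict; real code would use a Redis client and boto3 against actual stores, and the `read_through` function is an assumption for illustration:

```python
# Read path: per-container L1 -> shared Redis/DAX -> DynamoDB,
# backfilling the faster tiers on a hit so repeat reads get cheaper.
def read_through(key, l1: dict, shared: dict, dynamo: dict):
    """Check each tier in order, promoting the value on the way back."""
    if key in l1:
        return l1[key]
    if key in shared:
        l1[key] = shared[key]  # promote into container memory
        return shared[key]
    value = dynamo[key]        # source of truth
    shared[key] = value        # backfill the shared cache
    l1[key] = value
    return value

l1, shared, dynamo = {}, {}, {"m1": "hot title metadata"}
assert read_through("m1", l1, shared, dynamo) == "hot title metadata"
assert "m1" in l1 and "m1" in shared  # both faster tiers backfilled
```

Backfilling on the way up is the standard read-through choice: the first request after a container starts pays the full DynamoDB round trip, and subsequent requests on that container are served from memory.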
Tradeoffs Discussed
| Option considered | Why rejected or scoped |
|---|---|
| EC2 + Auto Scaling | Adds AMI patching, capacity reservation complexity; no product advantage |
| EKS | Strong Kubernetes tooling but too much operational overhead for our service count |
| Lambda-only | Can't hold WebSocket connections or per-container caches; cold start on streaming path is unacceptable |
| Fargate-only (no Lambda overflow) | Would require overprovisioning Fargate to absorb uncapped burst — expensive |
| Single-stage Docker builds | Larger images, slower pulls, larger attack surface, no benefit over multi-stage |
Scale Planned
| Traffic type | Compute home | Reasoning |
|---|---|---|
| Daily baseline manga browsing | ECS Fargate | Predictable, benefits from long-lived container warm state |
| Chapter-drop spikes (10x burst) | Lambda overflow | Cheap short-lived burst capacity, no Fargate overprovisioning |
| WebSocket streaming sessions | ECS Fargate | Persistent connection state lives in the container; Lambda's short-lived execution model can't hold a long-lived socket |
| Background enrichment jobs | Lambda or async ECS task | Not on the user-facing hot path |
Intuition From This Scenario
The decision to use containers is really a decision about what your traffic profile looks like. If you have a predictable baseline with intermittent bursts, hybrid compute almost always beats pure serverless and pure always-on. The container gives you the warm runtime, the cache, and the stable connection. Lambda gives you the safety valve. Neither alone is the full answer for a chatbot that serves millions of manga fans.