HLD Interview Questions — Study Guide Index

Deep-dive answers to all 50 HLD interview questions, organized by topic.

Documents

#	Document	Topics	Questions
01	Architecture Overview & Core Components	System layers, WebSocket, API Gateway, auth, rate limiting, Lambda vs ECS, monolith vs microservices	Q1–Q5, Q12, Q21, Q40
02	Intent Classification & Orchestration	Intent catalog, DistilBERT classifier, fan-out routing, adding new intents	Q6, Q11, Q39
03	Conversation Memory & Context Management	DynamoDB schema, TTL, multi-turn context, circuit breaker, ElastiCache hot path	Q7, Q13, Q23
04	RAG Pipeline & LLM Response Generation	Offline indexing, OpenSearch, Bedrock APIs, hallucination prevention, model adapter	Q8, Q9, Q18, Q22, Q24, Q25
05	Recommendations, Personalization & Caching	Amazon Personalize, cold start, Redis caching, cache invalidation, feedback loop	Q10, Q16, Q29
06	Scalability, Performance & Cost	Traffic spikes, latency optimization, cost breakdown, multi-storefront	Q19, Q26, Q27, Q35
07	Fault Tolerance & Reliability	Circuit breaker, graceful degradation tiers, 99.95% SLA, chaos engineering	Q17, Q23, Q34, Q36
08	Security, Safety & Guardrails	Guardrails pipeline, Bedrock Guardrails, prompt injection defense, PII, GDPR delete	Q14, Q28, Q30, Q37
09	Analytics & Observability	Kinesis pipeline, 4-tier metrics, feedback loop, A/B testing prompts	Q15, Q20, Q33
10	Testing & Deployment Strategy	9-layer test strategy, golden set eval, chaos tests, LLM canary deployment	Q31, Q38
11	Architect-Level Strategy & Business Alignment	Flywheel, build vs. buy, ROI, 3-year evolution, competitive moat, shutdown criteria	Q41–Q50

Questions by Difficulty

Easy (Q1–Q10)

Q	Question	Document
Q1	What is the overall architecture of MangaAssist?	01
Q2	Why use WebSocket instead of REST for the chat interface?	01
Q3	How does the system authenticate users?	01
Q4	What is the role of API Gateway?	01
Q5	How does rate limiting work?	01
Q6	How does the intent classifier work?	02
Q7	How does conversation memory work?	03
Q8	How does the RAG pipeline work?	04
Q9	How does the LLM generate a response?	04
Q10	How do recommendations work?	05

Medium (Q11–Q25)

Q	Question	Document
Q11	How does the orchestrator fan out requests?	02
Q12	How does the system handle streaming responses?	01
Q13	How does the circuit breaker pattern prevent cascading failures?	03
Q14	How does the guardrails pipeline work?	08
Q15	How does the analytics pipeline work?	09
Q16	How does caching work end-to-end?	05
Q17	What happens if the order service goes down?	07
Q18	How is OpenSearch populated and kept up to date?	04
Q19	How does the system handle a 10x traffic spike?	06
Q20	How are user feedback signals collected and used?	09
Q21	How does the system support multiple storefronts?	01
Q22	How does the system prevent hallucinated product recommendations?	04
Q23	What happens if DynamoDB is unavailable?	03
Q24	How do you choose between different LLM models?	04
Q25	How does RAG handle product catalog updates?	04

Hard (Q26–Q39)

Q	Question	Document
Q26	How do you optimize end-to-end latency?	06
Q27	What is the cost per conversation?	06
Q28	How do you protect against prompt injection?	08
Q29	How do async patterns improve performance?	05
Q30	How does the system handle PII?	08
Q31	How do you roll out a new LLM version safely?	10
Q33	How does A/B testing work for prompts?	09
Q34	How do you achieve 99.95% SLA?	07
Q35	How does the architecture scale from 100K to 10M conversations/day?	06
Q36	How does chaos engineering work?	07
Q37	How do you implement GDPR right-to-be-forgotten?	08
Q38	What is the end-to-end testing strategy before launch?	10
Q39	How do you add a new intent without breaking existing ones?	02

Architect Level (Q40–Q50)

Q	Question	Document
Q40	How would you design this differently if starting over today?	01
Q41	What flywheel effects does this create?	11
Q42	Chatbot vs. improved search — which is more valuable?	11
Q43	What are the three biggest risks?	11
Q44	How do you measure ROI?	11
Q45	How does the architecture evolve over 3 years?	11
Q46	How do you defend against a manga-specialized competitor?	11
Q47	What organizational challenges did this face?	11
Q48	Which 3 intents would you launch with first?	11
Q49	Build vs. buy — which components?	11
Q50	When would you shut this project down?	11

Quick Reference — Key Services Map

AWS Service	Used for	Document
Amazon Bedrock (Claude 3.5)	LLM response generation	04
Amazon SageMaker	Intent classifier hosting (DistilBERT)	02
Amazon DynamoDB	Conversation history, session state	03
Amazon OpenSearch Serverless	RAG vector index	04
Amazon ElastiCache (Redis)	Hot path caching	05
Amazon Personalize	Product recommendations	05
Amazon Kinesis	Analytics event streaming	09
Amazon Redshift	Long-term analytics	09
Amazon ECS Fargate	Core orchestration service	01
AWS Lambda	Event processors, lightweight handlers	01
Amazon API Gateway	WebSocket management	01
AWS Step Functions	Complex multi-step workflows	02
Amazon Connect	Human agent escalation	07
AWS Fault Injection Service (formerly Fault Injection Simulator)	Chaos testing	10