# 07: Domain 1 Scenario Map
## Purpose
This file turns the Domain 1 exam outline into a scenario-design blueprint. Use it when you want to:
- decide which scenarios to build next for a skill,
- keep scenario packs aligned to the AWS AIP-C01 objectives,
- ask stronger architecture and review questions instead of only collecting notes.
## Recommended Scenario Pattern
For each skill, keep the same structure:
- 03-scenarios-and-runbooks.md as the skill index
- scenarios/01-...md, 02-...md, and so on as focused deep dives
- Questions inside every scenario:
  - What changed in the business or traffic pattern?
  - What architecture decision created the failure?
  - Which AWS service, limit, or pattern matters most?
  - What would you measure before and after the fix?
## Task 1.1 Scenario Themes

| Skill | Core scenario themes | Questions to ask |
| --- | --- | --- |
| 1.1.1 Architectural design | wrong model tier, sync vs async mismatch, shared read/write bottlenecks, cold-start chains, context-budget failure | Are we over-designing the critical path? Where should the FM be strong versus cheap? |
| 1.1.2 Technical proof-of-concept | unrealistic eval set, no concurrency validation, protocol change after the PoC, bad cost assumptions, weak raw-input coverage | What did the PoC actually prove? Which unknowns were left untested? |
| 1.1.3 Standardized components | client drift, prompt drift, metrics drift, contract drift, IAM drift | Which parts must be shared platform components instead of team-by-team code? |
Task 1.1 is now fully split into separate scenario files under each skill folder.
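One recurring 1.1.1 theme, context-budget failure, lends itself to a concrete guard: drop the oldest conversation turns until the prompt fits the budget. The sketch below is illustrative only; the function name is hypothetical and the word-count tokenizer is a crude stand-in for a real model tokenizer.

```python
def fit_context_budget(system_prompt, history, max_tokens,
                       count_tokens=lambda s: len(s.split())):
    """Drop oldest turns until system prompt + history fit the token budget.

    count_tokens defaults to word count, a placeholder for a real tokenizer.
    """
    kept = list(history)
    while kept and (count_tokens(system_prompt)
                    + sum(count_tokens(turn) for turn in kept)) > max_tokens:
        kept.pop(0)  # evict the oldest turn first
    return kept

# With a budget of 5 "tokens", the oldest turn gets evicted.
trimmed = fit_context_budget("sys", ["one two", "three four five", "six"], 5)
```

The eviction policy (oldest-first) is itself an architecture decision worth a scenario: summarization or relevance-based eviction are common alternatives.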
## Task 1.2 Scenario Themes

| Skill | Core scenario themes | Questions to ask |
| --- | --- | --- |
| 1.2.1 FM assessment and selection | leaderboard winner fails business use case, benchmark does not match language mix, long-context need ignored | Which benchmark is closest to the real workload? |
| 1.2.2 Dynamic model selection | hardcoded model IDs, routing policy hidden in code, canary routing without observability | Can we switch providers without touching the caller? |
| 1.2.3 Resilient AI systems | regional outage, quota exhaustion, primary-model quality regression, fallback path too expensive | What is the degraded mode, not just the happy path? |
| 1.2.4 FM customization lifecycle | tuned model promoted without lineage, rollback path missing, stale tuned model outlives source data | Who owns model retirement and rollback? |
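The 1.2.2 and 1.2.3 themes (hardcoded model IDs, quota exhaustion, fallback paths) combine into one pattern: callers hold a routing config, not a model ID. A minimal sketch; `RouteConfig`, the model IDs, and the fake backend are all hypothetical stand-ins for whatever invocation client a real system would use.

```python
from dataclasses import dataclass, field

@dataclass
class RouteConfig:
    """Routing policy lives in config, not in caller code (illustrative shape)."""
    primary: str
    fallbacks: list = field(default_factory=list)

def invoke_with_fallback(route, call_model, prompt):
    """Try the primary model, then each fallback; return which model answered."""
    last_error = None
    for model_id in [route.primary, *route.fallbacks]:
        try:
            return model_id, call_model(model_id, prompt)
        except RuntimeError as exc:  # stand-in for throttling/outage errors
            last_error = exc
    raise RuntimeError(f"all routes exhausted: {last_error}")

# Simulated backend: the primary is throttled, so the fallback answers.
def fake_backend(model_id, prompt):
    if model_id == "primary-model":
        raise RuntimeError("quota exhausted")
    return f"{model_id} answered: {prompt}"

route = RouteConfig(primary="primary-model", fallbacks=["fallback-model"])
model_used, answer = invoke_with_fallback(route, fake_backend, "hello")
```

Returning which model answered is deliberate: it is the observability hook the canary-routing scenario says is usually missing.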
## Task 1.3 Scenario Themes

| Skill | Core scenario themes | Questions to ask |
| --- | --- | --- |
| 1.3.1 Data validation | malformed payloads reach the FM, schema drift in upstream data, missing quality thresholds | What data should be rejected early instead of "handled" by the model? |
| 1.3.2 Multimodal processing | OCR and transcript timing mismatch, image metadata lost, audio pipeline language mismatch | Where does modality-specific preprocessing belong? |
| 1.3.3 Input formatting | wrong Bedrock payload shape, chat history assembled incorrectly, tool-call format mismatch | Which formatting rules are model-specific versus reusable? |
| 1.3.4 Input enhancement | normalization changes meaning, entity extraction misses domain aliases, cleanup increases latency too much | How do we know enrichment is helping rather than distorting? |
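The 1.3.1 principle of rejecting data early can be made concrete with a small pre-FM validator that returns every problem at once instead of letting the model "handle" garbage. The field names (`text`, `source_id`) and the length limit are illustrative, not a real schema.

```python
def validate_payload(payload):
    """Return a list of validation errors; an empty list means proceed.

    Fields and limits are illustrative assumptions, not any specific schema.
    """
    errors = []
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        errors.append("text: required non-empty string")
    elif len(text) > 20_000:
        errors.append("text: exceeds maximum length")
    if "source_id" not in payload:
        errors.append("source_id: required for traceability")
    return errors
```

Collecting all errors (rather than failing on the first) makes upstream schema drift visible in one rejection, which is what the quality-threshold scenarios need to measure.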
## Task 1.4 Scenario Themes

| Skill | Core scenario themes | Questions to ask |
| --- | --- | --- |
| 1.4.1 Vector database architecture | wrong store for update pattern, metadata not queryable, vector-only design hurts filtering | What retrieval pattern are we optimizing for? |
| 1.4.2 Metadata frameworks | missing timestamps, weak ownership tags, inconsistent domain labels | Which metadata fields change ranking quality the most? |
| 1.4.3 High-performance vector search | shard hot spots, oversized indexes, poor tenant isolation | What breaks first at 10x scale? |
| 1.4.4 Integration components | wiki connector duplicates docs, access controls lost during sync, source identifiers not preserved | Can we trace every chunk back to a source of truth? |
| 1.4.5 Data maintenance systems | stale embeddings, full reindex where delta sync is needed, source deletions not propagated | How do we prove the vector store is current? |
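The 1.4.5 question, how do we prove the vector store is current, is commonly answered with hash-based delta sync: store a content hash at embedding time, then diff against the source. A minimal sketch, assuming documents are keyed by ID; the function names are hypothetical.

```python
import hashlib

def content_hash(text):
    """Stable fingerprint of document content at embedding time."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def plan_delta_sync(source_docs, indexed_hashes):
    """Compare source of truth against the index's recorded hashes.

    source_docs: doc_id -> current text; indexed_hashes: doc_id -> hash at index time.
    Returns (docs to re-embed/upsert, docs to delete from the index).
    """
    to_upsert = [doc_id for doc_id, text in source_docs.items()
                 if indexed_hashes.get(doc_id) != content_hash(text)]
    to_delete = [doc_id for doc_id in indexed_hashes
                 if doc_id not in source_docs]
    return to_upsert, to_delete
```

The `to_delete` list is the part most pipelines forget, which is exactly the "source deletions not propagated" scenario above.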
## Task 1.5 Scenario Themes

| Skill | Core scenario themes | Questions to ask |
| --- | --- | --- |
| 1.5.1 Document segmentation | chunks too small to preserve meaning, chunks too large for reranking, headings lost | What information boundary should define a chunk? |
| 1.5.2 Embedding solutions | embedding model too expensive, dimension choice hurts recall, multilingual mismatch | Are we optimizing for recall, cost, or latency? |
| 1.5.3 Vector search deployment | managed KB too limited, pgvector under-provisioned, OpenSearch config too heavy for workload | What is the simplest search stack that still meets the SLA? |
| 1.5.4 Advanced search | hybrid scoring unbalanced, reranker too slow, keyword-only queries buried by semantic noise | When should lexical matching outrank semantic similarity? |
| 1.5.5 Query handling | query decomposition overfires, expansion creates drift, transformation hides user intent | How do we know query rewriting improves retrieval instead of rewriting the problem? |
| 1.5.6 Consistent access mechanisms | retrieval APIs vary by team, tool-calling contract unstable, MCP adapter missing metadata | What retrieval contract should every FM-facing component rely on? |
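For the 1.5.4 hybrid-scoring theme, one widely used way to combine lexical and semantic result lists is reciprocal rank fusion (RRF). A minimal sketch; inputs are plain ranked ID lists, and k=60 is the commonly cited default that damps the advantage of any single top rank.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked doc-id lists (e.g., lexical + semantic) into one.

    Each document scores 1 / (k + rank) per list it appears in; documents
    present in multiple lists accumulate score and rise in the fused order.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["a", "b", "c"]   # keyword matches, best first
semantic = ["b", "d", "a"]  # vector neighbors, best first
fused = reciprocal_rank_fusion([lexical, semantic])
```

Note how "b" wins the fused ranking despite topping neither list, because it ranks well in both; that is the unbalanced-scoring scenario in reverse.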
## Task 1.6 Scenario Themes

| Skill | Core scenario themes | Questions to ask |
| --- | --- | --- |
| 1.6.1 Instruction frameworks | prompt role confusion, output schema ignored, weak abstention policy | Which instruction belongs in prompt text versus guardrails? |
| 1.6.2 Interactive AI systems | memory grows without control, clarification loops stall, durable state mixed with chat state | What must persist across turns, and for how long? |
| 1.6.3 Prompt governance | prompt changed outside review, version not logged, rollback impossible | Who can change a production prompt and how is it audited? |
| 1.6.4 Prompt QA | regression suite too small, safety tests missing, output checks too brittle | What failures should block promotion automatically? |
| 1.6.5 Prompt optimization | endless prompt tweaking hides retrieval issues, A/B results noisy, token cost rises silently | When has prompt work hit diminishing returns? |
| 1.6.6 Complex prompt systems | monolithic prompt should be decomposed, branching logic untested, step outputs not validated | Which tasks deserve a flow and which should stay single-step? |
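The 1.6.3 governance questions can be grounded in a tiny registry sketch: content-hashed versions, an append-only audit log, and an explicit rollback path. All class, method, and field names here are illustrative assumptions, not any AWS or library API.

```python
import datetime
import hashlib

class PromptRegistry:
    """Minimal prompt-governance sketch: versioned, audited, reversible."""

    def __init__(self):
        self.versions = {}   # (prompt_name, version) -> prompt text
        self.active = {}     # prompt_name -> currently deployed version
        self.audit_log = []  # append-only record of every change

    def _record(self, name, version, author, action):
        self.audit_log.append({
            "name": name, "version": version, "author": author,
            "action": action,
            "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        })

    def publish(self, name, text, author):
        """Register a new prompt version, keyed by content hash, and activate it."""
        version = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        self.versions[(name, version)] = text
        self.active[name] = version
        self._record(name, version, author, "publish")
        return version

    def rollback(self, name, version, author):
        """Reactivate a previously published version; fails on unknown versions."""
        if (name, version) not in self.versions:
            raise KeyError(f"unknown version {version} for prompt {name}")
        self.active[name] = version
        self._record(name, version, author, "rollback")
```

Because versions are content hashes, an out-of-band edit produces a new version rather than silently overwriting the old one, which is the "prompt changed outside review" failure above.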
## Build Order
If you want to keep extending Domain 1 in a disciplined way, the next high-value scenario packs are:
- Task 1.2 for model routing and resilience
- Task 1.5 for retrieval quality and search architecture
- Task 1.6 for prompt governance and prompt-system reliability
That order keeps the study path close to how real systems fail: first architecture, then model operations, then retrieval, then prompt operations.