# 07: Domain 1 Scenario Map
## Purpose
This file turns the Domain 1 exam outline into a scenario-design blueprint. Use it when you want to:
- decide which scenarios to build next for a skill,
- keep scenario packs aligned to the AWS AIP-C01 objectives,
- ask stronger architecture and review questions instead of only collecting notes.
## Recommended Scenario Pattern
For each skill, keep the same structure:
- 03-scenarios-and-runbooks.md as the skill index
- scenarios/01-...md, 02-...md, and so on as focused deep dives
- Questions inside every scenario:
  - What changed in the business or traffic pattern?
  - What architecture decision created the failure?
  - Which AWS service, limit, or pattern matters most?
  - What would you measure before and after the fix?
## Task 1.1 Scenario Themes

| Skill | Core scenario themes | Questions to ask |
| --- | --- | --- |
| 1.1.1 Architectural design | wrong model tier, sync vs async mismatch, shared read/write bottlenecks, cold-start chains, context-budget failure | Are we over-designing the critical path? Where should the FM be strong versus cheap? |
| 1.1.2 Technical proof-of-concept | unrealistic eval set, no concurrency validation, protocol change after the PoC, bad cost assumptions, weak raw-input coverage | What did the PoC actually prove? Which unknowns were left untested? |
| 1.1.3 Standardized components | client drift, prompt drift, metrics drift, contract drift, IAM drift | Which parts must be shared platform components instead of team-by-team code? |
Task 1.1 is now fully split into separate scenario files under each skill folder.
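One recurring 1.1.1 theme, context-budget failure, lends itself to a concrete guard: drop the oldest conversation turns until the prompt fits the budget. The sketch below is illustrative only; the function name is hypothetical and the word-count tokenizer is a crude stand-in for a real model tokenizer.

```python
def fit_context_budget(system_prompt, history, max_tokens,
                       count_tokens=lambda s: len(s.split())):
    """Drop oldest turns until system prompt + history fit the token budget.

    count_tokens defaults to word count, a placeholder for a real tokenizer.
    """
    kept = list(history)
    while kept and (count_tokens(system_prompt)
                    + sum(count_tokens(turn) for turn in kept)) > max_tokens:
        kept.pop(0)  # evict the oldest turn first
    return kept

# With a budget of 5 "tokens", the oldest turn gets evicted.
trimmed = fit_context_budget("sys", ["one two", "three four five", "six"], 5)
```

The eviction policy (oldest-first) is itself an architecture decision worth a scenario: summarization or relevance-based eviction are common alternatives.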
## Task 1.2 Scenario Themes

| Skill | Core scenario themes | Questions to ask |
| --- | --- | --- |
| 1.2.1 FM assessment and selection | leaderboard winner fails business use case, benchmark does not match language mix, long-context need ignored | Which benchmark is closest to the real workload? |
| 1.2.2 Dynamic model selection | hardcoded model IDs, routing policy hidden in code, canary routing without observability | Can we switch providers without touching the caller? |
| 1.2.3 Resilient AI systems | regional outage, quota exhaustion, primary-model quality regression, fallback path too expensive | What is the degraded mode, not just the happy path? |
| 1.2.4 FM customization lifecycle | tuned model promoted without lineage, rollback path missing, stale tuned model outlives source data | Who owns model retirement and rollback? |
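The 1.2.2 and 1.2.3 themes (hardcoded model IDs, quota exhaustion, fallback paths) combine into one pattern: callers hold a routing config, not a model ID. A minimal sketch; `RouteConfig`, the model IDs, and the fake backend are all hypothetical stand-ins for whatever invocation client a real system would use.

```python
from dataclasses import dataclass, field

@dataclass
class RouteConfig:
    """Routing policy lives in config, not in caller code (illustrative shape)."""
    primary: str
    fallbacks: list = field(default_factory=list)

def invoke_with_fallback(route, call_model, prompt):
    """Try the primary model, then each fallback; return which model answered."""
    last_error = None
    for model_id in [route.primary, *route.fallbacks]:
        try:
            return model_id, call_model(model_id, prompt)
        except RuntimeError as exc:  # stand-in for throttling/outage errors
            last_error = exc
    raise RuntimeError(f"all routes exhausted: {last_error}")

# Simulated backend: the primary is throttled, so the fallback answers.
def fake_backend(model_id, prompt):
    if model_id == "primary-model":
        raise RuntimeError("quota exhausted")
    return f"{model_id} answered: {prompt}"

route = RouteConfig(primary="primary-model", fallbacks=["fallback-model"])
model_used, answer = invoke_with_fallback(route, fake_backend, "hello")
```

Returning which model answered is deliberate: it is the observability hook the canary-routing scenario says is usually missing.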
## Task 1.3 Scenario Themes

| Skill | Core scenario themes | Questions to ask |
| --- | --- | --- |
| 1.3.1 Data validation | malformed payloads reach the FM, schema drift in upstream data, missing quality thresholds | What data should be rejected early instead of "handled" by the model? |
| 1.3.2 Multimodal processing | OCR and transcript timing mismatch, image metadata lost, audio pipeline language mismatch | Where does modality-specific preprocessing belong? |
| 1.3.3 Input formatting | wrong Bedrock payload shape, chat history assembled incorrectly, tool-call format mismatch | Which formatting rules are model-specific versus reusable? |
| 1.3.4 Input enhancement | normalization changes meaning, entity extraction misses domain aliases, cleanup increases latency too much | How do we know enrichment is helping rather than distorting? |
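The 1.3.1 principle of rejecting data early can be made concrete with a small pre-FM validator that returns every problem at once instead of letting the model "handle" garbage. The field names (`text`, `source_id`) and the length limit are illustrative, not a real schema.

```python
def validate_payload(payload):
    """Return a list of validation errors; an empty list means proceed.

    Fields and limits are illustrative assumptions, not any specific schema.
    """
    errors = []
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        errors.append("text: required non-empty string")
    elif len(text) > 20_000:
        errors.append("text: exceeds maximum length")
    if "source_id" not in payload:
        errors.append("source_id: required for traceability")
    return errors
```

Collecting all errors (rather than failing on the first) makes upstream schema drift visible in one rejection, which is what the quality-threshold scenarios need to measure.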
## Task 1.4 Scenario Themes

| Skill | Core scenario themes | Questions to ask |
| --- | --- | --- |
| 1.4.1 Vector database architecture | wrong store for update pattern, metadata not queryable, vector-only design hurts filtering | What retrieval pattern are we optimizing for? |
| 1.4.2 Metadata frameworks | missing timestamps, weak ownership tags, inconsistent domain labels | Which metadata fields change ranking quality the most? |
| 1.4.3 High-performance vector search | shard hot spots, oversized indexes, poor tenant isolation | What breaks first at 10x scale? |
| 1.4.4 Integration components | wiki connector duplicates docs, access controls lost during sync, source identifiers not preserved | Can we trace every chunk back to a source of truth? |
| 1.4.5 Data maintenance systems | stale embeddings, full reindex where delta sync is needed, source deletions not propagated | How do we prove the vector store is current? |
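The 1.4.5 question, how do we prove the vector store is current, is commonly answered with hash-based delta sync: store a content hash at embedding time, then diff against the source. A minimal sketch, assuming documents are keyed by ID; the function names are hypothetical.

```python
import hashlib

def content_hash(text):
    """Stable fingerprint of document content at embedding time."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def plan_delta_sync(source_docs, indexed_hashes):
    """Compare source of truth against the index's recorded hashes.

    source_docs: doc_id -> current text; indexed_hashes: doc_id -> hash at index time.
    Returns (docs to re-embed/upsert, docs to delete from the index).
    """
    to_upsert = [doc_id for doc_id, text in source_docs.items()
                 if indexed_hashes.get(doc_id) != content_hash(text)]
    to_delete = [doc_id for doc_id in indexed_hashes
                 if doc_id not in source_docs]
    return to_upsert, to_delete
```

The `to_delete` list is the part most pipelines forget, which is exactly the "source deletions not propagated" scenario above.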
## Task 1.5 Scenario Themes

| Skill | Core scenario themes | Questions to ask |
| --- | --- | --- |
| 1.5.1 Document segmentation | chunks too small to preserve meaning, chunks too large for reranking, headings lost | What information boundary should define a chunk? |
| 1.5.2 Embedding solutions | embedding model too expensive, dimension choice hurts recall, multilingual mismatch | Are we optimizing for recall, cost, or latency? |
| 1.5.3 Vector search deployment | managed KB too limited, pgvector under-provisioned, OpenSearch config too heavy for workload | What is the simplest search stack that still meets the SLA? |
| 1.5.4 Advanced search | hybrid scoring unbalanced, reranker too slow, keyword-only queries buried by semantic noise | When should lexical matching outrank semantic similarity? |
| 1.5.5 Query handling | query decomposition overfires, expansion creates drift, transformation hides user intent | How do we know query rewriting improves retrieval instead of rewriting the problem? |
| 1.5.6 Consistent access mechanisms | retrieval APIs vary by team, tool-calling contract unstable, MCP adapter missing metadata | What retrieval contract should every FM-facing component rely on? |
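For the 1.5.4 hybrid-scoring theme, one widely used way to combine lexical and semantic result lists is reciprocal rank fusion (RRF). A minimal sketch; inputs are plain ranked ID lists, and k=60 is the commonly cited default that damps the advantage of any single top rank.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked doc-id lists (e.g., lexical + semantic) into one.

    Each document scores 1 / (k + rank) per list it appears in; documents
    present in multiple lists accumulate score and rise in the fused order.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["a", "b", "c"]   # keyword matches, best first
semantic = ["b", "d", "a"]  # vector neighbors, best first
fused = reciprocal_rank_fusion([lexical, semantic])
```

Note how "b" wins the fused ranking despite topping neither list, because it ranks well in both; that is the unbalanced-scoring scenario in reverse.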
## Task 1.6 Scenario Themes

| Skill | Core scenario themes | Questions to ask |
| --- | --- | --- |
| 1.6.1 Instruction frameworks | prompt role confusion, output schema ignored, weak abstention policy | Which instruction belongs in prompt text versus guardrails? |
| 1.6.2 Interactive AI systems | memory grows without control, clarification loops stall, durable state mixed with chat state | What must persist across turns, and for how long? |
| 1.6.3 Prompt governance | prompt changed outside review, version not logged, rollback impossible | Who can change a production prompt and how is it audited? |
| 1.6.4 Prompt QA | regression suite too small, safety tests missing, output checks too brittle | What failures should block promotion automatically? |
| 1.6.5 Prompt optimization | endless prompt tweaking hides retrieval issues, A/B results noisy, token cost rises silently | When has prompt work hit diminishing returns? |
| 1.6.6 Complex prompt systems | monolithic prompt should be decomposed, branching logic untested, step outputs not validated | Which tasks deserve a flow and which should stay single-step? |
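The 1.6.3 governance questions can be grounded in a tiny registry sketch: content-hashed versions, an append-only audit log, and an explicit rollback path. All class, method, and field names here are illustrative assumptions, not any AWS or library API.

```python
import datetime
import hashlib

class PromptRegistry:
    """Minimal prompt-governance sketch: versioned, audited, reversible."""

    def __init__(self):
        self.versions = {}   # (prompt_name, version) -> prompt text
        self.active = {}     # prompt_name -> currently deployed version
        self.audit_log = []  # append-only record of every change

    def _record(self, name, version, author, action):
        self.audit_log.append({
            "name": name, "version": version, "author": author,
            "action": action,
            "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        })

    def publish(self, name, text, author):
        """Register a new prompt version, keyed by content hash, and activate it."""
        version = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        self.versions[(name, version)] = text
        self.active[name] = version
        self._record(name, version, author, "publish")
        return version

    def rollback(self, name, version, author):
        """Reactivate a previously published version; fails on unknown versions."""
        if (name, version) not in self.versions:
            raise KeyError(f"unknown version {version} for prompt {name}")
        self.active[name] = version
        self._record(name, version, author, "rollback")
```

Because versions are content hashes, an out-of-band edit produces a new version rather than silently overwriting the old one, which is the "prompt changed outside review" failure above.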
## Build Order
If you want to keep extending Domain 1 in a disciplined way, the next high-value scenario packs are:
- Task 1.2 for model routing and resilience
- Task 1.5 for retrieval quality and search architecture
- Task 1.6 for prompt governance and prompt-system reliability
That order keeps the study path close to how real systems fail: first architecture, then model operations, then retrieval, then prompt operations.