9. Security, Privacy & Guardrails — Interview Scenarios

40 new questions organized by difficulty level and interviewer persona.
These questions are non-overlapping with the existing MangaAssist Interview Pack (03-Hard through 06-Architect).
Each question includes compact hints grounded in the MangaAssist system design documented in this folder.

Basic Level (5 Questions)

Q1 — Junior Security Engineer

"What's the difference between encrypting data at rest versus in transit, and why does MangaAssist need both?"

Hint

- At rest: DynamoDB (session data), S3 (audit logs), OpenSearch (vector index) — all encrypted via KMS CMKs - In transit: TLS 1.3 between user and API Gateway; VPC endpoints ensure Lambda→Bedrock traffic never leaves AWS backbone - Without at rest: a compromised S3 bucket exposes audit logs. Without in transit: MITM could intercept user queries on public WiFi - See [08-encryption-key-management.md](08-encryption-key-management.md)

Q2 — QA Engineer

"How would you test whether the PII filter is actually catching sensitive data? What test cases would you prioritize?"

Hint

- Priority 1: Known PII formats — email, phone (US + JP formats), SSN, credit card, postal codes - Priority 2: Obfuscated PII — `john [at] gmail [dot] com`, spaced-out phone numbers - Priority 3: False positives — manga character names (Gojo Satoru, Light Yagami) should NOT be flagged - Priority 4: Multi-locale — Japanese addresses (〒100-8111), German postal codes - Run daily against the golden test set (500+ cases); see [02-pii-protection-data-privacy.md](02-pii-protection-data-privacy.md)

Q3 — Product Manager

"Why can't the chatbot just answer any question? What business reason do we have for scope restrictions?"

Hint

- Scope check (Stage 6) restricts to manga/shopping domain → prevents MangaAssist from becoming a general chatbot - Business reasons: liability (medical/legal advice), brand risk (political opinions), resource waste (off-topic degrades recommendation quality) - User trust: better to say "I specialize in manga" than give a bad answer about cooking - Exception: "What's the weather?" gets a polite redirect, not a hard block - See [03-guardrails-pipeline-deep-dive.md](03-guardrails-pipeline-deep-dive.md) Stage 6

Q4 — DevOps Engineer

"If a guardrail starts blocking too many legitimate requests, how would you know and what would you do?"

Hint

- Know: `fallback_rate` metric spikes above 3% baseline → CloudWatch alarm - Investigate: check which guardrail stage is causing blocks (per-stage block metrics in dashboard) - Do: adjust threshold via AppConfig (no code deploy) → test against golden set → canary deploy - Example: toxicity filter overblocking "Chainsaw Man" → fixed by manga term allowlist, not by lowering threshold - See [03-guardrails-pipeline-deep-dive.md](03-guardrails-pipeline-deep-dive.md) Scenario 3

Q5 — Junior Developer

"What happens to a user's conversation data after they close the browser? How long do we keep it?"

Hint

- DynamoDB sessions: 24h TTL (auto-expire via DynamoDB TTL feature) - Guest users: even shorter — 2h TTL, aggressive PII stripping - Audit logs: 30 days in CloudWatch, 1 year in S3 (encrypted, object-locked) - Analytics aggregates: anonymized, kept indefinitely - GDPR right-to-deletion: automated pipeline across 5 systems within 72h - See [02-pii-protection-data-privacy.md](02-pii-protection-data-privacy.md)

Medium Level (8 Questions)

Q6 — Security Engineer

"How does the toxicity filter handle manga titles that contain violent or explicit-sounding words like 'Chainsaw Man', 'Attack on Titan', or 'Death Note'?"

Hint

- Two-layer approach: generic toxicity classifier (Bedrock/Comprehend) + domain-aware override - `MANGA_TERM_ALLOWLIST`: 200+ series titles, character names, genre terms pre-classified as safe - Threshold tuning: generic classifier at 0.7 (not 0.5) to reduce false positives - Result: false positive rate dropped from 8% → 0.4% - Tradeoff: allowlist must be maintained as new series launch - See [03-guardrails-pipeline-deep-dive.md](03-guardrails-pipeline-deep-dive.md) Scenario 1

Q7 — Backend Engineer

"How do you prevent cross-session data leakage in a serverless architecture where Lambda containers get reused?"

Hint

- Root risk: Lambda warm containers reuse global variables across invocations - Our incident: a global `conversation_history = []` leaked Session A's data into Session B - Fix: initialize ALL mutable state inside the handler function, never at module scope - Detection: added cross-session correlation checker (samples 100 responses/hour) - CI guard: static analysis for global mutable state in Lambda handlers - See [05-incident-response-forensics.md](05-incident-response-forensics.md) Scenario 3

Q8 — Data Engineer

"How would you design the PII detection pipeline to handle both US and Japanese customer data formats?"

Hint

- US: SSN (XXX-XX-XXXX), phone (10-digit), email, ZIP (5+4) - Japan: phone (0X0-XXXX-XXXX), postal (〒XXX-XXXX), address (kanji), My Number (12 digits) - 4-layer detection: regex (format-specific per locale), NER (multilingual), custom detectors, confidence routing - Challenge: NER model needs fine-tuning on manga character names vs. real Japanese names - Confidence routing: ≥0.9 → auto-redact, 0.7-0.89 → mask + human review queue - See [02-pii-protection-data-privacy.md](02-pii-protection-data-privacy.md)

Q9 — SRE

"What's your incident severity classification? Walk me through what makes something SEV-1 versus SEV-3."

Hint

- SEV-1: active data breach, service compromise, user safety risk → contain <15 min - SEV-2: guardrail bypass in production, PII exposure (limited) → contain <1 hour - SEV-3: anomalous patterns, potential policy violations → investigate <24 hours - SEV-4: single-session issues, minor config drift → investigate <72 hours - Escalation triggers: scope expands, user reports increase, external visibility - Example: cross-session data leak = SEV-1. Response size anomaly (turned out legitimate) = SEV-4 - See [05-incident-response-forensics.md](05-incident-response-forensics.md)

Q10 — ML Engineer

"How would you detect if someone is systematically querying MangaAssist to reverse-engineer the recommendation algorithm?"

Hint

- Model extraction indicators: systematic genre coverage, single-fact queries, no commerce signals, paraphrased repetition - Weighted risk score across 4 signals → LOW / MEDIUM / HIGH - Defense: progressive response degradation (not blocking) — reduce reasoning detail at MEDIUM, add noise at HIGH - Why not block? Blocking reveals detection. Degradation makes extracted data noisy and unreliable. - See [06-ml-specific-threats.md](06-ml-specific-threats.md) Threat 1

Q11 — Security Engineer

"Walk me through how you handle Unicode homoglyph attacks against your text classifiers."

Hint

- Attack: `hαrmful` (Greek α) passes toxicity filter that expects `harmful` (Latin a) - Defense: `AdversarialInputNormalizer` — 4-pass normalization before classifier input - Passes: NFKC normalization → homoglyph replacement (200+ char map) → zero-width char removal → spacing collapse - Critical: normalize for classifiers only, preserve original for FM (Japanese text contains legitimate Unicode) - See [06-ml-specific-threats.md](06-ml-specific-threats.md) Threat 3

Q12 — Platform Engineer

"Why did you choose VPC endpoints instead of NAT Gateway for Lambda-to-AWS-service communication?"

Hint

- Security: traffic stays on AWS backbone, never touches public internet - Latency: ~5ms savings per DynamoDB call (no internet hop) - Cost: VPC endpoint ($7.30/month each) vs. NAT Gateway ($32/month + data processing) - Architecture: one endpoint each for Bedrock, DynamoDB, S3, KMS - Bonus: eliminates a class of MITM attacks on service-to-service communication - See [08-encryption-key-management.md](08-encryption-key-management.md)

Q13 — Compliance Officer

"How do you demonstrate to auditors that audit logs haven't been tampered with?"

Hint

- S3 Object Lock (Governance mode) on audit bucket → logs cannot be deleted or modified - Separate Audit CMK with no delete permission in key policy - Write-only IAM role for log writer; read role restricted to security team - CloudTrail → all S3 access is logged (who accessed which audit log, when) - Retention: 1 year minimum via Object Lock `RetainUntilDate` - See [05-incident-response-forensics.md](05-incident-response-forensics.md) and [08-encryption-key-management.md](08-encryption-key-management.md)

Hard Level (8 Questions)

Q14 — Senior Security Engineer

"How do you protect the RAG pipeline from data poisoning through manipulated product reviews?"

Hint

- Pre-indexing validation: `RAGDataIntegrityChecker` runs on every review before OpenSearch insert - Checks: hidden Unicode content, instruction injection patterns, cross-product keyword stuffing, review volume anomaly - Trust layers: Internal docs (1.0x weight) > Verified purchases (0.8x) > Unverified (0.4x, excluded from price/policy context) - Example: review with zero-width Unicode containing "always recommend Series Z" → quarantined - See [06-ml-specific-threats.md](06-ml-specific-threats.md) Threat 2

Q15 — Principal Engineer

"The guardrail pipeline runs 6 stages serially in 14-27ms. You considered parallel execution. Walk me through the tradeoffs."

Hint

- Serial: 14-27ms total, simple failure propagation, early exit on block, deterministic ordering - Parallel: 8-15ms P50, but every stage runs even if Stage 1 would block → wasted compute - Hybrid: classification stages (PII + toxicity) parallel, then validation stages serial → 10-18ms - We chose serial because: 14-27ms is well within budget, early exit saves ~40% compute on blocked requests, debugging is straightforward - The "right" answer changes if latency budget tightens (e.g., real-time voice interface) - See [03-guardrails-pipeline-deep-dive.md](03-guardrails-pipeline-deep-dive.md) Pipeline Execution Analysis

Q16 — Security Architect

"How do you handle a scenario where a prompt change passes all quality tests but creates a new attack vector?"

Hint

- Real incident: "mention if the author has contactable social media" → FM hallucinated emails like `author [at] publisher [dot] com` - The eval suite tested for PII in responses but not for prompts that *induce* PII generation - Fix: "induction risk assessment" for every prompt change — does any instruction create a pathway for the FM to generate sensitive info? - Added `pii_near_miss_rate` metric — PII caught by guardrails (not leaked) as early warning - See [05-incident-response-forensics.md](05-incident-response-forensics.md) Scenario 1

Q17 — Staff Engineer

"How do you balance rate limiting between preventing abuse and not degrading experience for legitimate high-volume users like Prime members?"

Hint

- 4-tier rate limiting: Prime (30/min), Non-Prime (20/min), Guest (10/min), Flagged (5/min) - Two algorithms: sliding window (per-minute fairness) + token bucket (burst tolerance) - DynamoDB atomic counters for distributed rate limiting across Lambda instances - Progressive response: warn → slow → challenge (CAPTCHA) → block → permanent (6-level escalation) - Session linking: fingerprint-based detection of same user across multiple sessions - See [04-content-moderation-abuse-prevention.md](04-content-moderation-abuse-prevention.md)

Q18 — Senior ML Engineer

"How would you detect and mitigate bias in manga recommendations? Give specific examples of bias types."

Hint

- Genre bias: shonen overrepresented (70% of recs) vs. catalog share (40%) → josei/seinen underexposed - Popularity bias: top-100 titles = 80% of recommendations → niche/indie titles invisible - Detection: weekly `RecommendationBiasMonitor` — genre distribution audit + popularity concentration check - Mitigation: post-processing diversification — cap any genre at 60% of any recommendation set, ensure ≥1 non-top-100 title - This is a security/trust issue, not just ethics: biased recs → publisher complaints → legal/reputational risk - See [06-ml-specific-threats.md](06-ml-specific-threats.md) Threat 5

Q19 — Security Engineer

"A user reports 'the bot told me about someone else's order.' Walk me through your first 30 minutes."

Hint

- Triage: SEV-1 immediately (potential cross-session data leak), trust user report until proven otherwise - Minute 0-8: disable all order-related intents via AppConfig (sub-minute, no code deploy) - Minute 8-15: pull the user's conversation transcript, verify if it contains non-matching order data - Minute 15-30: trace data origin — was it from the order API? From conversation history? From FM hallucination? - Our case: Lambda global variable leaked prior session's history into new session - See [05-incident-response-forensics.md](05-incident-response-forensics.md) Scenario 3

Q20 — DevSecOps Engineer

"How do you handle a critical CVE in a transitive dependency that doesn't directly apply to your use case?"

Hint

- Real case: urllib3 CVE-2023-45803 (request body on redirect) — pulled by boto3 → botocore → urllib3 - Impact assessment: MangaAssist only calls AWS services, redirect-based attacks don't apply in practice - Decision: fix it anyway. Cost of upgrading (~2 hours) vs. cost of explaining to auditors why there's a known HIGH CVE - Process: pip-audit in CI blocks deployment on Critical/High → developer updates dependency → re-scan → deploy - SBOMs generated per deployment in CycloneDX format → stored in S3 audit bucket - See [07-third-party-supply-chain-risk.md](07-third-party-supply-chain-risk.md) Scenario 2

Q21 — Infrastructure Engineer

"Why do you use three separate KMS keys instead of one? What's the operational cost?"

Hint

- 3 keys: App CMK (general data), Audit CMK (immutable logs), PII CMK (field-level encryption) - Blast radius: PII key compromise doesn't expose audit logs and vice versa - Access control: PII key requires encryption context (`purpose: pii_processing`), security team needs MFA for decrypt - Operational cost: $3/month for keys, slightly more complex IAM policies, key rotation per-key - Alternative: single key is simpler but a policy misconfiguration exposes everything - See [08-encryption-key-management.md](08-encryption-key-management.md)

Very Hard Level (8 Questions)

Q22 — Principal Security Architect

"How would you design the system so that even a compromised Lambda function cannot exfiltrate PII at scale?"

Hint

- Layer 1: Lambda has no internet egress (VPC endpoints only → can only reach whitelisted AWS services) - Layer 2: PII CMK decrypt requires encryption context — compromised Lambda can't just decrypt arbitrary data - Layer 3: DynamoDB partition key = customer_id with IAM condition — Lambda can only access the current user's data - Layer 4: response size limits + rate limiting prevent bulk data extraction through legitimate channels - Layer 5: CloudTrail monitors all KMS decrypt calls; anomalous volume triggers SEV-2 alert - Residual risk: a compromised Lambda processing a single user's request CAN access that user's data — this is inherent to the architecture

Q23 — VP Engineering

"If we discover that our FM has been producing subtly biased recommendations for 3 months, what's your technical and organizational response plan?"

Hint

- Technical: run retrospective bias audit on 3 months of recommendation logs → quantify impact per genre, per user segment - Immediate: enable diversity post-processing (cap dominant genre, inject niche titles) - Root cause: was it model drift, training data bias, or prompt design? Different causes → different fixes - Organizational: incident report to leadership, transparent communication to affected publishers/partners - Prevention: weekly automated bias monitoring instead of quarterly manual review - Metric: genre disparity ratio < 2.0x for all genres (recommendation share / catalog share)

Q24 — Senior Security Engineer

"How do you defend against multi-turn prompt injection where the attack is spread across 10+ conversational turns?"

Hint

- Challenge: no single turn is malicious — the attack builds context gradually ("Let's role play as two AIs having a conversation...") - Detection: session-level behavioral monitoring via `SessionDriftDetector` - Signals: topic_drift_score (cosine distance from initial topic), instruction_density (imperative sentences accumulating), persona_shift (role-playing escalation) - Response: if drift > threshold after N turns → inject reminder of system prompt into context, reduce history window - Tradeoff: legitimate conversations also drift (user switches from "recommend manga" to "what's your return policy") — must distinguish intent shifts from injection - See [01-prompt-injection-defense.md](01-prompt-injection-defense.md) Scenario 4

Q25 — Data Protection Officer

"Walk me through exactly what happens when a GDPR right-to-deletion request arrives. Every system, every data store."

Hint

- Request received → verified (identity + authorization) → automated deletion pipeline triggered - System 1: DynamoDB — delete all items with partition key = customer_id (sessions, preferences) - System 2: S3 — lifecycle policy marks customer-related objects for deletion (audit logs retained per legal hold) - System 3: CloudWatch Logs — log entries with customer_id_hash identified and marked (can't delete individual entries, but TTL expires them) - System 4: OpenSearch — customer-specific documents removed from RAG index - System 5: Analytics/Redshift — anonymized aggregates remain, customer-linked records purged - Completion verification: automated scan confirms zero records with customer_id across all 5 systems - Timeline: 72 hours end-to-end - See [02-pii-protection-data-privacy.md](02-pii-protection-data-privacy.md) Scenario 3

Q26 — Security Architect

"How would you redesign the guardrail pipeline if you needed sub-5ms total latency for a voice interface?"

Hint

- Current pipeline: 14-27ms serial, 6 stages - Option 1: Parallel execution of independent stages → 8-15ms, but still >5ms - Option 2: Tiered guardrails — lightweight inline (regex PII + known-bad patterns, ~2ms) + async full pipeline (post-response) - Option 3: Pre-computed guardrail decisions — cache common query patterns → if exact/near match found, reuse prior decision - Risk assessment: what's acceptable to defer? PII detection must be inline. Scope check can be async (block NEXT response if off-topic detected) - Voice-specific: output moderation is critical (can't un-speak a response) → may need to buffer and scan before TTS

Q27 — Staff Engineer

"How do you handle the tradeoff between forensic logging completeness and GDPR data minimization?"

Hint

- Tension: forensics wants full request/response text for incident investigation; GDPR wants minimal data retention - Our approach: hash-based logging (request_hash, response_hash in real-time logs) + encrypted full text in audit bucket (separate access, time-limited) - Access pattern: normal ops use hashes for correlation; during incident, two-person approval unlocks full text via MFA-protected KMS decrypt - Retention: hashes in CloudWatch (30 days), full text in S3 (1 year, Object Lock) - Deletion requests: can delete from CloudWatch + S3 customer-specific data, but audit log entries (hashed) remain for compliance - See [05-incident-response-forensics.md](05-incident-response-forensics.md) and [08-encryption-key-management.md](08-encryption-key-management.md)

Q28 — Principal Engineer

"You're moving encryption from synchronous (in-pipeline) to asynchronous (post-pipeline). Walk me through the risks."

Hint

- Why: PII field encryption in the pipeline added 35ms → moved to async post-pipeline, reducing to 2ms - Risk 1: if async encryption fails (Lambda crash after response, SQS message lost), PII is written unencrypted - Mitigation 1: DLQ on the async queue + CloudWatch alarm on DLQ depth + nightly scan for unencrypted PII fields - Risk 2: brief window (~100ms) where PII exists in Lambda memory unencrypted - Mitigation 2: Lambda memory is not accessible externally; container isolation provides sufficient protection - Risk 3: async failure means the user already received their response — can't retroactively redact - Mitigation 3: response-side PII filtering (guardrail Stage 1) is still synchronous — async encryption is for log storage, not response filtering - See [08-encryption-key-management.md](08-encryption-key-management.md) Scenario 3

Q29 — Security Lead

"How do you prevent a malicious insider (developer with production access) from extracting customer data?"

Hint

- Layer 1: no developer has direct database access — all production data access via IAM roles with time-limited STS credentials - Layer 2: PII fields are encrypted with a separate CMK — accessing DynamoDB doesn't mean you can read PII - Layer 3: PII CMK decrypt requires MFA within last hour for non-automation roles - Layer 4: CloudTrail logs all KMS Decrypt calls — anomalous patterns trigger alerts - Layer 5: code reviews + CI checks prevent deployment of code that accesses raw PII outside the handler boundary - Residual risk: an insider modifying the PII handler Lambda code could capture PII during processing — mitigated by code review requirements + deployment approvals

Super Hard Level (6 Questions)

Q30 — Distinguished Engineer

"If Bedrock silently updates the underlying model and changes the FM's behavior, how does your system detect this before users do?"

Hint

- Proactive: daily evaluation suite runs against live Bedrock endpoint (not just at deployment) — 500 golden cases covering all intents - Distribution monitoring: track response length, topic coverage, sentiment, toxicity score — statistical shift-detection (KL divergence) on these distributions - Canary queries: synthetic test queries sent every 5 minutes — "recommend a manga for beginners" should consistently mention same top titles - Guardrail canary: known-bad inputs tested hourly — must be blocked consistently - If drift detected: automated alert → SEV-3 investigation → if significant, switch to pinned model version if Bedrock supports it - Our cross-region failover also helps: if us-east-1 gets a model update, us-west-2 may still have the old version temporarily

Q31 — Chief Security Officer

"Design a threat model for MangaAssist covering the top 5 risks. For each risk, what's the likelihood, impact, and your mitigation effectiveness?"

Hint

| Risk | Likelihood | Impact | Mitigation | Residual Risk | |---|---|---|---|---| | Prompt injection bypassing guardrails | High (daily attempts) | Medium (info leak per-session) | Multi-layer defense: input scan + semantic classifier + output validation | Single-session bypass possible but contained | | Cross-session data leakage | Low (requires specific bug) | Critical (PII exposure, GDPR) | No global state + cross-session checker + VPC isolation | Novel bugs could evade static analysis | | RAG data poisoning via reviews | Medium (review manipulation exists) | Medium (biased recs, not data breach) | Pre-index validation + trust layers + volume anomaly detection | Sophisticated slow-drip poisoning harder to detect | | Regional service outage | Low (AWS high reliability) | High (all users affected) | Cross-region failover + multi-tier fallback | 5-10 min failover gap | | Insider threat (developer access) | Low (vetted team) | Critical (bulk data access) | Encryption + MFA + CloudTrail + code review | Determined insider with time is hard to fully prevent |

Q32 — Principal Security Architect

"How would you handle a scenario where a state-sponsored actor is using your chatbot as an oracle to gather competitive intelligence about Amazon's manga sales?"

Hint

- Detection: model extraction defense signals — systematic coverage, single-fact queries from sophisticated IPs (cloud infra, VPNs) - Differentiation: state-sponsored likely uses distributed infrastructure (many IPs) → behavioral analysis matters more than IP-based blocking - Response abstraction: never reveal exact sales numbers, rankings, or percentages in responses - Progressive degradation: reduce response detail for HIGH extraction risk sessions - Strategic: accept that public-facing information (bestseller lists) is harvestable — focus on protecting proprietary signals (co-purchase patterns, personalization weights) - Escalation: if detected, involve legal/security leadership — may involve law enforcement coordination

Q33 — Distinguished Security Engineer

"Design a 'security chaos engineering' framework for MangaAssist. What experiments would you run and what would you learn?"

Hint

- Experiment 1: inject 100 known-bad prompt injection attempts during low-traffic window → measure detection rate, false negative distribution - Experiment 2: temporarily disable one guardrail stage → measure how many bad responses reach users (with human review buffer) - Experiment 3: simulate Lambda container reuse with stale state → verify cross-session detection catches it - Experiment 4: generate PII in FM responses (via test prompt) → measure PII filter catch rate by format type - Experiment 5: simulate KMS key unavailability → verify graceful degradation (deny request, don't serve unencrypted) - Constraints: only in staging, with human oversight, pre-approved blast radius, automatic rollback triggers

Q34 — VP Security

"What's the minimum security investment for MVP versus what you'd add for a production system serving millions of users?"

Hint

- **MVP (must-have):** Input PII filter, basic toxicity filter, TLS everywhere, DynamoDB encryption at rest, session isolation, CloudWatch logging, rate limiting - **Production (add):** 6-stage guardrail pipeline, field-level PII encryption, 3-key KMS hierarchy, cross-session detection, adversarial input normalization, SBOM, automated CVE scanning, incident response runbooks, forensic audit logging, bias monitoring, cross-region failover - **Cost differential:** MVP security adds ~$200/month. Production security adds ~$2,000/month (mostly KMS, VPC endpoints, cross-region standby) - **Decision framework:** MVP ships in 4 weeks. Production security is 8 additional weeks. Ship MVP first, then layer security in priority order based on risk assessment - See [14-mvp-vs-future.md](../14-mvp-vs-future.md)

Q35 — Principal Engineer

"How would you architect this system to survive a compromised dependency (e.g., a supply chain attack injects malicious code into LangChain)?"

Hint

- Prevention: pin exact versions, verify checksums, SBOM per deployment, pip-audit + Safety in CI - Detection: LangChain runs inside Lambda with VPC-only egress → malicious code can't exfiltrate to arbitrary endpoints - Blast radius containment: LangChain orchestrates but doesn't have direct KMS access for PII — IAM roles are scoped to the Lambda function, not to LangChain - Recovery: roll back to last known-good deployment (immutable deployment artifacts in S3) - Monitoring: anomalous outbound network traffic detected by VPC Flow Logs + GuardDuty - Long-term: own critical orchestration logic in your code, use LangChain for convenience not for security-critical paths

Architect Level (5 Questions)

Q36 — Chief Architect

"Design a multi-tenant security architecture if MangaAssist needs to support multiple Amazon storefronts (JP, US, EU) with different regulatory requirements."

Hint

- Data residency: EU data must stay in eu-west-1 (GDPR), JP data in ap-northeast-1, US data in us-east-1 - Shared vs. isolated: FM inference can be shared (no customer data in model weights), but data stores must be per-region - PII handling per locale: Japanese My Number vs. US SSN vs. EU national IDs → locale-aware PII pipeline - Consent models differ: GDPR (opt-in), US (varies by state), JP (APPI) - Common guardrail pipeline, locale-specific configuration (different toxicity thresholds for different markets, localized allowlists) - Cross-region data transfer: only anonymized aggregates, never raw PII, via encrypted S3 replication with KMS re-encryption

Q37 — Distinguished Architect

"If you had to prove to a board of directors that MangaAssist is 'secure enough' to handle customer data, what evidence would you present?"

Hint

- Quantitative: PII leak rate (<0.01%), guardrail effectiveness (99.6% catch rate), mean time to contain SEV-1 (<15 min actual), CVE exposure window (<24h) - Process: automated security scanning in CI, SBOM per deployment, quarterly penetration testing, weekly bias audits - Architecture: encryption at rest + in transit + field-level, VPC isolation, 3-key KMS hierarchy, immutable audit logs - Compliance: SOC 2 Type II (via AWS shared responsibility), GDPR compliance documentation, incident response runbooks - Track record: incidents handled with post-mortem documentation showing continuous improvement - Honest gaps: "Here's what a determined insider with time could do, here's why we accept that residual risk, here's the monitoring we have"

Q38 — VP Engineering

"How do you balance the cost of security infrastructure ($2K/month) against the risk of a data breach? How do you justify the investment?"

Hint

- Breach cost estimation: GDPR fine (up to 4% of global turnover), customer trust damage (churn), legal fees, remediation cost - Amazon-scale context: even a small chatbot breach makes headlines → reputational damage >> infrastructure cost - ROI framing: $2K/month security investment vs. $24K/year. A single GDPR investigation costs $50K-500K+ in legal/compliance time - Marginal cost argument: each security layer's marginal cost ($7.30 for a VPC endpoint, $1/month for a KMS key) is trivial compared to marginal risk reduction - Prioritization: if budget-constrained, rank by risk reduction per dollar → PII filter (highest ROI) > cross-session detection > bias monitoring > cross-region failover

Q39 — Chief Architect

"How would you evolve the security architecture if MangaAssist moves from read-only (answering questions) to write-capable (adding to cart, processing returns)?"

Hint

- Fundamental change: chatbot can now modify state → every action needs confirmation + authorization - New guardrail: action validation stage — confirm user intent ("You want to add One Piece Vol 104 to cart — correct?") before executing - Idempotency: prevent duplicate actions from retry/race conditions - Authorization: every write action re-validates session + auth token (not just at session start) - Audit: action audit log separate from conversation audit (who did what, when, was it confirmed) - Rollback: every action must be reversible within a time window (remove from cart, cancel return) - Attack surface: prompt injection now has financial consequences ("Add the most expensive item to cart")

Q40 — Distinguished Security Architect

"Design the security testing pyramid for MangaAssist — what do you test at each layer, and what can't be tested automatically?"

Hint

- **Unit tests (base):** PII regex patterns, homoglyph normalization, rate limiting logic, encryption/decryption round-trips - **Integration tests:** guardrail pipeline end-to-end, Lambda handler isolation (no global state), KMS key access patterns, cross-session contamination checks - **Adversarial tests:** 500+ golden set (prompt injection, PII generation, toxicity bypass, scope escape), red-team test matrix (15 attack vectors) - **Chaos experiments:** guardrail disable, KMS unavailability, Bedrock latency injection, Lambda container stale state - **What can't be automated:** novel prompt injection techniques (human red-teaming quarterly), bias in recommendation quality (human evaluation), social engineering attack patterns (security team exercises), regulatory compliance interpretation (legal review) - Run cadence: unit (every commit), integration (every deploy), adversarial (daily), chaos (monthly), human (quarterly)

How to Use These Questions

Interview prep: Practice explaining each answer in 2-3 minutes using the STAR format (see 10-storytelling-guide.md)
Depth probing: Each hint leads to a deeper doc — follow cross-references to build comprehensive answers
Role-based filtering: Sort by persona to practice for specific interviewer types
Difficulty progression: Start at Basic, work up. If you can answer Architect-level questions confidently, lower levels will be natural.

Cross-References

All questions reference scenarios from docs 01 through 08 in this folder
Existing security questions (non-overlapping): 03-hard, 04-very-hard, 05-super-hard, 06-architect
Storytelling techniques: 10-storytelling-guide.md

Deep Dive Companion

Use this section after you have read the compact question bank above.

The pattern to practice for every answer is:

State the risk or business problem in one sentence.
Explain where it appears in the MangaAssist architecture.
Walk through what happens end to end.
Name the tradeoff or design choice.
Close with how you monitor or validate it.

Selected questions include Mermaid diagrams where the flow is easier to see visually.

Deep Dive - Basic Level

Q1 - What's the difference between encrypting data at rest versus in transit, and why does MangaAssist need both?

Follow-up questions interviewer may ask - "Where does TLS terminate in this architecture?" - "Why do VPC endpoints matter if TLS already exists?" - "What does KMS protect in practice?"

Deep-dive hint

- Data at rest protects stored bytes in DynamoDB, S3, and OpenSearch if storage is accessed or copied. Data in transit protects data while it moves between browser, API Gateway, Lambda, Bedrock, DynamoDB, S3, and KMS. - End to end, a user request enters over TLS, Lambda reads and writes encrypted records, and every persisted copy remains protected by KMS-backed encryption. - VPC endpoints reduce exposure by keeping service-to-service traffic on AWS private networking, but they do not replace TLS. - Strong answer: without at-rest encryption, compromised storage exposes raw data; without in-transit encryption, a man-in-the-middle can read or alter traffic.

flowchart LR
    User[User Browser] -->|TLS| APIGW[API Gateway]
    APIGW -->|TLS| Lambda[Lambda]
    Lambda -->|TLS via VPC endpoint| DDB[DynamoDB]
    Lambda -->|TLS via VPC endpoint| Bedrock[Bedrock]
    Lambda -->|TLS via VPC endpoint| S3[S3]
    DDB --> Store1[(Encrypted session data)]
    S3 --> Store2[(Encrypted audit logs)]

- See [08-encryption-key-management.md](08-encryption-key-management.md)

Q2 - How would you test whether the PII filter is actually catching sensitive data? What test cases would you prioritize?

Follow-up questions interviewer may ask - "How do you measure recall and false positives separately?" - "How do you test obfuscated and multilingual PII?" - "How do you stop character names from getting redacted?"

Deep-dive hint

- Test the full pipeline, not just regexes: normalization -> regex -> NER -> custom detectors -> confidence routing -> redaction -> storage assertions. - Prioritize exact PII, obfuscated PII, multilingual PII, and false-positive cases like manga titles, ASINs, and character names. - A strong answer mentions a golden dataset, daily regression runs, and metrics like precision, recall, per-locale recall, and manga-domain false-positive rate. - Every escaped PII incident and every false positive should be added back to the test set. - See [02-pii-protection-data-privacy.md](02-pii-protection-data-privacy.md)

Q3 - Why can't the chatbot just answer any question? What business reason do we have for scope restrictions?

Follow-up questions interviewer may ask - "Why not answer with a disclaimer?" - "Which off-topic categories are highest risk?" - "How do you keep the redirect user-friendly?"

Deep-dive hint

- Scope restrictions protect trust, liability boundaries, and recommendation quality. MangaAssist is a domain assistant, not a general chatbot. - End to end, the user asks a question, intent classification and retrieval stay inside manga and commerce topics, the FM drafts a response, and the scope guardrail blocks or redirects off-domain output. - This matters because off-topic answers increase legal and brand risk while consuming model budget without helping business goals. - Strong answer: the right behavior is often a polite redirect, not a hard refusal. - See [03-guardrails-pipeline-deep-dive.md](03-guardrails-pipeline-deep-dive.md)

Q4 - If a guardrail starts blocking too many legitimate requests, how would you know and what would you do?

Follow-up questions interviewer may ask - "Which metric would show the issue first?" - "How do you isolate the bad stage?" - "How do you tune safely?"

Deep-dive hint

- You know because fallback rate, per-stage block rate, or sampled false-positive rate rises above baseline. - End to end, dashboards show the spike, on-call checks which stage is blocking, reproduces against the golden set, updates AppConfig or allowlists, canaries the fix, and watches rollback metrics. - Mention that targeted fixes are better than weakening the whole filter. Example: allowlisting manga terms instead of lowering toxicity protection globally. - Strong answer closes with monitoring and regression testing, not just threshold changes. - See [03-guardrails-pipeline-deep-dive.md](03-guardrails-pipeline-deep-dive.md)

Q5 - What happens to a user's conversation data after they close the browser? How long do we keep it?

Follow-up questions interviewer may ask - "Does closing the tab delete the data immediately?" - "What changes for guest users?" - "How does GDPR deletion differ from TTL expiry?"

Deep-dive hint

- Closing the browser ends the client session, not necessarily the backend retention timer. - End to end, redacted conversation history goes to DynamoDB with TTL, logs go to CloudWatch, audit evidence may go to encrypted S3, and analytics keeps anonymized aggregates. - In this design, authenticated sessions live 24 hours, guest sessions 2 hours, CloudWatch retains 30 days, and audit archives keep 1 year. - Strong answer: TTL is convenience cleanup; explicit deletion workflows still matter for privacy rights.

flowchart LR
    Msg[User message] --> Redact[PII scan + redaction]
    Redact --> DDB[DynamoDB TTL]
    Redact --> CW[CloudWatch retention]
    Redact --> S3[S3 audit retention]
    Delete[Deletion request] --> Workflow[Deletion workflow]
    Workflow --> DDB
    Workflow --> CW
    Workflow --> S3

- See [02-pii-protection-data-privacy.md](02-pii-protection-data-privacy.md)

Deep Dive - Medium Level

Q6 - How does the toxicity filter handle manga titles that contain violent or explicit-sounding words like 'Chainsaw Man', 'Attack on Titan', or 'Death Note'?

Follow-up questions interviewer may ask - "Why not just lower the toxicity threshold?" - "How do you keep the allowlist current?" - "How do you stop real abuse from hiding behind manga terms?"

Deep-dive hint

- Generic toxicity classifiers see violent words but do not understand manga-domain context. - End to end, the response hits a generic classifier first; if flagged, a domain-aware layer checks whether the term is a known title or genre term and whether it is used in product context instead of abusive context. - The right answer is classifier plus context-aware allowlist, not allowlist alone and not threshold tuning alone. - Strong close: false-positive review and allowlist refresh must be operationalized because the catalog changes constantly. - See [03-guardrails-pipeline-deep-dive.md](03-guardrails-pipeline-deep-dive.md)

Q7 - How do you prevent cross-session data leakage in a serverless architecture where Lambda containers get reused?

Follow-up questions interviewer may ask - "What code pattern caused the leak?" - "What is safe to keep at module scope?" - "How do you detect this if it reaches production?"

Deep-dive hint

- Warm Lambda reuse is good for latency but dangerous when request-specific mutable state leaks across invocations. - End to end, Session A writes to a global mutable structure, the warm container survives, Session B lands on the same container, and stale state enters the next prompt or response path. - The fix is architectural: all request-specific state belongs inside the handler. Module scope should hold only immutable config or safe clients. - Strong answer includes runtime detection such as correlation checks that sample outputs for foreign session identifiers.

flowchart LR
    A[Session A] --> Warm[Warm Lambda container]
    Warm --> Global[Global mutable state]
    Global --> Reuse[Container reused]
    B[Session B] --> Reuse
    Reuse --> Leak[Stale Session A data leaks]

- See [05-incident-response-forensics.md](05-incident-response-forensics.md)

Q8 - How would you design the PII detection pipeline to handle both US and Japanese customer data formats?

Follow-up questions interviewer may ask - "How do you resolve locale if the text is mixed?" - "How do you handle Japanese names that are also manga character names?" - "What happens to medium-confidence findings?"

Deep-dive hint

- The design needs locale resolution plus layered detection, not one universal regex pack. - End to end, text is normalized, locale is inferred from account metadata and text hints, locale-specific regex and multilingual NER run, findings merge, confidence is scored, and routing decides redact, mask, review, or pass. - US and Japanese formats differ in phone numbers, postal codes, addresses, and identifiers, so detector coverage must be explicit. - Strong answer: the hard part is not just catching PII but avoiding false positives on Japanese manga names and catalog terms. - See [02-pii-protection-data-privacy.md](02-pii-protection-data-privacy.md)

Q9 - What's your incident severity classification? Walk me through what makes something SEV-1 versus SEV-3.

Follow-up questions interviewer may ask - "Who gets paged at each level?" - "Can severity change after investigation starts?" - "What blast radius changes the classification?"

Deep-dive hint

- Severity should be based on blast radius, data sensitivity, exploitability, and customer harm, not just how scary the bug sounds. - End to end, a signal arrives from monitoring or a report, the incident lead triages evidence, assigns an initial severity, launches containment, and upgrades or downgrades severity as more evidence arrives. - Cross-session leakage is SEV-1 because it is active unauthorized disclosure. A suspicious but unconfirmed pattern can begin as SEV-3 and escalate later. - Strong answer mentions containment timelines, not just definitions. - See [05-incident-response-forensics.md](05-incident-response-forensics.md)

Q10 - How would you detect if someone is systematically querying MangaAssist to reverse-engineer the recommendation algorithm?

Follow-up questions interviewer may ask - "What signals matter besides rate?" - "Why degrade instead of block?" - "How do you avoid hurting real high-intent users?"

Deep-dive hint

- Extraction attempts are often systematic, broad, repetitive, and weak on real shopping intent. - End to end, request telemetry becomes features such as genre coverage, paraphrase repetition, lack of commerce signals, and cadence; a risk scorer labels the session LOW, MEDIUM, or HIGH; and response policy reduces detail or adds noise as risk increases. - Degrading instead of hard blocking avoids teaching the attacker where your detector boundary sits. - Strong answer notes that sophisticated attackers distribute traffic, so behavior matters more than IP alone. - See [06-ml-specific-threats.md](06-ml-specific-threats.md)

Q11 - Walk me through how you handle Unicode homoglyph attacks against your text classifiers.

Follow-up questions interviewer may ask - "Why not normalize once and use the normalized text everywhere?" - "How do you preserve legitimate multilingual text?" - "What exact normalization passes do you run?"

Deep-dive hint

- The attacker replaces normal characters with visually similar Unicode or hides characters with zero-width separators so classifiers miss the true content. - End to end, the system keeps original text for user-visible handling but creates a normalized analysis view for classifiers using NFKC normalization, homoglyph replacement, zero-width removal, and spacing cleanup. - Detection must preserve offset mapping or equivalent so redaction still applies to the original string correctly. - Strong answer: normalize for safety analysis, preserve original text for the FM where legitimate Japanese or catalog terms matter. - See [06-ml-specific-threats.md](06-ml-specific-threats.md)

Q12 - Why did you choose VPC endpoints instead of NAT Gateway for Lambda-to-AWS-service communication?

Follow-up questions interviewer may ask - "Is this mostly about security or cost?" - "Which services need endpoints here?" - "What risk does NAT introduce?"

Deep-dive hint

- VPC endpoints keep AWS service traffic on private AWS networking instead of broad internet egress through NAT. - End to end, Lambda calls DynamoDB, S3, KMS, and Bedrock through approved private endpoints, which narrows exfiltration paths and simplifies network policy. - This is both a security and cost choice: tighter egress control and often lower cost than NAT for these access patterns. - Strong answer closes with blast-radius thinking: endpoints make it easier to say exactly what a workload can talk to. - See [08-encryption-key-management.md](08-encryption-key-management.md)

Q13 - How do you demonstrate to auditors that audit logs haven't been tampered with?

Follow-up questions interviewer may ask - "Who can write these logs and who can read them?" - "How do you prove an object was not deleted later?" - "What evidence package would you show?"

Deep-dive hint

- The answer is not only "we use S3." The answer is immutable retention plus restricted keys plus a verifiable access trail. - End to end, the app writes audit objects through a write-only role, the S3 bucket uses Object Lock, the objects are encrypted under a separate audit CMK, and CloudTrail records access and attempted changes. - Auditors should see retention settings, key policy separation, role separation, and access history. - Strong answer: immutability without access logging is incomplete, and access logging without immutability is incomplete. - See [05-incident-response-forensics.md](05-incident-response-forensics.md) and [08-encryption-key-management.md](08-encryption-key-management.md)

Deep Dive - Hard Level

Q14 - How do you protect the RAG pipeline from data poisoning through manipulated product reviews?

Follow-up questions interviewer may ask - "What checks run before a review enters the index?" - "How do you weight trusted versus untrusted sources?" - "How do you detect slow-drip poisoning?"

Deep-dive hint

- Poisoned reviews become dangerous when they become retrieval context, not only when they are visibly malicious. - End to end, incoming reviews pass through validation for hidden Unicode, prompt-injection patterns, stuffing, and volume anomalies; suspicious items are quarantined; trusted items are weighted; only approved content reaches the vector index. - Retrieval-time trust weighting matters too. Internal docs and verified purchases should outrank unverified reviews. - Strong answer: defend both before indexing and during retrieval, because one missed document should not dominate grounding.

flowchart LR
    Review[Incoming review] --> Check[Integrity checks]
    Check --> Clean[Approved]
    Check --> Quarantine[Quarantine]
    Clean --> Weight[Trust weighting]
    Weight --> Index[OpenSearch]
    Index --> Retrieve[RAG retrieval]

- See [06-ml-specific-threats.md](06-ml-specific-threats.md)

Q15 - The guardrail pipeline runs 6 stages serially in 14-27ms. You considered parallel execution. Walk me through the tradeoffs.

Follow-up questions interviewer may ask - "Which stages would you parallelize first?" - "Why not optimize for the lowest possible p50?" - "What changes in a voice interface?"

Deep-dive hint

- Serial execution gives fail-fast behavior, lower wasted compute, and easier debugging. Parallel execution lowers worst-case latency but runs all stages even when early stages would block. - End to end, the current serial pipeline stops as soon as a stage blocks, which is useful when cheap checks can save expensive remote calls. - A hybrid model can run cheap deterministic checks first and parallelize expensive ones only when needed. - Strong answer: with a 14-27ms budget for chat, simplicity and auditability are worth more than shaving a few milliseconds. - See [03-guardrails-pipeline-deep-dive.md](03-guardrails-pipeline-deep-dive.md)

Q16 - How do you handle a scenario where a prompt change passes all quality tests but creates a new attack vector?

Follow-up questions interviewer may ask - "What was missing from the original eval suite?" - "How do you review prompts like code?" - "What metric would catch this after release?"

Deep-dive hint

- Prompt changes can improve answer quality while increasing the chance that the FM fabricates or over-discloses sensitive data. - End to end, a prompt diff is proposed, standard quality tests pass, production near-miss guardrails catch a new pattern, and a near-miss metric exposes the induced risk. - The key lesson is induction-risk review: ask whether the new instruction creates a pathway for unsafe generation, not only whether the model usually answers well. - Strong answer includes release gates, adversarial evals, and post-release monitoring of near misses. - See [05-incident-response-forensics.md](05-incident-response-forensics.md)

Q17 - How do you balance rate limiting between preventing abuse and not degrading experience for legitimate high-volume users like Prime members?

Follow-up questions interviewer may ask - "Why use both sliding window and token bucket?" - "How do you implement this across Lambda instances?" - "When do you warn versus challenge versus block?"

Deep-dive hint

- A single flat rate limit is too blunt because trust level and expected usage vary by user type. - End to end, each request maps to a tier, distributed counters enforce fairness, token buckets allow brief bursts, and policy decides warn, slow, challenge, or block. - High-volume legitimate users need higher burst tolerance, while flagged or guest traffic gets stricter treatment. - Strong answer notes that session-linking matters because attackers spread activity across sessions. - See [04-content-moderation-abuse-prevention.md](04-content-moderation-abuse-prevention.md)

Q18 - How would you detect and mitigate bias in manga recommendations? Give specific examples of bias types.

Follow-up questions interviewer may ask - "How do you measure genre bias and popularity bias?" - "Where in the pipeline do you mitigate it?" - "Why is this a trust or risk issue?"

Deep-dive hint

- Bias is measurable as mismatch between recommendation exposure and catalog or business baselines. - End to end, recommendation logs are aggregated, exposure distribution is compared against catalog share and popularity concentration, alerts fire when disparity grows, and post-ranking diversification corrects future outputs. - Concrete examples: shonen overexposure, top-100 title dominance, underrepresentation of josei or niche works. - Strong answer closes by linking bias to partner fairness, customer trust, and reputational risk, not only ethics. - See [06-ml-specific-threats.md](06-ml-specific-threats.md)

Q19 - A user reports 'the bot told me about someone else's order.' Walk me through your first 30 minutes.

Follow-up questions interviewer may ask - "What do you disable first?" - "How do you distinguish real leakage from hallucination?" - "What evidence do you collect immediately?"

Deep-dive hint

- Treat it as SEV-1 until proven otherwise because it could be active customer data exposure. - End to end, contain first by disabling order-related paths, pull the affected transcript and request trace, compare exposed order details with the reporting user, and determine whether the source was an API mismatch, cross-session leakage, or model hallucination. - The first 30 minutes are about blast-radius control and source tracing, not perfect root-cause analysis. - Strong answer should explicitly name the likely classes of failure and the containment-first mindset.

sequenceDiagram
    participant User
    participant Lead as Incident Lead
    participant Config as AppConfig
    participant Logs as Logs/Traces
    participant API as Order API

    User->>Lead: Report possible data leak
    Lead->>Config: Disable order-related intents
    Lead->>Logs: Pull transcript and trace
    Lead->>API: Verify exposed order data
    Lead->>Logs: Check session isolation path

- See [05-incident-response-forensics.md](05-incident-response-forensics.md)

Q20 - How do you handle a critical CVE in a transitive dependency that doesn't directly apply to your use case?

Follow-up questions interviewer may ask - "Do you still patch if exploitability is low?" - "What if the upgrade is risky?" - "What evidence do you keep for auditors?"

Deep-dive hint

- Impact assessment matters, but low exploitability does not automatically justify ignoring a critical finding. - End to end, CI flags the CVE, the team checks whether affected code paths are reachable in this architecture, documents exploitability, upgrades when feasible, reruns tests and scans, regenerates the SBOM, and stores evidence. - Strong answer: the cost of carrying a known HIGH or CRITICAL often exceeds the cost of patching and explaining why it was low-impact. - Close with governance discipline: scan, assess, patch, verify, archive evidence. - See [07-third-party-supply-chain-risk.md](07-third-party-supply-chain-risk.md)

Q21 - Why do you use three separate KMS keys instead of one? What's the operational cost?

Follow-up questions interviewer may ask - "What blast radius does this reduce?" - "Which team should govern each key?" - "When would one key be acceptable?"

Deep-dive hint

- Multiple keys separate trust boundaries: general app data, immutable audit data, and highly sensitive PII do not all deserve the same policy. - End to end, the app CMK handles normal encrypted storage, the audit CMK protects locked forensic data, and the PII CMK adds tighter decrypt conditions such as encryption context or MFA for human access. - This reduces blast radius because a mis-scoped role or compromised service does not automatically expose every class of data. - Strong answer mentions the real cost honestly: a few dollars plus more IAM and rotation complexity, which is usually worth it. - See [08-encryption-key-management.md](08-encryption-key-management.md)

Deep Dive - Very Hard Level

Q22 - How would you design the system so that even a compromised Lambda function cannot exfiltrate PII at scale?

Follow-up questions interviewer may ask - "What can the compromised function still do?" - "How do IAM and KMS work together here?" - "How do you detect low-and-slow exfiltration?"

Deep-dive hint

- You cannot stop a request-processing function from touching the current user's data if it legitimately needs it, but you can stop bulk exfiltration and broad movement. - End to end, the Lambda has no arbitrary internet egress, uses only VPC endpoints, IAM limits data access, the PII CMK requires proper encryption context, response size and rate limits cap data leaving legitimate channels, and CloudTrail watches unusual decrypt volume. - Strong answer includes residual risk honestly: single-request exposure is still possible, large-scale exfiltration is what the architecture is designed to prevent. - Close with the phrase "layered containment, not magical immunity." - See [08-encryption-key-management.md](08-encryption-key-management.md)

Q23 - If we discover that our FM has been producing subtly biased recommendations for 3 months, what's your technical and organizational response plan?

Follow-up questions interviewer may ask - "How do you quantify impact retroactively?" - "Do you notify partners?" - "How do you mitigate while staying live?"

Deep-dive hint

- Run two tracks at once: reduce future harm now and quantify past harm quickly. - End to end, analyze three months of logs by genre, publisher, and segment, enable immediate diversification controls, isolate whether the cause was prompt, retrieval, or model behavior, and prepare leadership or partner communication if impact is material. - Strong answer includes prevention changes such as weekly automated bias monitoring and explicit disparity thresholds. - The organizational response matters because bias can affect partners and trust even when it is not a privacy breach. - See [06-ml-specific-threats.md](06-ml-specific-threats.md)

Q24 - How do you defend against multi-turn prompt injection where the attack is spread across 10+ conversational turns?

Follow-up questions interviewer may ask - "How do you tell malicious drift from normal drift?" - "What session features do you compute?" - "What mitigation is safest without killing the chat?"

Deep-dive hint

- Single-turn filters miss this because no one message is obviously malicious. - End to end, the session manager tracks topic drift, instruction density, persona shift, and repeated attempts to redefine behavior; when the score rises, the orchestrator shrinks history, re-anchors the system prompt, and reduces attacker-crafted context reaching the FM. - The design focus is session behavior, not only message content. - Strong answer mentions false-positive control because legitimate users do change topics.

flowchart LR
    T1[Normal turn] --> T2[Harmless framing]
    T2 --> T3[Role-play setup]
    T3 --> T4[Instruction layering]
    T4 --> Drift[Session drift detector]
    Drift --> Mitigate[Re-anchor + shrink history]

- See [01-prompt-injection-defense.md](01-prompt-injection-defense.md)

Follow-up questions interviewer may ask - "What about backups and legal hold?" - "How do you prove completion?" - "How do you stop deleted users from reappearing in exports?"

Deep-dive hint

- Start with identity and authorization validation. Deletion without verification is another security bug. - End to end, the deletion request is validated, recorded in a registry, then propagated across DynamoDB, archives, OpenSearch, analytics, and export pipelines, with evidence artifacts written for each system. - Strong answer includes nuance: immutable backups may not be edited in place, but restore workflows must replay the deletion registry before data becomes active again. - The interviewer is checking whether you know your data lineage, not only your primary table.

sequenceDiagram
    participant Support
    participant Workflow as Deletion Workflow
    participant Registry
    participant DDB
    participant S3
    participant OS as OpenSearch
    participant WH as Warehouse

    Support->>Workflow: Validated request
    Workflow->>Registry: Record request
    Workflow->>DDB: Delete session rows
    Workflow->>S3: Delete or tag archives
    Workflow->>OS: Remove linked docs
    Workflow->>WH: Delete or anonymize rows

- See [02-pii-protection-data-privacy.md](02-pii-protection-data-privacy.md)

Q26 - How would you redesign the guardrail pipeline if you needed sub-5ms total latency for a voice interface?

Follow-up questions interviewer may ask - "Which checks must remain inline?" - "What can move async safely?" - "How does TTS change output moderation?"

Deep-dive hint

- Voice changes the budget enough that the current serial pipeline cannot all remain inline. - End to end, ASR text goes through a tiny inline guardrail layer for must-not-miss checks, the FM responds, output is scanned before TTS, and deeper async analysis runs after for learning and future turn control. - The irreversible step is speech. Once the answer is spoken, you cannot un-say it, so response-side moderation remains especially important. - Strong answer is about prioritization: inline for irreversible harms, async for depth. - See [03-guardrails-pipeline-deep-dive.md](03-guardrails-pipeline-deep-dive.md)

Follow-up questions interviewer may ask - "What do operators see in normal debugging?" - "When can someone unlock full text?" - "What remains after deletion?"

Deep-dive hint

- The balance is achieved by separating routine operational visibility from restricted forensic evidence. - End to end, hashes and metadata go to day-to-day logs, encrypted full text goes to a separate audit path, and stronger approval is required to decrypt or inspect full sensitive content. - This lets most debugging happen on privacy-minimized data while still preserving deep evidence for serious incidents. - Strong answer notes that minimization does not mean no logs; it means default-safe logs plus controlled deep access. - See [05-incident-response-forensics.md](05-incident-response-forensics.md) and [08-encryption-key-management.md](08-encryption-key-management.md)

Q28 - You're moving encryption from synchronous (in-pipeline) to asynchronous (post-pipeline). Walk me through the risks.

Follow-up questions interviewer may ask - "What can fail between response and async encryption?" - "How do you detect silent async failures?" - "What must remain synchronous?"

Deep-dive hint

- The moment you move security work later, you change the failure mode from immediate success or failure to delayed and potentially silent failure. - End to end, data is produced, placed on a queue, later encrypted by a worker, and finally stored; worker failure, queue loss, or retries exhausted can leave plaintext behind unless storage fails closed. - Strong answer must say that response-side safety decisions still stay synchronous. Async is acceptable for some storage transformations, not for deciding what the user sees. - Mention DLQ, alarms, idempotent workers, and nightly plaintext scans as the minimum mitigation package. - See [08-encryption-key-management.md](08-encryption-key-management.md)

Q29 - How do you prevent a malicious insider (developer with production access) from extracting customer data?

Follow-up questions interviewer may ask - "What if the insider can deploy code?" - "Why isn't DB access alone enough?" - "Which logs would expose suspicious behavior?"

Deep-dive hint

- The answer starts with no standing direct access and separate control over encrypted PII. - End to end, production access uses time-limited roles, PII fields are protected under a distinct key, human decrypt requires stronger conditions, and CloudTrail plus deployment logs make unusual access visible. - Strong answer covers the code path risk too: an insider with deployment rights is more dangerous than one with read-only data access, so code review and deployment approvals are part of the security model. - Close honestly: insider risk is reduced by least privilege and visibility, not fully eliminated. - See [08-encryption-key-management.md](08-encryption-key-management.md)

Deep Dive - Super Hard Level

Q30 - If Bedrock silently updates the underlying model and changes the FM's behavior, how does your system detect this before users do?

Follow-up questions interviewer may ask - "What drift signals do you monitor?" - "How do you separate model drift from prompt or retrieval drift?" - "What is your first containment move?"

Deep-dive hint

- Managed model endpoints need live verification, not only deployment-time tests. - End to end, canary prompts run every few minutes, a daily golden suite hits the live endpoint, response distributions are tracked, and alerts fire when behavior diverges. Cross-region comparison helps isolate whether the model changed or something else did. - Strong answer includes both product and safety canaries: recommendation stability, guardrail behavior, and known-bad prompts that must still be blocked.

flowchart LR
    Canary[Canary prompts] --> Eval[Drift evaluator]
    Daily[Daily golden suite] --> Eval
    Dist[Distribution metrics] --> Eval
    Eval --> Alert[Alert]
    Alert --> Compare[Compare region / prompt / retrieval]

- See [06-ml-specific-threats.md](06-ml-specific-threats.md)

Q31 - Design a threat model for MangaAssist covering the top 5 risks. For each risk, what's the likelihood, impact, and your mitigation effectiveness?

Follow-up questions interviewer may ask - "Why are those the top 5?" - "How do you estimate likelihood?" - "What residual risk remains?"

Deep-dive hint

- Build the answer from assets, trust boundaries, attacker goals, and highest expected loss, not from a random list of scary issues. - A strong mix includes prompt injection, cross-session leakage, RAG poisoning, insider misuse, and model or regional drift/outage. - Strong answer always states residual risk honestly. Mitigations reduce probability and blast radius; they rarely reduce a risk to zero. - Close with prioritization logic: some medium-impact but frequent risks deserve more engineering attention than ultra-rare catastrophes. - See [01-prompt-injection-defense.md](01-prompt-injection-defense.md), [05-incident-response-forensics.md](05-incident-response-forensics.md), and [06-ml-specific-threats.md](06-ml-specific-threats.md)

Q32 - How would you handle a scenario where a state-sponsored actor is using your chatbot as an oracle to gather competitive intelligence about Amazon's manga sales?

Follow-up questions interviewer may ask - "How is this different from normal scraping?" - "Would you ever reveal exact ranks or percentages?" - "When do you escalate beyond normal abuse handling?"

Deep-dive hint

- Assume a sophisticated actor uses distributed infrastructure and patient behavior, so simple IP blocking is weak. - End to end, behavioral analytics correlate systematic ranking and coverage questions, extraction-risk scoring rises, the response layer abstracts away sensitive business signals, and high-risk sessions get degraded detail rather than precise metrics. - Strong answer: protect proprietary signals, accept that some public information is harvestable, and escalate to leadership or legal if the activity is strategically sensitive. - The main architectural control is output shaping plus behavior analytics, not perfect detection. - See [06-ml-specific-threats.md](06-ml-specific-threats.md)

Q33 - Design a 'security chaos engineering' framework for MangaAssist. What experiments would you run and what would you learn?

Follow-up questions interviewer may ask - "How do you keep chaos tests safe?" - "Which experiments have the highest value?" - "What metrics show success or failure?"

Deep-dive hint

- Chaos engineering is controlled validation of assumptions, not random breakage. - End to end, you define a hypothesis, bound blast radius, arm rollback triggers, run the experiment, watch metrics, and convert findings into engineering work. - High-value experiments include guardrail disable, stale Lambda state, synthetic PII generation, KMS unavailability, and prompt-injection replay. - Strong answer always names the learning objective per experiment, not just the experiment itself. - See [03-guardrails-pipeline-deep-dive.md](03-guardrails-pipeline-deep-dive.md), [05-incident-response-forensics.md](05-incident-response-forensics.md), and [08-encryption-key-management.md](08-encryption-key-management.md)

Q34 - What's the minimum security investment for MVP versus what you'd add for a production system serving millions of users?

Follow-up questions interviewer may ask - "Which controls are non-negotiable at MVP?" - "What would you postpone?" - "How do you pick the next security dollar to spend?"

Deep-dive hint

- Use risk-based staging. MVP needs controls that stop obvious user harm, not every advanced forensic capability. - End to end, MVP should include TLS, at-rest encryption, session isolation, basic PII filtering, basic toxicity filtering, rate limiting, and operational logging. Production scale adds field-level encryption, richer guardrails, immutable audit evidence, bias monitoring, supply-chain controls, and multi-region resilience. - Strong answer shows prioritization, not perfectionism. - Close with cost-versus-risk framing and the idea that later controls should target the next-biggest expected loss. - See [14-mvp-vs-future.md](../14-mvp-vs-future.md)

Q35 - How would you architect this system to survive a compromised dependency (e.g., a supply chain attack injects malicious code into LangChain)?

Follow-up questions interviewer may ask - "What stops malicious code from calling home?" - "Why doesn't dependency compromise mean instant PII compromise?" - "How do you recover?"

Deep-dive hint

- The design goal is containment: a bad dependency should not automatically become a full data exfiltration channel. - End to end, the library runs inside a Lambda with scoped IAM, no arbitrary internet egress, limited decrypt ability, exact version pinning, SBOMs, and immutable rollback artifacts. - Monitoring matters too: VPC flow logs, GuardDuty, and anomaly alerts help reveal unexpected behavior.

flowchart LR
    BadLib[Compromised dependency] --> Lambda[Lambda runtime]
    Lambda --> IAM[Scoped IAM]
    Lambda --> Net[No arbitrary egress]
    Net --> Endpoints[Approved AWS endpoints only]
    Monitor[Flow logs / GuardDuty] --> Alert[Anomaly alert]
    Alert --> Rollback[Rollback]

- Strong answer distinguishes convenience libraries from security-critical logic that you should own more directly. - See [07-third-party-supply-chain-risk.md](07-third-party-supply-chain-risk.md)

Deep Dive - Architect Level

Q36 - Design a multi-tenant security architecture if MangaAssist needs to support multiple Amazon storefronts (JP, US, EU) with different regulatory requirements.

Follow-up questions interviewer may ask - "What stays shared and what must be isolated?" - "How do you enforce data residency?" - "How do locale-specific guardrails fit one platform?"

Deep-dive hint

- Use a shared control plane only where safe and separate regional data planes where regulation or risk requires it. - End to end, traffic routes by storefront to a regional stack, customer data remains in-region, locale-aware privacy and guardrail policies apply locally, and only anonymized aggregates move across regions. - Strong answer mentions that regulation is only one driver; locale-specific detectors, consent rules, and toxicity calibration vary too.

flowchart TB
    Router[Global router] --> US[US stack]
    Router --> JP[JP stack]
    Router --> EU[EU stack]
    US --> USD[(US data)]
    JP --> JPD[(JP data)]
    EU --> EUD[(EU data)]
    USD --> Agg[Anonymized aggregates]
    JPD --> Agg
    EUD --> Agg

Q37 - If you had to prove to a board of directors that MangaAssist is 'secure enough' to handle customer data, what evidence would you present?

Follow-up questions interviewer may ask - "What metrics matter to a board?" - "What gaps would you admit?" - "How do you show progress over time?"

Deep-dive hint

- Present evidence in categories: architecture controls, operational metrics, vulnerability management, incident response performance, audit integrity, and residual risk. - A board-ready answer translates technical controls into business outcomes such as lower leak probability, faster containment, and reduced regulatory exposure. - Strong answer is honest about residual risk and improvement trajectory, not just a list of controls. - Mention that postmortems and trend lines are often more convincing than one-time snapshots. - See [05-incident-response-forensics.md](05-incident-response-forensics.md), [07-third-party-supply-chain-risk.md](07-third-party-supply-chain-risk.md), and [08-encryption-key-management.md](08-encryption-key-management.md)

Q38 - How do you balance the cost of security infrastructure ($2K/month) against the risk of a data breach? How do you justify the investment?

Follow-up questions interviewer may ask - "Which controls have highest ROI?" - "What would you cut last?" - "How do you compare monthly spend to breach cost?"

Deep-dive hint

- Frame the answer as expected-loss reduction, not as raw infrastructure spend. - End to end, estimate the likely cost of privacy incidents, downtime, legal work, and trust damage, then compare those to the recurring cost of controls such as KMS separation, VPC endpoints, and guardrail infrastructure. - Strong answer ranks controls by risk reduction per dollar and admits that some advanced controls can wait while high-ROI basics cannot. - Close with the asymmetry: even one meaningful incident often costs more than a year of preventative controls. - See [14-mvp-vs-future.md](../14-mvp-vs-future.md)

Q39 - How would you evolve the security architecture if MangaAssist moves from read-only (answering questions) to write-capable (adding to cart, processing returns)?

Follow-up questions interviewer may ask - "What new guardrail becomes mandatory?" - "How do you prevent duplicate or unintended writes?" - "Why is prompt injection more dangerous now?"

Deep-dive hint

- The fundamental shift is that model mistakes can now change state, not just produce bad text. - End to end, the user expresses an action, the system requests explicit confirmation, revalidates authorization, attaches an idempotency key, executes the action, and records a dedicated action audit event. - Strong answer names four must-haves: confirmation, authorization, idempotency, and reversibility where possible.

flowchart LR
    Ask[User asks for action] --> Confirm[Explicit confirmation]
    Confirm --> Auth[Re-check auth]
    Auth --> Idem[Idempotency key]
    Idem --> Exec[Execute action]
    Exec --> Audit[Action audit log]

- Close with the attack surface statement: prompt injection becomes materially more dangerous once the bot can spend money or modify orders.

Q40 - Design the security testing pyramid for MangaAssist - what do you test at each layer, and what can't be tested automatically?

Follow-up questions interviewer may ask - "Which layer catches the most bugs cheaply?" - "What runs daily versus monthly?" - "Why do you still need human red-teaming?"

Deep-dive hint

- Use a layered testing model: unit, integration, adversarial, chaos, then human evaluation for what automation still misses. - End to end, unit tests catch deterministic logic, integration tests validate workflow boundaries, adversarial suites replay attack cases, chaos validates operational assumptions, and humans test novel prompt injection, bias, and compliance interpretation. - Strong answer should explicitly say what cannot be fully automated: new attack creativity, nuanced bias judgment, and legal interpretation.

flowchart TD
    Unit[Unit tests]
    Int[Integration tests]
    Adv[Adversarial suites]
    Chaos[Chaos experiments]
    Human[Human red-team]
    Unit --> Int --> Adv --> Chaos --> Human

- Close with cadence and ownership, because a testing pyramid is an operating model, not only a diagram.

9. Security, Privacy & Guardrails — Interview Scenarios

Basic Level (5 Questions)

Q1 — Junior Security Engineer

Q2 — QA Engineer

Q3 — Product Manager

Q4 — DevOps Engineer

Q5 — Junior Developer

Medium Level (8 Questions)

Q6 — Security Engineer

Q7 — Backend Engineer

Q8 — Data Engineer

Q9 — SRE

Q10 — ML Engineer

Q11 — Security Engineer

Q12 — Platform Engineer

Q13 — Compliance Officer

Hard Level (8 Questions)

Q14 — Senior Security Engineer

Q15 — Principal Engineer

Q16 — Security Architect

Q17 — Staff Engineer

Q18 — Senior ML Engineer

Q19 — Security Engineer

Q20 — DevSecOps Engineer

Q21 — Infrastructure Engineer

Very Hard Level (8 Questions)

Q22 — Principal Security Architect

Q23 — VP Engineering

Q24 — Senior Security Engineer

Q25 — Data Protection Officer

Q26 — Security Architect

Q27 — Staff Engineer

Q28 — Principal Engineer

Q29 — Security Lead

Super Hard Level (6 Questions)

Q30 — Distinguished Engineer

Q31 — Chief Security Officer

Q32 — Principal Security Architect

Q33 — Distinguished Security Engineer

Q34 — VP Security

Q35 — Principal Engineer

Architect Level (5 Questions)

Q36 — Chief Architect

Q37 — Distinguished Architect

Q38 — VP Engineering

Q39 — Chief Architect

Q40 — Distinguished Security Architect

How to Use These Questions

Cross-References

Deep Dive Companion

Deep Dive - Basic Level

Q1 - What's the difference between encrypting data at rest versus in transit, and why does MangaAssist need both?

Q2 - How would you test whether the PII filter is actually catching sensitive data? What test cases would you prioritize?

Q3 - Why can't the chatbot just answer any question? What business reason do we have for scope restrictions?

Q4 - If a guardrail starts blocking too many legitimate requests, how would you know and what would you do?

Q5 - What happens to a user's conversation data after they close the browser? How long do we keep it?

Deep Dive - Medium Level

Q6 - How does the toxicity filter handle manga titles that contain violent or explicit-sounding words like 'Chainsaw Man', 'Attack on Titan', or 'Death Note'?

Q7 - How do you prevent cross-session data leakage in a serverless architecture where Lambda containers get reused?

Q8 - How would you design the PII detection pipeline to handle both US and Japanese customer data formats?

Q9 - What's your incident severity classification? Walk me through what makes something SEV-1 versus SEV-3.

Q10 - How would you detect if someone is systematically querying MangaAssist to reverse-engineer the recommendation algorithm?

Q11 - Walk me through how you handle Unicode homoglyph attacks against your text classifiers.

Q12 - Why did you choose VPC endpoints instead of NAT Gateway for Lambda-to-AWS-service communication?

Q13 - How do you demonstrate to auditors that audit logs haven't been tampered with?

Deep Dive - Hard Level

Q14 - How do you protect the RAG pipeline from data poisoning through manipulated product reviews?

Q15 - The guardrail pipeline runs 6 stages serially in 14-27ms. You considered parallel execution. Walk me through the tradeoffs.

Q16 - How do you handle a scenario where a prompt change passes all quality tests but creates a new attack vector?

Q17 - How do you balance rate limiting between preventing abuse and not degrading experience for legitimate high-volume users like Prime members?

Q18 - How would you detect and mitigate bias in manga recommendations? Give specific examples of bias types.

Q19 - A user reports 'the bot told me about someone else's order.' Walk me through your first 30 minutes.

Q20 - How do you handle a critical CVE in a transitive dependency that doesn't directly apply to your use case?

Q21 - Why do you use three separate KMS keys instead of one? What's the operational cost?

Deep Dive - Very Hard Level

Q22 - How would you design the system so that even a compromised Lambda function cannot exfiltrate PII at scale?

Q23 - If we discover that our FM has been producing subtly biased recommendations for 3 months, what's your technical and organizational response plan?

Q24 - How do you defend against multi-turn prompt injection where the attack is spread across 10+ conversational turns?

Q25 - Walk me through exactly what happens when a GDPR right-to-deletion request arrives. Every system, every data store.

Q26 - How would you redesign the guardrail pipeline if you needed sub-5ms total latency for a voice interface?