Interview Scenarios - System Prompt Extraction via Hypothetical Framing Follow-Up Questions
Source document: 09-interview-scenarios.md Reference scenario: 01-prompt-injection-defense.md -> Scenario 3: System Prompt Extraction via Hypothetical Framing
Scenario lens: Non-literal extraction attempts that use hypothetical, creative, translated, or reframed prompts to expose confidential rules. Document lens: interview preparation, persona-aware answers, and depth progression.
Use these prompts to push past the base scenario and explore deeper design, operational, interview, or storytelling tradeoffs.
Answer document: ANSWERS.md
Easy
- If asked about hypothetical prompt extraction, what is the shortest answer that still makes clear why regex-only defenses fail?
- Which concrete refusal pattern or compartmentalization decision would you name first to show practical depth?
Medium
- How would your answer differ if the interviewer cares more about privacy leakage than about model jailbreaks?
- What metric or test suite result would you cite to prove the mitigation worked beyond anecdotal red-team wins?
Hard
- How would you explain the tradeoff between keeping system behavior configurable and minimizing the damage of partial prompt leakage?
- What mistake would weaken your credibility if the interviewer asks whether secrets should ever live in prompts at all?
Very Hard
- How would you handle a skeptical interviewer who argues that no extraction defense is trustworthy unless you prove non-reconstructability?
- If the interviewer asks for a second example on the spot, how would you pivot to a related document without sounding like you memorized isolated stories?