Storytelling Guide - System Prompt Extraction via Hypothetical Framing Follow-Up Questions
Source document: 10-storytelling-guide.md
Reference scenario: 01-prompt-injection-defense.md -> Scenario 3: System Prompt Extraction via Hypothetical Framing
Scenario lens: Non-literal extraction attempts that use hypothetical, creative, translated, or reframed prompts to expose confidential rules.
Document lens: security storytelling, STAR-D structure, and audience framing.
Use these prompts to push past the base scenario and explore deeper design, operational, interview, or storytelling tradeoffs.
Answer document: ANSWERS.md
Easy
- What is the most compelling hook for a story about hypothetical prompt extraction without sounding alarmist?
- Which consequence should you foreground first: secret leakage, policy leakage, or attacker learning velocity, and why?
Medium
- How would you explain the attacker technique in plain language before shifting into classifier design or compartmentalization?
- Which single result metric best proves that the fix produced a measurable outcome rather than a cosmetic prompt tweak?
Hard
- If you chose compartmentalization over simply making the prompt longer and stricter, which decision or tradeoff would best demonstrate mature judgment?
- How would you keep the story credible if some of the leakage was partial or inferential instead of a clean verbatim dump?
Very Hard
- How would you narrate this scenario if the team disagreed on whether the leak was severe enough to count as an incident?
- What supporting evidence would you keep ready in case the listener asks how you know the attack was actually blocked rather than merely pushed into a different form?