
Storytelling Guide - System Prompt Extraction via Hypothetical Framing Follow-Up Questions

Source document: 10-storytelling-guide.md

Reference scenario: 01-prompt-injection-defense.md -> Scenario 3: System Prompt Extraction via Hypothetical Framing

Scenario lens: Non-literal extraction attempts that use hypothetical, creative, translated, or reframed prompts to expose confidential rules.

Document lens: security storytelling, STAR-D structure, and audience framing.

Use these prompts to push past the base scenario and explore deeper design, operational, interview, or storytelling tradeoffs.

Answer document: ANSWERS.md

Easy

  1. What is the most compelling hook for a story about hypothetical prompt extraction without sounding alarmist?
  2. Which consequence should you foreground first: secret leakage, policy leakage, or attacker learning velocity, and why?

Medium

  1. How would you explain the attacker technique in plain language before shifting into classifier design or compartmentalization?
  2. Which single result metric best proves that the fix produced a measurable outcome rather than a cosmetic prompt tweak?

Hard

  1. What decision or tradeoff would demonstrate mature judgment if you chose compartmentalization over simply making the prompt longer and stricter?
  2. How would you keep the story credible if some of the leakage was partial or inferential instead of a clean verbatim dump?

Very Hard

  1. How would you narrate this scenario if the team disagreed on whether the leak was severe enough to count as an incident?
  2. What supporting evidence would you keep ready in case the listener asks how you know the attack was blocked rather than having merely changed form?