Prompt Engineering Interview Pack - Hard With Hints
Level: Hard How to use: Use the hints to push your answer from prompt tactics to system-level reasoning.
Interview Questions With Hints
Staff Engineer
- A prompt says "never invent prices" but the model still sometimes attaches the wrong price to a recommendation. What would you do?
Hint: Remove price narration from the FM role and let the UI bind validated catalog prices after generation.
- The model keeps returning malformed JSON even after stronger format instructions. How would you fix the system?
Hint: Prompt contract plus parser validation, repair logic, and safe fallback templates.
- Recommendation prompts became much better after adding many examples, but latency doubled. How would you recover?
Hint: Compress exemplars, reduce example count, move quality gains upstream through better candidate ranking and editorial grounding.
Principal Engineer
- How would you design prompts and workflow boundaries for a user message that mixes order support and recommendations in one turn?
Hint: Do not let one free-form answer own both workflows loosely. Split intents, answer operational facts first, and isolate recommendation sections.
- How would you harden prompts against injection attempts without making the assistant over-refuse normal questions?
Hint: Separate trusted instructions from user text, define allowed sources, and combine prompt hardening with source filtering and output validation.
- What would you change if long conversation history keeps crowding out the retrieved evidence in the final prompt?
Hint: Compress history by task relevance, keep explicit preference slots, and protect grounding budget.
SRE
- If FM timeout rates rise during peak traffic, what prompt changes help, and what changes belong outside the prompt?
Hint: Prompt trimming helps only part of the problem. Also discuss model tiering, routing away from FM when possible, and graceful degradation.
- How would you distinguish a prompt problem from a retrieval problem when FAQ answers become vague or inconsistent?
Hint: Inspect retrieved chunks, freshness, and chunk precision before rewriting the system prompt.
Applied Scientist
- In what cases is prompt optimization the wrong first response to a model-quality problem?
Hint: When the real issue is routing, retrieval, stale data, missing validation, or an overloaded model path.
- How would you evaluate whether a prompt change improved trust instead of only improving style?
Hint: Use grounding accuracy, escalation precision, thumbs-down rate, and hallucination-related failure cases, not just human preference for nicer wording.