Prompt Engineering Interview Pack - Hard With Hints

Level: Hard How to use: Use the hints to push your answer from prompt tactics to system-level reasoning.

Interview Questions With Hints

Staff Engineer

A prompt says "never invent prices" but the model still sometimes attaches the wrong price to a recommendation. What would you do?

Hint: Remove price narration from the FM role and let the UI bind validated catalog prices after generation.

The model keeps returning malformed JSON even after stronger format instructions. How would you fix the system?

Hint: Prompt contract plus parser validation, repair logic, and safe fallback templates.

Recommendation prompts became much better after adding many examples, but latency doubled. How would you recover?

Hint: Compress exemplars, reduce example count, move quality gains upstream through better candidate ranking and editorial grounding.

Principal Engineer

How would you design prompts and workflow boundaries for a user message that mixes order support and recommendations in one turn?

Hint: Do not let one free-form answer own both workflows loosely. Split intents, answer operational facts first, and isolate recommendation sections.

How would you harden prompts against injection attempts without making the assistant over-refuse normal questions?

Hint: Separate trusted instructions from user text, define allowed sources, and combine prompt hardening with source filtering and output validation.

What would you change if long conversation history keeps crowding out the retrieved evidence in the final prompt?

Hint: Compress history by task relevance, keep explicit preference slots, and protect grounding budget.

SRE

If FM timeout rates rise during peak traffic, what prompt changes help, and what changes belong outside the prompt?

Hint: Prompt trimming helps only part of the problem. Also discuss model tiering, routing away from FM when possible, and graceful degradation.

How would you distinguish a prompt problem from a retrieval problem when FAQ answers become vague or inconsistent?

Hint: Inspect retrieved chunks, freshness, and chunk precision before rewriting the system prompt.

Applied Scientist

In what cases is prompt optimization the wrong first response to a model-quality problem?

Hint: When the real issue is routing, retrieval, stale data, missing validation, or an overloaded model path.

How would you evaluate whether a prompt change improved trust instead of only improving style?

Hint: Use grounding accuracy, escalation precision, thumbs-down rate, and hallucination-related failure cases, not just human preference for nicer wording.