MangaAssist Interview Pack - Hard

Level: Hard
What this tests: Failure handling, low-level reasoning, latency tradeoffs, safety boundaries, and implementation decisions under production constraints.

Failure and Degradation Map

graph TD
    A[User Request] --> B{Intent Type}
    B --> C[Structured API Path]
    B --> D[RAG Path]
    B --> E[Full LLM Path]

    C --> F{Dependency healthy?}
    D --> G{Retriever healthy?}
    E --> H{LLM healthy?}

    F -->|No| I[Fallback response]
    G -->|No| J[Reduced grounding or safe fallback]
    H -->|No| K[Unavailable message or escalation]

    F -->|Yes| L[Guardrails]
    G -->|Yes| L
    H -->|Yes| L

    L --> M{Safe and complete?}
    M -->|Yes| N[Return answer]
    M -->|No| O[Regenerate / redact / escalate]
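The degradation map above can be sketched as a routing function that degrades instead of erroring when a dependency is unhealthy. This is a minimal sketch for discussion: the path names, fallback messages, and the `generate_answer` / `passes_guardrails` stubs are assumptions, not part of any real MangaAssist codebase.

```python
# Sketch of the failure/degradation map. All names and messages are
# illustrative assumptions for interview discussion only.
FALLBACKS = {
    "structured": "Sorry, that service is temporarily unavailable.",
    "rag": "Answering without document grounding; details may be limited.",
    "llm": "The assistant is unavailable right now; escalating to support.",
}

def generate_answer(path: str) -> str:
    # Stub standing in for the real API / RAG / LLM call.
    return f"answer via {path} path"

def passes_guardrails(answer: str) -> bool:
    # Stub: a real check would scan for unsafe or incomplete output.
    return bool(answer)

def route(intent: str, healthy: dict) -> str:
    """Pick a path for the intent, degrading when its dependency is down."""
    path = {"order_status": "structured",
            "manga_question": "rag"}.get(intent, "llm")
    if not healthy.get(path, False):
        return FALLBACKS[path]              # degrade instead of erroring out
    answer = generate_answer(path)
    # Unsafe or incomplete output falls back rather than reaching the user.
    return answer if passes_guardrails(answer) else FALLBACKS[path]
```

The key property to probe in interviews: every branch terminates in a user-visible response, so no single dependency failure can turn into a hang or an unhandled error.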

Interview Questions

Staff Engineer

  1. The system targets low latency but also uses multiple downstream services. Where would you parallelize, and where would you keep the flow sequential?
  2. What would you do if conversation-memory reads become slow enough to threaten the response SLA?
  3. How would you design timeouts and retries for recommendation, catalog, and order-service calls so the chatbot does not stall?
  4. Why is summarizing older turns better than simply truncating conversation history in this system?
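A strong answer to questions 1 and 3 usually combines parallel fan-out for independent reads with a bounded per-call budget. The sketch below assumes `asyncio` with one retry for idempotent reads; the 0.1 s timeouts and the fake recommendation/catalog services are illustrative, not tuned values.

```python
import asyncio

async def call_with_budget(fn, timeout_s: float, retries: int = 1):
    """One retry max for idempotent reads, so the total budget stays bounded."""
    for attempt in range(retries + 1):
        try:
            return await asyncio.wait_for(fn(), timeout=timeout_s)
        except asyncio.TimeoutError:
            if attempt == retries:
                return None  # degrade rather than stall the response

async def fetch_all():
    async def recommendations():
        await asyncio.sleep(0.01)
        return ["Vol. 1", "Vol. 2"]

    async def catalog():
        await asyncio.sleep(0.01)
        return {"Vol. 1": "in stock"}

    # Independent reads fan out in parallel; the generation step would
    # stay sequential because it depends on both results.
    recs, stock = await asyncio.gather(
        call_with_budget(recommendations, 0.1),
        call_with_budget(catalog, 0.1),
    )
    return recs, stock
```

The design point worth defending: retries only make sense on idempotent reads with a shrinking remaining budget, and anything write-shaped (order mutations) should never be blindly retried.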

Security Engineer

  1. Walk through how you would defend against a prompt-injection attempt that asks the model to ignore the system prompt and reveal internal rules.
  2. Where should PII scrubbing happen, and what goes wrong if it only happens in the analytics pipeline?
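For question 2, candidates should place scrubbing before anything persists or leaves the trust boundary, not only in analytics. A minimal regex-based sketch follows; the patterns are illustrative assumptions, and a production system would use a vetted PII detection service rather than hand-rolled regexes.

```python
import re

# Minimal redaction sketch. Patterns are illustrative only; real PII
# detection needs far broader coverage (names, addresses, card numbers).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def scrub(text: str) -> str:
    """Replace matched PII with labeled placeholders before logging/storage."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

If this runs only in the analytics pipeline, raw PII still lands in application logs, conversation memory, and third-party LLM prompts, which is exactly the failure mode the question is probing.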

ML Engineer

  1. How would you detect that the intent classifier is drifting and misrouting messages more often over time?
  2. If retrieved chunks conflict with each other, how should the system respond without sounding uncertain or misleading?
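One concrete answer to question 1 is to compare the live intent-label distribution against a baseline window with a drift statistic. The sketch below uses the Population Stability Index; the 0.2 alert threshold is a common rule of thumb, not a universal constant, and a real system would pair this with misroute labels from human review.

```python
import math

def psi(baseline: dict, current: dict, eps: float = 1e-6) -> float:
    """Population Stability Index over intent-label frequency counts.

    A score above ~0.2 is a common rule-of-thumb signal that the
    routing distribution has shifted enough to investigate.
    """
    labels = set(baseline) | set(current)
    b_total = sum(baseline.values()) or 1
    c_total = sum(current.values()) or 1
    score = 0.0
    for label in labels:
        b = baseline.get(label, 0) / b_total + eps
        c = current.get(label, 0) / c_total + eps
        score += (c - b) * math.log(c / b)
    return score
```

Distribution shift alone does not prove misrouting (traffic mix can change legitimately), so PSI alerts should trigger sampling and relabeling rather than automatic rollback.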

SRE

  1. If Bedrock starts timing out for 5 percent of calls during a traffic spike, what immediate mitigations would you apply first?
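An answer here typically includes shedding load off the failing dependency with a circuit breaker so healthy traffic is not dragged down by timeout queues. A minimal sketch, with illustrative (not production-tuned) thresholds:

```python
import time

class CircuitBreaker:
    """Minimal breaker sketch: trip after consecutive failures, then let a
    probe through after a cooldown. Thresholds are illustrative only."""

    def __init__(self, max_failures: int = 3, cooldown_s: float = 30.0):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # Half-open: permit a probe call once the cooldown has elapsed.
        return time.monotonic() - self.opened_at >= self.cooldown_s

    def record(self, ok: bool) -> None:
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
```

When the breaker is open, requests would take the fallback path from the degradation map above instead of waiting out another timeout.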

Principal Engineer

  1. If you had to cut p99 first-token latency in half without materially hurting answer quality, what three changes would you try first and why?

Low-Level Recall

stateDiagram-v2
    [*] --> Receive
    Receive --> LoadContext
    LoadContext --> ClassifyIntent
    ClassifyIntent --> Route
    Route --> FetchData
    FetchData --> Generate
    Generate --> GuardrailChecks
    GuardrailChecks --> SaveTurn
    SaveTurn --> ReturnResponse
    GuardrailChecks --> Escalate: unsafe or unresolved
    Escalate --> [*]
    ReturnResponse --> [*]
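As a recall aid, the lifecycle above can be walked as an ordered pipeline with one branch at the guardrail step. The step names mirror the diagram; the handlers are stubs, and a real implementation would carry request state between steps.

```python
# Step names mirror the state diagram above; handlers are stubs.
STEPS = ["Receive", "LoadContext", "ClassifyIntent", "Route",
         "FetchData", "Generate", "GuardrailChecks", "SaveTurn",
         "ReturnResponse"]

def run_pipeline(message: str, guardrails_pass: bool = True) -> list:
    """Walk the states in order; an unsafe or unresolved result escalates
    instead of saving the turn and returning a response."""
    trace = []
    for step in STEPS:
        trace.append(step)
        if step == "GuardrailChecks" and not guardrails_pass:
            trace.append("Escalate")
            return trace
    return trace
```

A useful recall check: guardrails run before the turn is saved, so an escalated turn never persists an unsafe answer into conversation memory.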