Analytics, Observability, And Feedback Loops

Covers Q24, Q37, Q40, Q47, Q49.

What The Interviewer Is Testing

  • Whether you can instrument the system well enough to operate it.
  • Whether you understand schema evolution and prompt experimentation.
  • Whether you can connect traces, metrics, logs, and business KPIs.

Deep Dive

Event Design Principles

  • Emit structured events with stable identifiers such as session_id, response_id, and prompt_version.
  • Scrub PII before emission.
  • Treat analytics schemas as versioned contracts.
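
A minimal sketch of these principles in Python; the field names, the schema version, and the regex-only scrubber are illustrative, and real PII scrubbing would need a dedicated detection pass rather than a single pattern.

  import json
  import re
  import uuid
  from datetime import datetime, timezone

  EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

  def scrub_pii(text: str) -> str:
      # Illustrative only: real scrubbing needs a proper PII detection pass.
      return EMAIL_RE.sub("[REDACTED_EMAIL]", text)

  def build_event(session_id: str, response_id: str, prompt_version: str, user_text: str) -> dict:
      # Stable identifiers plus an explicit schema version make the event a contract.
      return {
          "schema_version": "1.2",            # versioned contract, bumped on additive changes
          "event_id": str(uuid.uuid4()),
          "emitted_at": datetime.now(timezone.utc).isoformat(),
          "session_id": session_id,
          "response_id": response_id,
          "prompt_version": prompt_version,
          "user_text": scrub_pii(user_text),  # scrub before emission, never after
      }

  if __name__ == "__main__":
      print(json.dumps(build_event("sess-42", "resp-7", "support-v3", "reach me at jane@example.com")))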

Observability Stack

You should be able to describe three layers clearly:

  • traces for request path and dependency timing
  • metrics for SLOs, error rates, and saturation
  • logs for structured debugging context
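
For the metrics layer, one possible sketch using the prometheus_client package; the metric names and the simulated work are invented for illustration.

  import random
  import time

  from prometheus_client import Counter, Histogram, start_http_server

  # A latency histogram feeds p95/p99 queries; a bare average would hide tail latency.
  REQUEST_LATENCY = Histogram("assistant_request_latency_seconds", "End-to-end request latency")
  GUARDRAIL_BLOCKS = Counter("assistant_guardrail_blocks_total", "Responses blocked by guardrails")

  def handle_request() -> None:
      with REQUEST_LATENCY.time():               # records the duration into the histogram
          time.sleep(random.uniform(0.05, 0.3))  # stand-in for real work
          if random.random() < 0.02:
              GUARDRAIL_BLOCKS.inc()

  if __name__ == "__main__":
      start_http_server(8000)  # metrics scraped from :8000/metrics
      while True:
          handle_request()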

What To Trace

  • request entry
  • session load
  • intent classification
  • service fan-out
  • retrieval and reranking
  • LLM invocation
  • guardrail stages
  • persistence and analytics emission
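
A sketch of those span boundaries using the OpenTelemetry SDK with a console exporter; the stage bodies are stubs and the span names simply mirror the list above.

  from opentelemetry import trace
  from opentelemetry.sdk.trace import TracerProvider
  from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

  # Export spans to stdout for the sketch; a real deployment ships them to a collector.
  provider = TracerProvider()
  provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
  trace.set_tracer_provider(provider)
  tracer = trace.get_tracer("assistant.request")

  def handle_request(session_id: str) -> None:
      # One parent span per request; one child span per stage listed above.
      with tracer.start_as_current_span("request_entry") as root:
          root.set_attribute("session_id", session_id)
          with tracer.start_as_current_span("session_load"):
              pass  # fetch session state
          with tracer.start_as_current_span("intent_classification"):
              pass
          with tracer.start_as_current_span("retrieval_and_rerank"):
              pass
          with tracer.start_as_current_span("llm_invocation"):
              pass
          with tracer.start_as_current_span("guardrails"):
              pass
          with tracer.start_as_current_span("persist_and_emit_analytics"):
              pass

  handle_request("sess-42")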

Prompt A/B Testing

Prompt versions should be configuration, not code constants. Strong answers mention:

  • consistent assignment by customer or session key
  • prompt version logging on every response
  • outcome metrics such as resolution, satisfaction, conversion, and escalation
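
A minimal sketch of deterministic assignment; the experiment name and variant list are invented for illustration.

  import hashlib

  def assign_prompt_version(customer_id: str, experiment: str, variants: list[str]) -> str:
      # Hashing (experiment, customer) gives a stable bucket, so the same customer
      # always sees the same prompt version for the life of the experiment.
      digest = hashlib.sha256(f"{experiment}:{customer_id}".encode()).hexdigest()
      bucket = int(digest, 16) % len(variants)
      return variants[bucket]

  version = assign_prompt_version("cust-1138", "greeting-tone-2024-06", ["support-v3", "support-v4"])
  # Log the assigned version on every response so outcome metrics can be joined later.
  print({"prompt_version": version})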

Schema Evolution

New event fields should be additive and backward-compatible. The strongest answers mention a schema registry, nullable additions, consumer compatibility, and a version log.
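
A toy compatibility check in the spirit of a schema registry; the field-spec format is invented for the sketch, and a real registry enforces far richer rules.

  def is_backward_compatible(old_fields: dict, new_fields: dict) -> bool:
      # old_fields/new_fields map field name -> {"type": ..., "required": bool}.
      # Every old field must survive with the same type, and any new field
      # must be optional so existing producers and consumers remain valid.
      for name, spec in old_fields.items():
          if name not in new_fields or new_fields[name]["type"] != spec["type"]:
              return False
      for name, spec in new_fields.items():
          if name not in old_fields and spec["required"]:
              return False
      return True

  V1_2 = {"session_id": {"type": "string", "required": True}}
  V1_3 = {**V1_2, "user_satisfaction_score": {"type": "double", "required": False}}
  assert is_backward_compatible(V1_2, V1_3)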

Strong Answer Pattern

  • "If I cannot trace it, I cannot tune it."
  • "Prompt experiments need consistent assignment and analytics tagging."
  • "Dashboards should connect technical latency to user outcomes."

Scenario 1: Add user_satisfaction_score

Primary Prompt

The product team wants a new user_satisfaction_score field in analytics events. How do you add it safely?

Follow-Up 1

How do you keep old consumers from breaking?

Follow-Up 2

What change would be needed in Redshift?

Follow-Up 3

Would you backfill historical data? Under what conditions?

Strong Answer Markers

  • Uses versioned additive schema changes.
  • Keeps new fields optional at first.
  • Mentions warehouse evolution and optional backfill.
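
A sketch of the consumer side of that answer, assuming events arrive as plain dicts; on the warehouse side the analogous additive step would be a nullable column addition (in Redshift, an ALTER TABLE ... ADD COLUMN), which is left out of the code.

  def handle_event(event: dict) -> None:
      # Old consumers ignore unknown keys; new consumers read the field defensively
      # because older producers will not send it yet.
      score = event.get("user_satisfaction_score")  # None until producers roll out
      if score is not None:
          record_satisfaction(score)

  def record_satisfaction(score: float) -> None:
      print(f"satisfaction={score}")

  handle_event({"schema_version": "1.2", "session_id": "sess-42"})  # tolerated
  handle_event({"schema_version": "1.3", "session_id": "sess-42",
                "user_satisfaction_score": 4.5})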

Scenario 2: No One Knows Where The Latency Went

Primary Prompt

The team only logs total response time, and now latency is high. What instrumentation is missing?

Follow-Up 1

What span boundaries would you add first?

Follow-Up 2

Which metrics belong on an operations dashboard versus an executive dashboard?

Follow-Up 3

What alarm thresholds would you set initially?

Strong Answer Markers

  • Adds span-level tracing across every major stage.
  • Distinguishes business dashboards from engineering dashboards.
  • Uses p95 and p99, not only averages.
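
A small illustration of why those markers insist on percentiles; the latency numbers are made up, and the nearest-rank calculation is a simplification of what a metrics backend computes.

  import statistics

  def percentile(samples: list[float], pct: float) -> float:
      # Nearest-rank percentile; good enough for a dashboard sketch.
      ranked = sorted(samples)
      index = max(0, int(round(pct / 100 * len(ranked))) - 1)
      return ranked[index]

  # Mostly fast requests with a slow tail: the mean looks fine, the tail does not.
  latencies_ms = [120.0] * 90 + [900.0] * 8 + [5000.0] * 2
  print("mean:", statistics.mean(latencies_ms))  # 280.0
  print("p95:", percentile(latencies_ms, 95))    # 900.0
  print("p99:", percentile(latencies_ms, 99))    # 5000.0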

Scenario 3: Prompt Version B Improves Satisfaction But Hurts Escalation Rate

Primary Prompt

Prompt version B raises thumbs-up rates but also increases escalations. How do you reason about that conflict?

Follow-Up 1

What additional metrics do you inspect before deciding?

Follow-Up 2

Could the prompt be overconfident while sounding better?

Follow-Up 3

What is your rollout decision if the metrics remain mixed?

Strong Answer Markers

  • Treats multi-metric outcomes as normal.
  • Investigates segment-level behavior and root causes.
  • Uses business-priority weighting instead of picking the nicest-looking metric.
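
One way to make that business-priority weighting explicit; the deltas and weights below are invented purely to show the mechanism, not real experiment results.

  def weighted_decision(metric_deltas: dict, weights: dict) -> float:
      # Positive deltas are increases; weights encode business priority,
      # e.g. an extra escalation costs more than a thumbs-up is worth.
      return sum(weights[name] * delta for name, delta in metric_deltas.items())

  # Illustrative numbers only: B raises satisfaction but also escalation rate.
  deltas = {"satisfaction": +3.0, "escalation_rate": +2.0, "resolution_rate": -0.5}
  weights = {"satisfaction": 1.0, "escalation_rate": -2.5, "resolution_rate": 2.0}

  score = weighted_decision(deltas, weights)
  print("composite:", score)  # negative here, so B would not ship despite better satisfaction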

Red Flags

  • Logging raw PII to "fix it later."
  • Treating dashboards as nothing more than ad hoc SQL queries.
  • Running prompt experiments without version IDs in analytics.
  • Relying on average latency as the primary health signal.

Two-Minute Whiteboard Version

Draw four outputs from the same request:

  1. Trace spans.
  2. Metrics.
  3. Structured logs.
  4. Analytics events tied to prompt and response IDs.