Analytics, Observability, And Feedback Loops
Covers Q24, Q37, Q40, Q47, Q49.
What The Interviewer Is Testing
- Whether you can instrument the system well enough to operate it.
- Whether you understand schema evolution and prompt experimentation.
- Whether you can connect traces, metrics, logs, and business KPIs.
Deep Dive
Event Design Principles
- Emit structured events with stable identifiers such as session_id, response_id, and prompt_version (sketched below).
- Scrub PII before emission.
- Treat analytics schemas as versioned contracts.
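A minimal sketch of these principles, assuming a JSON event bus and a regex-only scrubber (a real scrubber covers far more PII classes); the ChatEvent fields and the emit target are illustrative:

```python
import json
import re
from dataclasses import asdict, dataclass

SCHEMA_VERSION = "1.2.0"  # the analytics schema is a versioned contract

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def scrub_pii(text: str) -> str:
    # Illustration only: redact emails before emission. Real scrubbers also
    # handle names, phone numbers, addresses, and account numbers.
    return EMAIL_RE.sub("<redacted>", text)

@dataclass
class ChatEvent:
    schema_version: str
    session_id: str
    response_id: str
    prompt_version: str
    user_text: str

def emit(event: ChatEvent) -> None:
    event.user_text = scrub_pii(event.user_text)
    print(json.dumps(asdict(event)))  # stand-in for the real event bus

emit(ChatEvent(SCHEMA_VERSION, "s-42", "r-981", "v7", "reach me at a@b.com"))
```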
Observability Stack
You should be able to describe three layers clearly; a combined sketch follows the list:
- traces for request path and dependency timing
- metrics for SLOs, error rates, and saturation
- logs for structured debugging context
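A compact way to show all three layers on one request path, assuming the OpenTelemetry Python API (`opentelemetry-api`) and the standard `logging` module; instrument and attribute names are illustrative:

```python
import logging
import time

from opentelemetry import metrics, trace

tracer = trace.get_tracer("chat-service")
meter = metrics.get_meter("chat-service")
log = logging.getLogger("chat-service")

# Metrics layer: SLO-facing instruments (latency histogram, error counter).
request_latency_ms = meter.create_histogram("chat.request.latency_ms")
request_errors = meter.create_counter("chat.request.errors")

def handle_request(session_id: str) -> None:
    start = time.monotonic()
    # Trace layer: one root span per request; dependencies get child spans.
    with tracer.start_as_current_span("chat.request") as span:
        span.set_attribute("session_id", session_id)
        try:
            ...  # request path goes here
        except Exception:
            request_errors.add(1)
            # Log layer: structured debugging context, correlated by session.
            log.exception("request failed", extra={"session_id": session_id})
            raise
        finally:
            request_latency_ms.record((time.monotonic() - start) * 1000)
```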
What To Trace
- request entry
- session load
- intent classification
- service fan-out
- retrieval and reranking
- LLM invocation
- guardrail stages
- persistence and analytics emission (see the span sketch below)
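One way to lay those stages out as nested spans, again with the OpenTelemetry API; the helper functions are hypothetical stubs standing in for real services:

```python
from opentelemetry import trace

tracer = trace.get_tracer("chat-service")

# Hypothetical stage implementations, stubbed so the sketch runs.
def load_session(sid): return {}
def classify_intent(query, session): return "billing"
def retrieve_and_rerank(query, intent): return []
def call_llm(query, docs): return "draft answer"
def apply_guardrails(draft): return draft
def persist_and_emit(sid, answer): pass

def handle(session_id: str, query: str) -> str:
    # The root span is the request entry; each stage gets a child span so
    # per-stage timing is visible instead of one opaque total.
    with tracer.start_as_current_span("chat.request"):
        with tracer.start_as_current_span("session.load"):
            session = load_session(session_id)
        with tracer.start_as_current_span("intent.classify"):
            intent = classify_intent(query, session)
        # Service fan-out would add sibling child spans here, one per call.
        with tracer.start_as_current_span("retrieval.rerank"):
            docs = retrieve_and_rerank(query, intent)
        with tracer.start_as_current_span("llm.invoke"):
            draft = call_llm(query, docs)
        with tracer.start_as_current_span("guardrails.check"):
            answer = apply_guardrails(draft)
        with tracer.start_as_current_span("persist.emit"):
            persist_and_emit(session_id, answer)
        return answer
```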
Prompt A/B Testing
Prompt versions should be configuration, not code constants. Strong answers mention:
- consistent assignment by customer or session key (sketched after this list)
- prompt version logging on every response
- outcome metrics such as resolution, satisfaction, conversion, and escalation
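A minimal sketch of sticky assignment, assuming a two-arm experiment; hashing the experiment name with the customer key makes assignment deterministic per customer and independent across experiments (all names are illustrative):

```python
import hashlib

PROMPT_VERSIONS = ("prompt-v1", "prompt-v2")  # illustrative version IDs

def assign_prompt_version(customer_id: str, experiment: str) -> str:
    # Sticky, roughly uniform assignment without storing per-customer state.
    digest = hashlib.sha256(f"{experiment}:{customer_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 100
    return PROMPT_VERSIONS[0] if bucket < 50 else PROMPT_VERSIONS[1]

version = assign_prompt_version("cust-123", "tone-experiment")
# Log `version` on the response and tag every analytics event with it, so
# resolution, satisfaction, conversion, and escalation can be split by arm.
```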
Schema Evolution
New event fields should be additive and backward-compatible. The strongest answers mention a schema registry, nullable additions, consumer compatibility, and a version log.
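A sketch of what "additive and backward-compatible" means in practice, assuming JSON events; the field names are illustrative:

```python
# v1 shape: old producers keep emitting exactly this.
v1_event = {"schema_version": "1.0.0", "session_id": "s-1", "response_id": "r-1"}

# v2 adds a nullable field; nothing is renamed, retyped, or removed.
v2_event = {**v1_event, "schema_version": "1.1.0", "user_satisfaction_score": None}

def consume(event: dict) -> None:
    # Old consumers ignore unknown keys; new consumers read the new field
    # with a default, so v1 events remain valid input.
    score = event.get("user_satisfaction_score")  # None when absent
    if score is not None:
        ...  # downstream aggregation

consume(v1_event)
consume(v2_event)
```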
Strong Answer Pattern
- "If I cannot trace it, I cannot tune it."
- "Prompt experiments need consistent assignment and analytics tagging."
- "Dashboards should connect technical latency to user outcomes."
Scenario 1: Add user_satisfaction_score
Primary Prompt
The product team wants a new user_satisfaction_score field in analytics events. How do you add it safely?
Follow-Up 1
How do you keep old consumers from breaking?
Follow-Up 2
What change would be needed in Redshift?
Follow-Up 3
Would you backfill historical data? Under what conditions?
Strong Answer Markers
- Uses versioned additive schema changes.
- Keeps new fields optional at first.
- Mentions warehouse evolution and optional backfill (sketched below).
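A sketch of the warehouse side, assuming the events land in a Redshift table named chat_events with a survey table to join against (both names hypothetical); the column is added as nullable so existing rows and old producers stay valid:

```python
# Additive, nullable column: existing rows read as NULL, no table rewrite.
ALTER_CHAT_EVENTS = """
ALTER TABLE chat_events
ADD COLUMN user_satisfaction_score SMALLINT;
"""

# Backfill only if a trustworthy historical source exists (e.g., survey
# exports keyed by response_id); otherwise leave history NULL and note the
# cutover date in the schema version log.
BACKFILL_FROM_SURVEYS = """
UPDATE chat_events
SET user_satisfaction_score = s.score
FROM satisfaction_surveys s
WHERE chat_events.response_id = s.response_id;
"""
```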
Scenario 2: No One Knows Where The Latency Went
Primary Prompt
The team only logs total response time, and now latency is high. What instrumentation is missing?
Follow-Up 1
What span boundaries would you add first?
Follow-Up 2
Which metrics belong on an operations dashboard versus an executive dashboard?
Follow-Up 3
What alarm thresholds would you set initially?
Strong Answer Markers
- Adds span-level tracing across every major stage.
- Distinguishes business dashboards from engineering dashboards.
- Uses p95 and p99, not only averages (illustrated below).
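A small illustration of why averages mislead, using made-up per-stage samples; NumPy's percentile is one of many ways to compute the tails:

```python
import numpy as np

# Illustrative LLM-invocation latencies in milliseconds; two tail outliers.
llm_ms = np.array([420, 510, 480, 2900, 450, 470, 3100, 440])

p50, p95, p99 = np.percentile(llm_ms, [50, 95, 99])
print(f"mean={llm_ms.mean():.0f}ms  p50={p50:.0f}ms  "
      f"p95={p95:.0f}ms  p99={p99:.0f}ms")
# The mean (~1096ms) sits far from p50 (~475ms): alarm on p95/p99 per stage,
# not on the average, and set initial thresholds from observed baselines.
```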
Scenario 3: Prompt Version B Improves Satisfaction But Hurts Escalation Rate
Primary Prompt
Prompt version B raises thumbs-up rates but also increases escalations. How do you reason about that conflict?
Follow-Up 1
What additional metrics do you inspect before deciding?
Follow-Up 2
Could the prompt be overconfident while sounding better?
Follow-Up 3
What is your rollout decision if the metrics remain mixed?
Strong Answer Markers
- Treats multi-metric outcomes as normal.
- Investigates segment-level behavior and root causes.
- Uses business-priority weighting instead of picking the nicest-looking metric (see the segment sketch below).
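A sketch of the segment-level cut, assuming per-response outcomes already joined to prompt version and customer segment (toy data, pandas for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "prompt_version": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "segment": ["new", "new", "returning", "returning",
                "new", "new", "returning", "returning"],
    "thumbs_up": [0, 1, 1, 0, 1, 1, 1, 1],
    "escalated": [0, 0, 0, 1, 1, 1, 0, 0],
})

# Mixed top-line metrics often decompose cleanly by segment: here B lifts
# satisfaction everywhere but drives escalations only for new users.
rates = df.groupby(["prompt_version", "segment"])[["thumbs_up", "escalated"]].mean()
print(rates)
```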
Red Flags
- Logging raw PII to "fix it later."
- Treating dashboards as nothing more than ad hoc SQL queries.
- Running prompt experiments without version IDs in analytics.
- Relying on average latency as the primary health signal.
Two-Minute Whiteboard Version
Draw four outputs from the same request:
- Trace spans.
- Metrics.
- Structured logs.
- Analytics events tied to prompt and response IDs.