Training MLOps Scenarios - MangaAssist

This companion document turns the training infrastructure topic into a MangaAssist operating playbook. The goal is repeatable fine-tuning, validation, release, monitoring, and rollback across the full chatbot model stack.

Models In Scope

| Model | Cadence | Main gate |
|---|---|---|
| intent classifier | weekly or drift-triggered | accuracy, rare recall, business harm, latency |
| embedding adapter | monthly | Recall@3, NDCG@10 |
| cross-encoder reranker | monthly | MRR@10, latency |
| sentiment detector | weekly | escalation recall |
| LoRA/DPO response model | quarterly or quality-triggered | human preference and factuality |
| RAFT answer model | policy/catalog-triggered | grounded accuracy |

Scenario 1 - Weekly Router Release

Pipeline:

```mermaid
flowchart TD
    A[Collect labeled logs] --> B[Data validation]
    B --> C[Train candidate]
    C --> D[Offline eval]
    D --> E[Latency test]
    E --> F[Shadow deploy]
    F --> G{Promotion gates pass?}
    G -- yes --> H[Blue-green release]
    G -- no --> I[Keep champion and open error review]
```
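The flow above can be sketched as a gated pipeline driver. This is a minimal sketch with stubbed stages; all function names and thresholds are hypothetical, not MangaAssist's actual pipeline code.

```python
# Minimal sketch of the weekly router release flow, with stubbed stages.
# All names and thresholds are hypothetical placeholders.

def data_validation(rows):
    # Reject empty batches or rows missing a gold label.
    return bool(rows) and all("label" in r for r in rows)

def gates_pass(eval_report, latency_report, shadow_report):
    # Simplified gate: the full promotion table also covers rare-class
    # accuracy, business harm, and escalation misses.
    return (eval_report["accuracy_delta"] >= -0.002
            and latency_report["p95_ms"] < 15.0
            and shadow_report["disagreement_rate"] <= 0.05)

def run_weekly_release(rows, champion, train, offline_eval,
                       latency_test, shadow_deploy,
                       blue_green_release, open_error_review):
    if not data_validation(rows):
        raise ValueError("data validation failed")
    candidate = train(rows)                     # Train candidate
    eval_report = offline_eval(candidate)       # Offline eval
    latency_report = latency_test(candidate)    # Latency test
    shadow_report = shadow_deploy(candidate)    # Shadow deploy
    if gates_pass(eval_report, latency_report, shadow_report):
        blue_green_release(candidate)           # Blue-green release
        return candidate
    open_error_review(candidate)                # Keep champion, open review
    return champion
```

Note the asymmetric ending: a failed gate never blocks serving, it simply keeps the champion live and opens an error review.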

Required artifacts:

  • dataset version,
  • tokenizer hash,
  • training config,
  • model artifact,
  • eval report,
  • confusion matrix,
  • latency report,
  • rollback pointer.
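One way to make the artifact list enforceable is to bundle it into a single release manifest. The sketch below is illustrative only: every path, hash, and config value is a hypothetical placeholder.

```python
# Hypothetical release manifest bundling the required artifacts for one
# router candidate. All paths and values are illustrative.
import hashlib
import json

def artifact_hash(payload: bytes) -> str:
    # Content hash so any artifact swap is detectable at release time.
    return hashlib.sha256(payload).hexdigest()[:12]

manifest = {
    "dataset_version": "intent-data-2026-04-20",
    "tokenizer_hash": artifact_hash(b"tokenizer.json contents"),
    "training_config": {"lr": 3e-5, "epochs": 3, "seed": 17},
    "model_artifact": "s3://models/intent-distilbert/v16/model.safetensors",
    "eval_report": "reports/v16/eval.json",
    "confusion_matrix": "reports/v16/confusion.png",
    "latency_report": "reports/v16/latency.json",
    "rollback_pointer": "intent-distilbert/v15",
}
print(json.dumps(manifest, indent=2))
```

A release job can then refuse to proceed if any manifest key is missing, instead of relying on humans to remember all eight artifacts.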

Promotion gate:

| Gate | Rule |
|---|---|
| overall accuracy | candidate >= champion - 0.2 points |
| rare-class accuracy | no critical regression |
| business-weighted harm | improves or remains within threshold |
| escalation miss rate | no increase |
| P95 latency | under 15 ms |
| shadow disagreement | reviewed if over threshold |
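The gate table translates almost one-to-one into a single pass/fail function. The thresholds below are assumptions (0.2 points read as 0.002 on a 0-1 accuracy scale; the rare-class and harm tolerances are invented for illustration), not MangaAssist's real values.

```python
# One-to-one sketch of the promotion gate table. Thresholds are
# hypothetical; real values would live in versioned config.

def promotion_gates(candidate, champion, shadow_disagreement_reviewed):
    checks = {
        # overall accuracy: candidate >= champion - 0.2 points
        "overall_accuracy": candidate["accuracy"] >= champion["accuracy"] - 0.002,
        # rare-class accuracy: no critical regression (tolerance assumed)
        "rare_class": candidate["rare_accuracy"] >= champion["rare_accuracy"] - 0.01,
        # business-weighted harm: improves or remains within threshold (assumed 0.05)
        "business_harm": candidate["weighted_harm"] <= max(champion["weighted_harm"], 0.05),
        # escalation miss rate: no increase
        "escalation_miss": candidate["escalation_miss"] <= champion["escalation_miss"],
        # P95 latency under 15 ms
        "p95_latency": candidate["p95_latency_ms"] < 15.0,
        # shadow disagreement: must have been reviewed if over threshold
        "shadow_reviewed": shadow_disagreement_reviewed,
    }
    failed = [name for name, ok in checks.items() if not ok]
    return not failed, failed
```

Returning the list of failed gates, not just a boolean, is what makes the "open error review" branch actionable.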

Scenario 2 - Cross-Model Release Coordination

The embedding adapter and reranker should not be released independently: the reranker's quality depends on the candidate distribution the adapter produces, so their metrics are tightly coupled.

Example:

  • embedding adapter changes candidate distribution,
  • reranker was trained on old candidate distribution,
  • final top-3 quality drops despite separate offline wins.

Solution:

  • evaluate retrieval plus reranking together,
  • version compatible model pairs,
  • run end-to-end catalog search validation.
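A minimal way to enforce paired releases is to gate on an explicit compatibility set plus the joint end-to-end metric. The version names and metric values below are hypothetical.

```python
# Sketch of versioning compatible retrieval pairs and gating on a joint
# end-to-end metric instead of separate offline wins. All names are
# hypothetical.

COMPATIBLE_PAIRS = {
    ("embed-adapter-v7", "reranker-v4"),
    ("embed-adapter-v8", "reranker-v5"),  # reranker retrained on v8's candidates
}

def can_release(embedder: str, reranker: str,
                joint_recall_at_3: float, champion_recall_at_3: float) -> bool:
    # Release only a pair that was validated together, and only if the
    # joint top-3 quality does not regress versus the live pair.
    return ((embedder, reranker) in COMPATIBLE_PAIRS
            and joint_recall_at_3 >= champion_recall_at_3)
```

This makes the failure mode from the example structurally impossible: a new adapter with an old reranker is simply not a releasable pair.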

Scenario 3 - Model Registry Discipline

Every MangaAssist model should have a champion/challenger state.

Registry fields:

```json
{
  "model_name": "intent-distilbert",
  "version": "v15",
  "dataset_version": "intent-data-2026-04-20",
  "training_code_sha": "abc123",
  "metrics": {
    "accuracy": 0.922,
    "rare_accuracy": 0.889,
    "p95_latency_ms": 12.1
  },
  "status": "shadow"
}
```

Failure Modes

| Failure | Detection | Fix |
|---|---|---|
| train/serve skew | passes offline, fails live | tokenizer and preprocessing hash checks |
| silent data leakage | validation metrics look too good | group split by conversation/user |
| untracked manual model | no reproducibility | require a registry entry before deployment |
| shadow ignored | bad model promoted anyway | enforce the promotion checklist |
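The first fix, hash-checking the preprocessing stack, can be sketched as a fingerprint recorded at training time and re-derived at serving time. Function names here are hypothetical.

```python
# Sketch of a tokenizer/preprocessing hash check: serving refuses to run
# a model whose preprocessing fingerprint differs from the one recorded
# at training time. Names are hypothetical.
import hashlib

def preprocessing_fingerprint(tokenizer_bytes: bytes, config: dict) -> str:
    h = hashlib.sha256()
    h.update(tokenizer_bytes)
    # Sort keys so the same config always hashes identically.
    for key in sorted(config):
        h.update(f"{key}={config[key]}".encode())
    return h.hexdigest()

def assert_no_skew(trained_fp: str, serving_fp: str) -> None:
    if trained_fp != serving_fp:
        raise RuntimeError("train/serve skew: preprocessing fingerprint mismatch")
```

Failing loudly at load time turns a silent live-quality drop into an immediate, attributable deploy error.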

Final Decision

For MangaAssist, MLOps is part of model quality. A model is not done when it trains; it is done when it has a traceable dataset, repeatable pipeline, measurable gates, shadow evidence, and rollback safety.