Fine-Tuning Scenario Execution Map - MangaAssist

This dry-run map sequences the MangaAssist fine-tuning scenario documents into an execution order. Use it when deciding what to build or train next.

Phase 1 - Diagnose The Failure

Question	Document family
Did the request route to the wrong workflow?	Intent Classification
Was the right manga or article missing from retrieval?	Embedding and Retrieval
Was the right candidate found but ranked too low?	Cross-Encoder Reranker
Was the right context retrieved but ignored?	RAFT
Was the answer factual but off-brand?	Prompt/Prefix, LoRA, DPO
Did quality drift over time?	Continual Learning, Data Curation
Is the model too slow or expensive?	QAT, Distillation
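
The per-request questions in the table above can be read as an ordered triage. The following is a minimal sketch, not the project's tooling: the TraceFindings fields and the diagnose function are hypothetical names, and checking the stages in pipeline order is an assumption. The last two rows (drift, cost) are system-level checks and are left out.

```python
from dataclasses import dataclass


@dataclass
class TraceFindings:
    """Hypothetical review notes for one failed request."""
    routed_correctly: bool
    relevant_retrieved: bool
    relevant_ranked_high: bool
    context_used: bool
    on_brand: bool


def diagnose(findings: TraceFindings) -> list:
    """Return the document families for the first failing stage."""
    if not findings.routed_correctly:
        return ["Intent Classification"]
    if not findings.relevant_retrieved:
        return ["Embedding and Retrieval"]
    if not findings.relevant_ranked_high:
        return ["Cross-Encoder Reranker"]
    if not findings.context_used:
        return ["RAFT"]
    if not findings.on_brand:
        return ["Prompt/Prefix", "LoRA", "DPO"]
    return []  # no per-request failure found


# Right title retrieved, but ranked too low.
example = TraceFindings(True, True, False, True, True)
print(diagnose(example))
```

Stopping at the first failing stage keeps the diagnosis upstream-first: a downstream fix is not worth evaluating while an earlier stage is still broken.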

Phase 2 - Pick The Smallest Effective Intervention

Preferred order:

1. data cleanup and evaluation fix
2. calibration, threshold, or routing policy
3. lightweight adapter or prompt/prefix tuning
4. task-specific fine-tuning
5. preference alignment
6. compression or serving optimization
7. MoE or larger architectural change
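
One way to apply the preferred order is to list the interventions the team judges plausible for a failure and take the earliest one. This is an illustrative sketch: the smallest_effective helper is a hypothetical name, and deciding which interventions are plausible is still a human judgment that the code does not make.

```python
from typing import Iterable, Optional

# Preferred order from this phase, smallest intervention first.
PREFERRED_ORDER = [
    "data cleanup and evaluation fix",
    "calibration, threshold, or routing policy",
    "lightweight adapter or prompt/prefix tuning",
    "task-specific fine-tuning",
    "preference alignment",
    "compression or serving optimization",
    "MoE or larger architectural change",
]


def smallest_effective(plausible: Iterable[str]) -> Optional[str]:
    """Return the earliest intervention in PREFERRED_ORDER that is plausible."""
    plausible = set(plausible)
    unknown = plausible - set(PREFERRED_ORDER)
    if unknown:
        raise ValueError(f"unknown intervention(s): {sorted(unknown)}")
    for name in PREFERRED_ORDER:
        if name in plausible:
            return name
    return None  # nothing plausible was supplied


choice = smallest_effective({"preference alignment", "task-specific fine-tuning"})
print(choice)
```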

Phase 3 - Run The Standard Training Checklist

[ ] define failure slice
[ ] collect positive and hard negative examples
[ ] reserve golden set
[ ] choose model and loss
[ ] train candidate
[ ] evaluate global metrics
[ ] evaluate critical slices
[ ] measure latency and cost
[ ] inspect errors
[ ] shadow deploy
[ ] promote or roll back
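
The checklist separates global metrics from critical slices because an aggregate number can hide a weak slice. A small sketch of that difference, using recall as the metric; the slice names and the toy golden set are made up for illustration, and the function names are hypothetical.

```python
from collections import defaultdict


def recall_by_slice(examples):
    """examples: iterable of (slice_name, is_positive, predicted_positive)."""
    hits = defaultdict(int)
    positives = defaultdict(int)
    for slice_name, is_positive, predicted_positive in examples:
        if is_positive:
            positives[slice_name] += 1
            hits[slice_name] += int(predicted_positive)
    return {name: hits[name] / positives[name] for name in positives}


def global_recall(examples):
    """Recall over all positive examples, ignoring slices."""
    outcomes = [pred for _, is_pos, pred in examples if is_pos]
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


# Toy golden set: a high-volume slice and a low-volume critical slice.
golden_set = (
    [("order_status", True, True)] * 9
    + [("order_status", True, False)]
    + [("refund_dispute", True, True), ("refund_dispute", True, False)]
)

print(global_recall(golden_set))    # dominated by the large slice
print(recall_by_slice(golden_set))  # exposes the weak small slice
```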

Phase 4 - MangaAssist Release Gates

Gate	Why it matters
user-visible quality	chatbot must feel better, not just score better
business-weighted harm	protects high-risk routes
rare-class recall	protects low-volume critical cases
factuality	prevents catalog and policy hallucination
spoiler safety	protects the manga reading experience
latency	keeps chat responsive
cost	keeps the system viable
rollback	lets the team ship knowing a bad release can be reverted
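
The gates can be treated as an all-must-pass check in which a gate that was never measured counts as a failure. A minimal sketch under that assumption; release_decision is a hypothetical helper, and how each gate is scored is outside its scope.

```python
# Gate names taken from the table above.
REQUIRED_GATES = [
    "user-visible quality",
    "business-weighted harm",
    "rare-class recall",
    "factuality",
    "spoiler safety",
    "latency",
    "cost",
    "rollback",
]


def release_decision(gate_results: dict) -> tuple:
    """Return ("promote", []) or ("hold", [blocking gates]).

    A gate missing from gate_results is treated as failed (assumption).
    """
    blocking = [g for g in REQUIRED_GATES if not gate_results.get(g, False)]
    if blocking:
        return ("hold", blocking)
    return ("promote", [])


results = {gate: True for gate in REQUIRED_GATES}
results["spoiler safety"] = False
print(release_decision(results))
```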

Example End-To-End Dry Run

Problem:

Users asking for "iyashikei manga" get generic slice-of-life recommendations.

Execution:

1. Confirm intent is correct: recommendation.
2. Inspect retrieval: relevant titles missing from top 10.
3. Add editorial pairs for iyashikei, healing, calm, rural, and found-family themes.
4. Train embedding adapter with hard negatives from generic comedy slice-of-life.
5. Evaluate Recall@3 and NDCG@10 on production queries.
6. Fine-tune reranker only if relevant titles appear in top 50 but rank too low.
7. Add RAFT examples if final answer fails to explain why the titles match.
8. Shadow deploy and monitor click-through and no-click rate.
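
For reference, the two retrieval metrics named in the execution steps can be computed per query as below. This sketch assumes binary relevance labels and a ranked list without duplicates; the function names and the toy title IDs are illustrative. Recall@k here divides by the total number of relevant titles, which is one common convention.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of relevant items that appear in the top k results."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary gains and a log2 position discount."""
    relevant = set(relevant)
    dcg = sum(
        1.0 / math.log2(pos + 1)
        for pos, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    # Ideal ranking places every relevant item first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 1) for pos in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Toy query: two relevant titles, found at ranks 2 and 4.
ranked = ["title_a", "title_b", "title_c", "title_d"]
relevant = {"title_b", "title_d"}
print(recall_at_k(ranked, relevant, k=3))
print(ndcg_at_k(ranked, relevant, k=10))
```

Averaging these per-query values over the evaluation queries gives the figures compared in the decision step.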

Decision:

If Recall@3 improves from 0.62 to at least 0.78 and no-click rate falls,
promote the embedding adapter. If not, revisit data labels and hard negatives.
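
The decision rule can be written down as a small check so it is applied the same way after every run. A sketch only: adapter_decision is a hypothetical helper, and the no-click rates in the usage line are invented numbers, since the map does not state them.

```python
def adapter_decision(recall_at_3, no_click_before, no_click_after,
                     recall_target=0.78):
    """Apply the dry-run rule: both conditions must hold to promote."""
    recall_ok = recall_at_3 >= recall_target
    no_click_fell = no_click_after < no_click_before
    if recall_ok and no_click_fell:
        return "promote"
    return "revisit data labels and hard negatives"


# Hypothetical shadow-deploy readings.
print(adapter_decision(recall_at_3=0.80, no_click_before=0.31, no_click_after=0.27))
```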

Final Takeaway

The execution map keeps MangaAssist from overusing fine-tuning. Every intervention starts from a concrete production failure, chooses the smallest tool that can fix it, and promotes only through measured gates.