# Mixture of Experts Scenarios - MangaAssist
Mixture of Experts, or MoE, routes each request to one or more specialized expert models. For MangaAssist, MoE is useful when different traffic types need different behavior: manga recommendations, support policies, checkout help, and human escalation are not the same task.
## When This Topic Matters
Use MoE when one model shows conflicting behavior across workflows:
- recommendation responses need rich taste matching,
- support responses need strict procedure,
- escalation requires safe handoff language,
- genre-heavy queries need domain nuance,
- chitchat should stay short and not trigger expensive retrieval.
## Scenario 1 - Workflow Expert Routing
Experts:
| Expert | Trigger | Model behavior |
|---|---|---|
| catalog_reco | product discovery, recommendation | taste matching and title comparison |
| product_qa | product questions | metadata-grounded answers |
| commerce_support | order, return, checkout | policy-grounded procedural answers |
| escalation_safety | frustration or human handoff | empathetic escalation |
| smalltalk | chitchat | lightweight conversational response |
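The expert table can be mirrored in serving code as a simple dispatch map. This is a minimal sketch: the expert names come from the table above, but the handler bodies are hypothetical placeholders, not MangaAssist's real implementations.

```python
# Sketch: dispatch map from expert name to a handler stub.
# Expert names follow the table above; handler bodies are placeholders.
EXPERT_HANDLERS = {
    "catalog_reco": lambda q: f"[taste-matched recommendations for: {q}]",
    "product_qa": lambda q: f"[metadata-grounded answer for: {q}]",
    "commerce_support": lambda q: f"[policy-grounded procedure for: {q}]",
    "escalation_safety": lambda q: f"[empathetic handoff for: {q}]",
    "smalltalk": lambda q: f"[short reply to: {q}]",
}

def dispatch(expert, query):
    # Unknown expert names fall back to smalltalk rather than erroring,
    # which keeps a misconfigured router from failing the request outright.
    handler = EXPERT_HANDLERS.get(expert, EXPERT_HANDLERS["smalltalk"])
    return handler(query)
```

A real system would call model endpoints here; the fallback-to-cheapest-expert choice mirrors the document's preference for safe defaults.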
Router inputs:
- intent classifier probabilities,
- sentiment/frustration score,
- retrieval confidence,
- top-2 intent margin,
- business risk score.
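These inputs can be combined into a routing decision along the following lines. This is an illustrative sketch: the intent names, thresholds (0.7, 0.8, 0.15, 0.5, 0.3), and the intent-to-expert mapping are assumptions, not tuned production values.

```python
# Assumed mapping from classifier intents to experts (illustrative).
INTENT_TO_EXPERT = {
    "recommendation": "catalog_reco",
    "product_question": "product_qa",
    "order_support": "commerce_support",
    "chitchat": "smalltalk",
}

def route(intent_probs, frustration, retrieval_conf, business_risk):
    # Rank intents and compute the top-2 margin from the router inputs.
    ranked = sorted(intent_probs.items(), key=lambda kv: kv[1], reverse=True)
    top_intent, top_p = ranked[0]
    margin = top_p - (ranked[1][1] if len(ranked) > 1 else 0.0)

    # High business risk or frustration overrides intent routing entirely.
    if business_risk > 0.7 or frustration > 0.8:
        return "escalation_safety"

    # Ambiguous intent (small top-2 margin) or weak retrieval support
    # falls back to a safe default instead of a specialized expert.
    if margin < 0.15 or (top_intent == "product_question" and retrieval_conf < 0.5):
        return "commerce_support" if business_risk > 0.3 else "smalltalk"

    return INTENT_TO_EXPERT.get(top_intent, "smalltalk")
```

The override order matters: risk signals beat intent probabilities, and ambiguity beats specialization.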
Promotion gate:
| Metric | Gate |
|---|---|
| routing accuracy by workflow | >= 95% |
| expert load imbalance | no expert over 45% unless expected |
| high-risk misroute rate | <= 0.2% |
| cost per conversation | <= champion + 5% |
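The gate table translates directly into a promotion check. A minimal sketch, assuming the metric key names shown below; the thresholds are taken from the table.

```python
def passes_promotion_gates(metrics, champion_cost):
    # Sketch: check a challenger's metrics against the promotion gates
    # in the table above. Metric key names are assumed, thresholds are
    # the table's values. Returns a list of failures; empty means pass.
    failures = []
    if metrics["routing_accuracy"] < 0.95:
        failures.append("routing accuracy below 95%")
    if metrics["max_expert_load"] > 0.45 and not metrics.get("imbalance_expected"):
        failures.append("expert load over 45% without justification")
    if metrics["high_risk_misroute_rate"] > 0.002:
        failures.append("high-risk misroute rate above 0.2%")
    if metrics["cost_per_conversation"] > champion_cost * 1.05:
        failures.append("cost per conversation above champion + 5%")
    return failures
```

Returning the full failure list, rather than a bare boolean, makes gate reports actionable.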
## Scenario 2 - Genre Expert Adapters
For recommendation-heavy traffic, the router can select genre adapters:
- shonen,
- shojo,
- seinen,
- josei,
- horror,
- romance,
- slice-of-life,
- isekai.
Keep this behind a strong fallback. If genre confidence is low, use the general recommendation expert rather than forcing a niche expert.
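The fallback rule can be sketched as a small guard in front of adapter selection. The 0.75 confidence threshold and the `catalog_reco` fallback name are assumptions; the genre list comes from the bullets above.

```python
# Genre adapters from the list above.
GENRE_ADAPTERS = {"shonen", "shojo", "seinen", "josei",
                  "horror", "romance", "slice-of-life", "isekai"}

def pick_reco_expert(genre, genre_conf, threshold=0.75):
    # Sketch: only route to a niche genre adapter when the genre signal
    # is confident; otherwise fall back to the general recommendation
    # expert. The 0.75 threshold is an illustrative assumption.
    if genre in GENRE_ADAPTERS and genre_conf >= threshold:
        return f"{genre}_adapter"
    return "catalog_reco"
```

Unknown genres take the same fallback path as low-confidence ones, so a niche expert is never forced.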
## Scenario 3 - Sparse Expert Serving
MoE can reduce cost if only one or two experts run per request.
Decision rule (sketched as runnable Python; expert names follow Scenario 1):

```python
def select_experts(business_risk_high, intent_confident, genre_confident, genre=""):
    # Sparse serving: activate only the experts a request needs.
    if business_risk_high:
        return ["commerce_support", "escalation_safety"]
    if intent_confident and genre_confident:
        return ["catalog_reco", f"{genre}_adapter"]  # one genre expert
    return ["catalog_reco"]  # general recommendation expert, no adapter
```

## Failure Modes
| Failure | Detection | Fix |
|---|---|---|
| wrong expert selected | answer quality is fine but it belongs to the wrong workflow | improve the router and add fallback confidence thresholds |
| load collapse | one expert gets all traffic | add load balancing loss |
| expert disagreement | two experts conflict | add arbitration policy |
| too much complexity | ops overhead exceeds gain | use simpler adapter routing |
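For the load-collapse row, the standard fix is an auxiliary load-balancing loss of the kind used in sparse MoE training (Switch-Transformer style). A minimal numeric sketch, not MangaAssist's training code:

```python
def load_balancing_loss(expert_fractions, router_prob_means):
    # Sketch of the standard auxiliary load-balancing loss:
    #   N * sum_i f_i * P_i
    # where f_i is the fraction of requests routed to expert i and
    # P_i is the mean router probability assigned to expert i.
    # The loss is minimized (value 1.0) when both are uniform, and
    # grows as traffic collapses onto few experts.
    n = len(expert_fractions)
    return n * sum(f * p for f, p in zip(expert_fractions, router_prob_means))
```

Adding this term to the router's training objective penalizes the degenerate one-expert solution directly.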
## Production Log
```json
{
  "event": "moe_route",
  "intent": "recommendation",
  "genre_hint": "seinen",
  "selected_experts": ["catalog_reco", "seinen_adapter"],
  "router_confidence": 0.88,
  "business_risk": 0.12,
  "latency_ms": 917
}
```
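Route events in this shape feed the load-imbalance gate directly. A sketch of the aggregation, assuming one JSON event per log line with the fields shown above:

```python
import json
from collections import Counter

def expert_load(log_lines):
    # Sketch: aggregate moe_route events (one JSON object per line,
    # shaped like the sample above) into per-expert traffic shares,
    # to monitor the 45% load gate. Non-route events are skipped.
    counts = Counter()
    total = 0
    for line in log_lines:
        event = json.loads(line)
        if event.get("event") != "moe_route":
            continue
        for expert in event["selected_experts"]:
            counts[expert] += 1
        total += 1
    # Shares are per request, so multi-expert requests count once each.
    return {e: c / total for e, c in counts.items()} if total else {}
```

Because a request can activate several experts, shares can exceed 1.0 in sum; each expert's share is still its fraction of requests touched.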
## Final Decision
MoE is powerful for MangaAssist only after simpler routing, LoRA adapters, and calibrated intent probabilities are mature. Use it when specialization clearly improves quality without making the production system hard to reason about.