
Mixture of Experts Scenarios - MangaAssist

Mixture of Experts, or MoE, routes each request to one or more specialized expert models. For MangaAssist, MoE is useful when different traffic types need different behavior: manga recommendations, support policies, checkout help, and human escalation are not the same task.

When This Topic Matters

Use MoE when one model shows conflicting behavior across workflows:

  • recommendation responses need rich taste matching,
  • support responses need strict procedure,
  • escalation requires safe handoff language,
  • genre-heavy queries need domain nuance,
  • chitchat should stay short and not trigger expensive retrieval.

Scenario 1 - Workflow Expert Routing

Experts:

Expert              Trigger                            Model behavior
catalog_reco        product discovery, recommendation  taste matching and title comparison
product_qa          product questions                  metadata-grounded answers
commerce_support    order, return, checkout            policy-grounded procedural answers
escalation_safety   frustration or human handoff       empathetic escalation
smalltalk           chitchat                           lightweight conversational response

Router inputs:

  • intent classifier probabilities,
  • sentiment/frustration score,
  • retrieval confidence,
  • top-2 intent margin,
  • business risk score.
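A minimal sketch of a router that combines these inputs. The threshold values, intent labels, and fallback choices are illustrative assumptions for this document, not production settings:

```python
# Illustrative router sketch. Signal names, thresholds, and fallback
# choices are assumptions, not MangaAssist production values.

def route(intent_probs, frustration, retrieval_conf, business_risk):
    """Select experts from the router inputs listed above."""
    ranked = sorted(intent_probs.items(), key=lambda kv: kv[1], reverse=True)
    top_intent, top_p = ranked[0]
    margin = top_p - (ranked[1][1] if len(ranked) > 1 else 0.0)

    # High business risk or visible frustration overrides intent routing.
    if business_risk > 0.7 or frustration > 0.8:
        return ["escalation_safety"]
    # A thin top-2 margin means the classifier is unsure: take a safe default.
    if margin < 0.2 or top_p < 0.5:
        return ["smalltalk" if top_intent == "chitchat" else "catalog_reco"]
    if top_intent == "product_question":
        # Metadata-grounded answers need retrieval support.
        return ["product_qa"] if retrieval_conf >= 0.4 else ["catalog_reco"]
    table = {
        "recommendation": "catalog_reco",
        "support": "commerce_support",
        "chitchat": "smalltalk",
    }
    return [table.get(top_intent, "catalog_reco")]
```

The point of the sketch is the precedence order: risk and frustration first, classifier confidence second, intent mapping last.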

Promotion gate:

Metric                         Gate
routing accuracy by workflow   >= 95%
expert load imbalance          no expert over 45% unless expected
high-risk misroute rate        <= 0.2%
cost per conversation          <= champion + 5%
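The gate above can be checked mechanically before promotion. The metric key names and the metrics dict shape below are assumptions for illustration:

```python
# Sketch of the promotion gate check above. Metric key names and the
# input format are assumptions, not an actual MangaAssist schema.

def passes_promotion_gate(metrics, champion_cost):
    """Return True only if every gate in the table holds."""
    checks = [
        metrics["routing_accuracy"] >= 0.95,
        metrics["max_expert_load"] <= 0.45
            or metrics.get("imbalance_expected", False),
        metrics["high_risk_misroute_rate"] <= 0.002,
        metrics["cost_per_conversation"] <= champion_cost * 1.05,
    ]
    return all(checks)
```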

Scenario 2 - Genre Expert Adapters

For recommendation-heavy traffic, the router can select genre adapters:

  • shonen,
  • shojo,
  • seinen,
  • josei,
  • horror,
  • romance,
  • slice-of-life,
  • isekai.

Keep this behind a strong fallback. If genre confidence is low, use the general recommendation expert rather than forcing a niche expert.
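The fallback rule can be sketched as a single selection function; the 0.6 confidence threshold and the `_adapter` naming are assumptions for illustration:

```python
# Genre adapter selection with a strong fallback. The threshold and
# adapter naming convention are illustrative assumptions.

GENRE_ADAPTERS = {"shonen", "shojo", "seinen", "josei",
                  "horror", "romance", "slice-of-life", "isekai"}

def pick_reco_expert(genre, genre_confidence, threshold=0.6):
    """Use a niche adapter only when the genre signal is strong."""
    if genre in GENRE_ADAPTERS and genre_confidence >= threshold:
        return f"{genre}_adapter"
    # Low confidence or unknown genre: general recommendation expert.
    return "catalog_reco"
```

Note that an unknown genre label falls back the same way as low confidence, so the router never forces a niche expert.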

Scenario 3 - Sparse Expert Serving

MoE can reduce cost if only one or two experts run per request.

Decision rule:

def select_experts(business_risk_high, intent_confident, genre_confident):
    if business_risk_high:
        return ["commerce_support", "escalation_safety"]
    elif intent_confident and genre_confident:
        return ["genre_expert"]  # exactly one genre expert runs
    else:
        return ["general_expert"]

Failure Modes

Failure                 Detection                                    Fix
wrong expert selected   fluent answers in the wrong workflow         improve router, add fallback thresholds
load collapse           one expert receives nearly all traffic       add a load-balancing loss
expert disagreement     two experts give conflicting answers         add an arbitration policy
too much complexity     ops overhead exceeds the quality gain        fall back to simpler adapter routing
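The load-collapse fix is typically an auxiliary load-balancing loss added during router training, in the style used by sparse MoE models such as Switch Transformer. A pure-Python sketch (not MangaAssist's actual training code):

```python
# Auxiliary load-balancing loss sketch (Switch-Transformer style):
# penalizes routers that send almost all traffic to one expert.
# Pure-Python illustration; field names are assumptions.

def load_balancing_loss(router_probs, expert_assignments, num_experts):
    """router_probs: per-request list of per-expert probabilities.
    expert_assignments: chosen expert index per request."""
    n = len(router_probs)
    # f_i: fraction of requests actually routed to expert i
    f = [expert_assignments.count(i) / n for i in range(num_experts)]
    # p_i: mean router probability assigned to expert i
    p = [sum(r[i] for r in router_probs) / n for i in range(num_experts)]
    # loss = N * sum_i f_i * p_i; minimized when load is uniform
    return num_experts * sum(fi * pi for fi, pi in zip(f, p))
```

With perfectly uniform routing the loss is 1.0; it grows as traffic collapses onto a single expert, which is what gives the training signal against collapse.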

Production Log

{
  "event": "moe_route",
  "intent": "recommendation",
  "genre_hint": "seinen",
  "selected_experts": ["catalog_reco", "seinen_adapter"],
  "router_confidence": 0.88,
  "business_risk": 0.12,
  "latency_ms": 917
}
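A small sketch of emitting and sanity-checking such a log event. The required-field set follows the example above; the validation rule itself is an assumption:

```python
# Sketch of emitting the moe_route log event above. The field list
# mirrors the example; the validation policy is an assumption.
import json

REQUIRED_FIELDS = {"event", "intent", "selected_experts",
                   "router_confidence", "business_risk", "latency_ms"}

def emit_moe_route(record):
    """Serialize a moe_route event, rejecting incomplete records."""
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"moe_route log missing fields: {sorted(missing)}")
    return json.dumps(record, sort_keys=True)
```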

Final Decision

MoE is powerful for MangaAssist only after simpler routing, LoRA adapters, and calibrated intent probabilities are mature. Use it when specialization clearly improves quality without making the production system hard to reason about.