
Embedding Retrieval Scenarios - MangaAssist

This companion document turns the embedding fine-tuning topic into concrete MangaAssist scenarios. It assumes MangaAssist already uses a fine-tuned intent classifier for routing and then calls retrieval for product discovery, recommendations, product questions, and FAQ grounding.

When This Topic Matters

Use embedding fine-tuning when the chatbot understands the user's intent but retrieves the wrong manga, help article, or catalog cluster.

Symptom                          | Example user message                          | Likely failure
Similar titles are missed        | "dark fantasy like Berserk but less brutal"   | embedding space overweights popularity
Tone is missed                   | "cozy slice of life manga with found family"  | genre tags are too shallow
Japanese/English phrasing drifts | "iyashikei manga for stress relief"           | multilingual domain phrase not close to catalog tags
Support retrieval is weak        | "why did preorder shipping split my order?"   | FAQ article not near the query

Scenario 1 - Catalog Search Adapter

MangaAssist keeps Titan embeddings as the base encoder and trains a small projection adapter on top of 1024-dimensional vectors.

Training data:

Source             | Count  | Positive signal               | Negative signal
search clicks      | 18,000 | clicked product               | skipped but shown result
add-to-cart events | 4,000  | purchased or wishlisted title | high-rank non-click
editorial pairs    | 2,000  | curated similar title         | same broad genre, wrong tone
synthetic pairs    | 3,000  | generated from metadata       | filtered by reviewer

Training objective:

loss = -log( exp(sim(query, positive) / tau)
             / sum_i exp(sim(query, candidate_i) / tau) )
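This objective can be sketched in a few lines of numpy. This is a minimal single-query version, assuming sim() is cosine similarity (dot product on normalized vectors would also work); function and variable names are illustrative, not MangaAssist internals:

```python
import numpy as np

def info_nce_loss(query, positive, negatives, tau=0.07):
    """InfoNCE loss for one query against one positive and k negatives.

    query, positive: (d,) vectors; negatives: (k, d) matrix.
    """
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    # Scaled similarities; index 0 is the positive.
    sims = np.array([cos(query, positive)] +
                    [cos(query, n) for n in negatives]) / tau
    sims -= sims.max()  # log-sum-exp shift for numerical stability
    return -(sims[0] - np.log(np.exp(sims).sum()))
```

Lower tau sharpens the softmax, so a tau of 0.07 punishes hard negatives that sit close to the positive much more than distant ones.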

Recommended first run:

Setting        | Value
Adapter        | 1024 -> 512 -> 1024 MLP with LayerNorm
Loss           | InfoNCE
Temperature    | 0.07
Batch size     | 128
Hard negatives | 7 per query
Epochs         | 4
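The adapter shape from the table can be sketched as a forward pass in numpy. Weights here are random stand-ins for trained parameters, and the residual connection is an assumption (a common choice so an untrained adapter does not destroy the base Titan embedding):

```python
import numpy as np

rng = np.random.default_rng(0)

class ProjectionAdapter:
    """1024 -> 512 -> 1024 MLP with LayerNorm on top of a frozen base vector."""

    def __init__(self, dim=1024, hidden=512):
        # Random initialization; real weights come from InfoNCE training.
        self.w1 = rng.normal(0, dim ** -0.5, (dim, hidden))
        self.w2 = rng.normal(0, hidden ** -0.5, (hidden, dim))

    @staticmethod
    def layer_norm(x, eps=1e-5):
        return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + eps)

    def __call__(self, x):
        h = np.maximum(x @ self.w1, 0.0)   # ReLU hidden layer (1024 -> 512)
        out = self.layer_norm(h @ self.w2)  # project back to 1024 + LayerNorm
        return x + out                      # residual keeps the base signal
```

Because only the two small weight matrices are trained, the base encoder stays untouched and the added inference cost is a couple of matrix multiplies, which is how the adapter stays within the 2 ms latency budget below.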

Promotion gate:

Metric                   | Baseline | Candidate gate
Recall@3                 | 0.68     | >= 0.80
NDCG@10                  | 0.71     | >= 0.78
Bad top-1 rate           | 14%      | <= 8%
Embedding latency add-on | 0 ms     | <= 2 ms
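The gate can be encoded as a single boolean check so promotion is mechanical rather than judged per release. The metric keys and dict shape here are illustrative, not an existing MangaAssist schema:

```python
def passes_promotion_gate(metrics):
    """Return True only if the candidate adapter clears every gate
    from the table: Recall@3, NDCG@10, bad top-1 rate, latency add-on."""
    return (metrics["recall_at_3"] >= 0.80
            and metrics["ndcg_at_10"] >= 0.78
            and metrics["bad_top1_rate"] <= 0.08
            and metrics["latency_add_ms"] <= 2.0)
```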

Scenario 2 - Query Rewriting Feedback Loop

Some failures are caused by query wording, not the catalog vector. Keep a rejected-query bucket from the intent classifier and retrieval layer:

  • the top retrieval score falls below the threshold,
  • the user reformulates within 30 seconds,
  • the user clicks none of the top 10 results,
  • the query contains manga-specific slang or romanized Japanese.
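These triggers can be sketched as one predicate over a per-query event record. The score threshold, field names, and slang list are all assumptions for illustration, not production values:

```python
SLANG_TERMS = {"iyashikei", "isekai", "seinen", "josei"}  # illustrative, not exhaustive

def is_rejected_query(event):
    """Flag a query for the rejected bucket if any trigger fires.

    event: dict with top_score, reformulated_within_s (optional),
    clicked_top10, and tokens (lowercased query tokens).
    """
    return (event["top_score"] < 0.35                      # threshold is an assumption
            or event.get("reformulated_within_s", 999) <= 30
            or not event["clicked_top10"]
            or bool(SLANG_TERMS & set(event["tokens"])))
```

Any single trigger is enough to bucket the query; clustering then happens offline over the bucketed set.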

Cluster those failures and create new training pairs. Example cluster:

"slow burn rivals to lovers"
"enemies to lovers manga but not fantasy"
"rivals become couple manga"

Action: add positives from romance subgenre metadata and hard negatives from action-rivalry titles without romance.

Scenario 3 - Metadata-Aware Retrieval

For titles with sparse descriptions, create enriched product text:

title + author + demographic + genre + tone + themes + age rating + availability

Fine-tune the adapter on the enriched text, but evaluate on live user queries. This prevents the model from winning only on artificial metadata similarity.
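The enrichment template can be a simple field concatenation before embedding. The field names and separator below are assumptions about the catalog schema; missing fields are skipped so sparse titles still produce usable text:

```python
def enriched_text(product):
    """Concatenate catalog metadata into the enriched product text,
    following the title + author + ... + availability template."""
    fields = ["title", "author", "demographic", "genre", "tone",
              "themes", "age_rating", "availability"]
    return " | ".join(str(product[f]) for f in fields if product.get(f))
```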

Production Logs To Add

{
  "event": "retrieval_eval",
  "query": "dark fantasy like Berserk but less brutal",
  "intent": "recommendation",
  "top3_titles": ["Claymore", "Vinland Saga", "Vagabond"],
  "recall_at_3_hit": true,
  "adapter_version": "embed-adapter-v03",
  "latency_ms": 3.1
}
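A record like the one above can be assembled at eval time; recall_at_3_hit is true when any top-3 title is in the judged-relevant set. The function signature and relevance-set input are illustrative, not an existing MangaAssist API:

```python
import json

def retrieval_eval_event(query, intent, top3_titles, relevant_titles,
                         adapter_version, latency_ms):
    """Build the retrieval_eval log record as a JSON string."""
    return json.dumps({
        "event": "retrieval_eval",
        "query": query,
        "intent": intent,
        "top3_titles": top3_titles,
        # Hit if any retrieved top-3 title is judged relevant.
        "recall_at_3_hit": any(t in relevant_titles for t in top3_titles),
        "adapter_version": adapter_version,
        "latency_ms": latency_ms,
    })
```

Logging the adapter_version on every event is what makes baseline-vs-candidate comparisons in the promotion gate possible from production traffic alone.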

Failure Modes

Failure               | Detection                               | Fix
popularity collapse   | top results repeat best sellers         | add popularity-balanced negatives
synthetic drift       | synthetic queries outperform real logs  | cap synthetic share below 20%
metadata leakage      | eval positives share exact tags         | evaluate on held-out real queries
over-narrow retrieval | same subgenre only                      | add diverse positives within user taste

Final Decision

Ship an embedding adapter only if it improves real user retrieval without adding enough latency to hurt the 15 ms intent-routing path or downstream response time. For MangaAssist, Recall@3 and NDCG@10 matter more than raw embedding loss.