Embedding Retrieval Scenarios - MangaAssist
This companion document turns the embedding fine-tuning topic into concrete MangaAssist scenarios. It assumes MangaAssist already uses a fine-tuned intent classifier for routing and then calls retrieval for product discovery, recommendations, product questions, and FAQ grounding.
When This Topic Matters
Use embedding fine-tuning when the chatbot understands the user's intent but retrieves the wrong manga, help article, or catalog cluster.
| Symptom | Example user message | Likely failure |
|---|---|---|
| Similar titles are missed | "dark fantasy like Berserk but less brutal" | embedding space overweights popularity |
| Tone is missed | "cozy slice of life manga with found family" | genre tags are too shallow |
| Japanese/English phrasing drifts | "iyashikei manga for stress relief" | multilingual domain phrase not close to catalog tags |
| Support retrieval is weak | "why did preorder shipping split my order?" | FAQ article not near the query |
Scenario 1 - Catalog Search Adapter
MangaAssist keeps Titan embeddings as the base encoder and trains a small projection adapter on top of 1024-dimensional vectors.
Training data:
| Source | Count | Positive signal | Negative signal |
|---|---|---|---|
| search clicks | 18,000 | clicked product | skipped but shown result |
| add-to-cart events | 4,000 | purchased or wishlisted title | high-rank non-click |
| editorial pairs | 2,000 | curated similar title | same broad genre, wrong tone |
| synthetic pairs | 3,000 | generated from metadata | filtered by reviewer |
Training objective:
loss = -log( exp(sim(query, positive) / tau)
             / sum_i exp(sim(query, candidate_i) / tau) )
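The objective above is the standard InfoNCE contrastive loss. A minimal NumPy sketch, assuming unit-normalized embeddings so `sim` reduces to a dot product (the function name and shapes are illustrative, not from the production code):

```python
import numpy as np

def info_nce_loss(query, positive, negatives, tau=0.07):
    """InfoNCE: pull the positive toward the query, push negatives away.

    query:     (d,)   unit-normalized query embedding
    positive:  (d,)   unit-normalized positive embedding
    negatives: (n, d) unit-normalized hard-negative embeddings
    """
    candidates = np.vstack([positive, negatives])  # positive sits at index 0
    logits = candidates @ query / tau              # sim(query, c_i) / tau
    logits -= logits.max()                         # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                       # -log p(positive | candidates)
```

With tau = 0.07 the softmax is sharp, so a positive that is even slightly closer than the hard negatives drives the loss near zero; raising tau softens the gradient signal.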
Recommended first run:
| Setting | Value |
|---|---|
| Adapter | 1024 -> 512 -> 1024 MLP with LayerNorm |
| Loss | InfoNCE |
| Temperature | 0.07 |
| Batch size | 128 |
| Hard negatives | 7 per query |
| Epochs | 4 |
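The adapter row above fixes only the shape (1024 -> 512 -> 1024 with LayerNorm). A minimal forward-pass sketch; the residual connection and ReLU activation are wiring assumptions, and the random weights stand in for trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 1024, 512
W1, b1 = rng.normal(0, 0.02, (D, H)), np.zeros(H)  # placeholder for trained weights
W2, b2 = rng.normal(0, 0.02, (H, D)), np.zeros(D)

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def adapter(x):
    """1024 -> 512 -> 1024 MLP with LayerNorm on top of the Titan vector.
    The residual path (an assumption) keeps the base geometry recoverable."""
    h = np.maximum(0.0, x @ W1 + b1)   # ReLU bottleneck
    out = layer_norm(x + h @ W2 + b2)  # residual + LayerNorm
    return out / np.linalg.norm(out)   # re-normalize for cosine retrieval
```

Because the base Titan encoder stays frozen, only W1, b1, W2, b2 train, which is what keeps the latency add-on within the 2 ms gate.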
Promotion gate:
| Metric | Baseline | Candidate gate |
|---|---|---|
| Recall@3 | 0.68 | >= 0.80 |
| NDCG@10 | 0.71 | >= 0.78 |
| Bad top-1 rate | 14% | <= 8% |
| Embedding latency add-on | 0 ms | <= 2 ms |
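The gate is a conjunction: every threshold in the table must hold at once. A small sketch of the check, with metric key names chosen here for illustration:

```python
def passes_promotion_gate(metrics):
    """All candidate-gate thresholds from the table must hold simultaneously.
    Rates are fractions (0.08 == 8%), latency in milliseconds."""
    return (
        metrics["recall_at_3"] >= 0.80
        and metrics["ndcg_at_10"] >= 0.78
        and metrics["bad_top1_rate"] <= 0.08
        and metrics["latency_add_ms"] <= 2.0
    )
```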
Scenario 2 - Query Rewriting Feedback Loop
Some failures are caused by query wording, not the catalog vectors. Keep a rejected-query bucket fed by the intent classifier and the retrieval layer; add a query when any of the following holds:
- the top retrieval score falls below threshold,
- the user reformulates within 30 seconds,
- the user clicks none of the top 10 results,
- the message uses manga-specific slang or romanized Japanese.
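The bucketing rule above can be sketched as a single predicate. The score threshold and the slang list are hypothetical placeholders, not values from the source:

```python
SLANG_TERMS = {"iyashikei", "isekai", "seinen", "josei"}  # illustrative subset

def should_bucket(top_score, reformulated_within_s, clicked_top10, tokens,
                  score_threshold=0.35):
    """Flag a query for the rejected-query bucket when any signal fires.
    score_threshold=0.35 is a hypothetical default."""
    return (
        top_score < score_threshold
        or (reformulated_within_s is not None and reformulated_within_s <= 30)
        or not clicked_top10
        or any(t in SLANG_TERMS for t in tokens)
    )
```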
Cluster those failures and create new training pairs. Example cluster:
"slow burn rivals to lovers"
"enemies to lovers manga but not fantasy"
"rivals become couple manga"
Action: add positives from romance subgenre metadata and hard negatives from action-rivalry titles without romance.
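The cluster-to-pair action can be sketched as a small pair builder. The tag names (`romance`, `rivals`, `rivalry`) and catalog record shape are illustrative assumptions:

```python
def build_pairs(cluster_queries, catalog):
    """For each failed query in a cluster, emit (query, positive, hard_negative)
    triples: positives from romance-subgenre metadata, hard negatives from
    rivalry titles without romance. Tag names are illustrative."""
    positives = [t for t in catalog
                 if "romance" in t["genres"] and "rivals" in t["themes"]]
    hard_negatives = [t for t in catalog
                      if "rivalry" in t["themes"] and "romance" not in t["genres"]]
    return [(q, p["title"], n["title"])
            for q in cluster_queries
            for p in positives
            for n in hard_negatives]
```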
Scenario 3 - Metadata-Aware Retrieval
For titles with sparse descriptions, create enriched product text:
title + author + demographic + genre + tone + themes + age rating + availability
Fine-tune the adapter on the enriched text, but evaluate on live user queries. This prevents the model from winning only on artificial metadata similarity.
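A minimal sketch of the enrichment template, concatenating the fields listed above and skipping any that are missing (field names follow the template; the separator is an assumption):

```python
def enrich(product):
    """Build enriched retrieval text from product metadata.
    List-valued fields are sorted and joined for deterministic output."""
    fields = ["title", "author", "demographic", "genre", "tone",
              "themes", "age_rating", "availability"]
    parts = []
    for f in fields:
        v = product.get(f)
        if isinstance(v, (list, tuple, set)):
            v = ", ".join(sorted(v))
        if v:
            parts.append(str(v))
    return " | ".join(parts)
```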
Production Logs To Add
{
"event": "retrieval_eval",
"query": "dark fantasy like Berserk but less brutal",
"intent": "recommendation",
"top3_titles": ["Claymore", "Vinland Saga", "Vagabond"],
"recall_at_3_hit": true,
"adapter_version": "embed-adapter-v03",
"latency_ms": 3.1
}
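The `recall_at_3_hit` field in the log above can be computed at logging time. A minimal sketch, assuming a per-query relevance set is available from clicks or editorial labels:

```python
def recall_at_3_hit(top3_titles, relevant_titles):
    """True if any of the top-3 retrieved titles is in the relevance set."""
    return any(t in relevant_titles for t in top3_titles)
```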
Failure Modes
| Failure | Detection | Fix |
|---|---|---|
| popularity collapse | top results repeat best sellers | add popularity-balanced negatives |
| synthetic drift | synthetic queries outperform real logs | cap synthetic share below 20% |
| metadata leakage | eval positives share exact tags | evaluate on held-out real queries |
| over-narrow retrieval | same subgenre only | add diverse positives within user taste |
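The "top results repeat best sellers" detector in the first row can be approximated with a simple statistic over a sample of queries; the alerting threshold would be a tuning choice, not something the source specifies:

```python
from collections import Counter

def popularity_collapse_rate(result_lists):
    """Share of queries whose top-1 result is the single most common top-1
    title in the sample; values near 1.0 signal popularity collapse."""
    top1 = [r[0] for r in result_lists if r]
    if not top1:
        return 0.0
    _, count = Counter(top1).most_common(1)[0]
    return count / len(top1)
```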
Final Decision
Ship an embedding adapter only if it improves retrieval on real user queries without adding latency that threatens the 15 ms intent-routing budget or downstream response time. For MangaAssist, Recall@3 and NDCG@10 matter more than raw embedding loss.