Catalog Search MCP — Manga Title & Metadata Retrieval
Purpose
The Catalog Search MCP is the chatbot's primary discovery surface. It lets Claude search 5M+ manga titles by title, genre, author, publisher, demographic, and content tags with sub-second latency.
Exposed Tools
| Tool | Input Schema | Output | Use Case |
|---|---|---|---|
| search_manga | query, filters, limit | MangaList | Free-text search |
| get_manga_details | manga_id | MangaDetail | Full metadata for a specific title |
| filter_by_genre | genres[], demographics[], content_rating | MangaList | Genre-scoped browse |
| get_series_volumes | series_id | VolumeList | All volumes in a series |
| search_by_author | author_name, limit | MangaList | Author bibliography |
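For illustration, a search_manga call and a trimmed result might look like the following (sample values are invented; field names follow the table and the server code later in this document):

```json
{
  "tool": "search_manga",
  "input": {
    "query": "dark fantasy action",
    "filters": { "content_rating": "mature" },
    "limit": 5
  },
  "result": {
    "results": [
      {
        "manga_id": "m-001",
        "title_en": "Berserk",
        "author": "Kentaro Miura",
        "genres": ["dark_fantasy", "action"],
        "cover_url": "https://example.com/covers/m-001.jpg"
      }
    ],
    "total_found": 20,
    "query_interpreted": "dark fantasy action"
  }
}
```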
RAG Pipeline
flowchart TD
A([Tool Call: search_manga\nquery='dark fantasy action']) --> B[Query Preprocessor\nLanguage detect · Romanisation · Synonym expand]
B --> C[Embed Query\nTitan Embed v2 · 1024-dim]
C --> D{Hybrid Retrieval}
D --> D1[Dense ANN Search\nOpenSearch k-NN · FAISS HNSW]
D --> D2[BM25 Sparse Search\nTitle · Author · Tags]
D1 --> E[RRF Score Fusion]
D2 --> E
E --> F[Metadata Filter Post-processing\ncontent_rating · demographic · language]
F --> G[Cross-Encoder Rerank\nBGE-reranker-v2-m3]
G --> H[Top-5 Results\nStructured MangaList]
H --> I([Tool Result → Claude])
style A fill:#4A90D9,color:#fff
style I fill:#27AE60,color:#fff
style D fill:#8E44AD,color:#fff
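The RRF score-fusion step in the diagram can be sketched as follows. This is a minimal standalone version: the production service fuses OpenSearch result lists, and k=60 is the conventional RRF constant rather than a value taken from this document.

```python
def rrf_fuse(dense_ids: list[str], sparse_ids: list[str], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranked in (dense_ids, sparse_ids):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# A doc ranked well in both lists beats one ranked well in only one:
fused = rrf_fuse(["berserk", "vagabond", "vinland"],
                 ["berserk", "claymore", "vagabond"])
```

RRF needs only ranks, not raw scores, which is why it is a popular fusion choice when dense and BM25 scores live on incomparable scales.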
OpenSearch Index Schema
{
"mappings": {
"properties": {
"manga_id": { "type": "keyword" },
"title_ja": { "type": "text", "analyzer": "kuromoji" },
"title_en": { "type": "text", "analyzer": "english" },
"title_romaji": { "type": "text", "analyzer": "standard" },
"author": { "type": "text" },
"genres": { "type": "keyword" },
"demographic": { "type": "keyword" },
"content_rating": { "type": "keyword" },
"synopsis": { "type": "text", "analyzer": "english" },
"volume_count": { "type": "integer" },
"status": { "type": "keyword" },
"publisher": { "type": "keyword" },
"embedding": { "type": "knn_vector", "dimension": 1024,
"method": { "name": "hnsw", "engine": "faiss",
"parameters": { "m": 16, "ef_construction": 256 } } }
}
}
}
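Under this schema, the dense leg of the hybrid retrieval is a k-NN query against the embedding field, and the sparse leg is a multi_match over the text fields. A sketch of the two query bodies (the helper name and the field boosts are illustrative, not from this document):

```python
def build_hybrid_queries(embedding: list[float], text: str, k: int = 20) -> tuple[dict, dict]:
    """Return (dense, sparse) OpenSearch query bodies for one hybrid search."""
    dense = {
        "size": k,
        # OpenSearch k-NN plugin query shape: knn -> field -> {vector, k}
        "query": {"knn": {"embedding": {"vector": embedding, "k": k}}},
    }
    sparse = {
        "size": k,
        "query": {
            "multi_match": {
                "query": text,
                # Boosts are illustrative; tune against click-through data
                "fields": ["title_en^3", "title_romaji^3", "title_ja^3",
                           "author^2", "synopsis"],
            }
        },
    }
    return dense, sparse

dense, sparse = build_hybrid_queries([0.1] * 1024, "dark fantasy action")
```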
Embedding Strategy
The catalog MCP embeds a composite field, not just the title:
def build_embed_text(manga: Manga) -> str:
return (
f"{manga.title_en} {manga.title_romaji}. "
f"By {manga.author}. "
f"Genres: {', '.join(manga.genres)}. "
f"Demographic: {manga.demographic}. "
f"{manga.synopsis[:300]}"
)
Why composite? A user searching "samurai revenge story" won't match "Vagabond" by title alone. Embedding the synopsis + tags captures semantic intent.
Multilingual Query Handling
flowchart LR
Q([Raw Query]) --> LD{Language\nDetect}
LD -->|Japanese| RO[Romanise\nkuromoji → romaji]
LD -->|English| EN[Synonym expand\nBerserk→dark_fantasy]
LD -->|Korean| KO[Transliterate\n한글 → romaji]
RO --> EB[Embed unified text]
EN --> EB
KO --> EB
EB --> OS[(OpenSearch\nmultilingual index)]
style Q fill:#4A90D9,color:#fff
style OS fill:#E67E22,color:#fff
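The language-detect branch can be approximated with Unicode script ranges before falling back to a full detector. This is a simplified sketch; the production preprocessor presumably uses a real detection library.

```python
def detect_script(query: str) -> str:
    """Classify a query as 'ja', 'ko', or 'en' by dominant Unicode script."""
    for ch in query:
        cp = ord(ch)
        if 0x3040 <= cp <= 0x30FF:   # Hiragana / Katakana: unambiguously Japanese
            return "ja"
        if 0xAC00 <= cp <= 0xD7A3:   # Hangul syllables: Korean
            return "ko"
        if 0x4E00 <= cp <= 0x9FFF:   # CJK ideographs: ambiguous, treated as Japanese here
            return "ja"
    return "en"                       # default: Latin / everything else
```

The hard case this sketch dodges is kanji-only queries, which are valid in both Japanese and Chinese; the fallback row in the failure-mode table below (union across analyzers) exists for exactly that ambiguity.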
MCP Server Implementation (Python)
from mcp.server import Server
from mcp.types import Tool, TextContent
import boto3

app = Server("catalog-mcp")
bedrock = boto3.client("bedrock-runtime", region_name="ap-northeast-1")
# OpenSearchClient, CATALOG_OS_ENDPOINT, embed, rerank, and manga_to_dict
# are internal helpers defined elsewhere in the service.
opensearch = OpenSearchClient(endpoint=CATALOG_OS_ENDPOINT)
@app.tool()
async def search_manga(query: str, filters: dict | None = None, limit: int = 5) -> dict:
    """
    Search the MangaAssist catalog by free-text query.
    Supports English, Japanese (romaji), and Korean queries.
    Returns ranked manga list with title, author, genres, cover_url, and manga_id.
    """
    filters = filters or {}  # avoid the mutable-default-argument pitfall
    # 1. Embed
    embedding = embed(query, bedrock)
    # 2. Hybrid search
    results = opensearch.hybrid_search(
        index="manga-catalog",
        embedding=embedding,
        bm25_query=query,
        filters=filters,
        k=20,
    )
# 3. Rerank
reranked = rerank(query, results, top_n=limit)
# 4. Format
return {
"results": [manga_to_dict(r) for r in reranked],
"total_found": len(results),
"query_interpreted": query,
}
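The rerank step referenced above can be sketched with the cross-encoder call abstracted behind a scoring function. The score_fn parameter is an assumption of this sketch: it stands in for the BGE-reranker endpoint so the ranking logic stays testable without the model.

```python
from typing import Callable, Sequence

def rerank(query: str, candidates: Sequence[dict], top_n: int,
           score_fn: Callable[[str, str], float]) -> list[dict]:
    """Score each (query, synopsis) pair with a cross-encoder and keep top_n.

    score_fn abstracts the cross-encoder call (e.g. a SageMaker endpoint);
    the candidates keep their original dict shape.
    """
    scored = [(score_fn(query, c.get("synopsis", "")), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_n]]
```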
Cache Layer
flowchart LR
TC([Tool Call]) --> CH{ElastiCache\nCache hit?}
CH -->|HIT TTL=300s| CR([Cached Result])
CH -->|MISS| ES[OpenSearch\nRAG pipeline]
ES --> WC[Write to cache\nkey=hash query+filters]
WC --> FR([Fresh Result])
style TC fill:#4A90D9,color:#fff
style CR fill:#27AE60,color:#fff
style FR fill:#27AE60,color:#fff
Cache key: sha256(f"{query}|{sorted_filters}|{limit}") — deterministic across users for popular searches.
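The cache key described above can be made deterministic by canonicalising the filters before hashing (a direct sketch of the stated formula):

```python
import hashlib
import json

def cache_key(query: str, filters: dict, limit: int) -> str:
    # sort_keys makes two dicts with the same content hash identically
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
    raw = f"{query}|{canonical}|{limit}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

# Filter insertion order must not change the key:
k1 = cache_key("dark fantasy", {"genres": ["action"], "lang": "en"}, 5)
k2 = cache_key("dark fantasy", {"lang": "en", "genres": ["action"]}, 5)
assert k1 == k2
```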
Failure Modes & Mitigations
| Failure | Root Cause | Mitigation |
|---|---|---|
| Zero results | Over-narrow filters | Auto-broaden: drop the most restrictive filter, retry |
| Stale embedding | Schema change without re-index | Index versioned with alias; blue/green re-index |
| Language misdetect | Short query (<3 chars) | Fallback: try all three language analyzers, union results |
| HNSW recall drop | m/ef_search tuned too low for memory savings | Monitor recall@10 in CloudWatch; auto-alert if <0.85 |
| Cold start | ECS task scale-out | Minimum 2 tasks always warm; embedding service on provisioned concurrency |
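The zero-results mitigation in the table above can be sketched as a retry loop that removes one filter per attempt. The drop order and the injected search callable are assumptions of this sketch, not details from the document.

```python
from typing import Callable

def search_with_broadening(query: str, filters: dict,
                           search: Callable[[str, dict], list]) -> list:
    """Retry a zero-result search, removing one filter per attempt."""
    active = dict(filters)           # don't mutate the caller's filters
    results = search(query, active)
    # Drop filters one by one until something matches or none remain
    while not results and active:
        active.pop(next(iter(active)))
        results = search(query, active)
    return results
```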
Latency Breakdown (P99)
gantt
title Catalog MCP P99 Latency Budget (500ms pipeline within 800ms end-to-end)
dateFormat X
axisFormat %Lms
section Pipeline
Query preprocess :0, 20
Titan Embed v2 :20, 70
OpenSearch hybrid :70, 270
RRF fusion :270, 300
BGE rerank :300, 400
Format + serialize :400, 420
Network overhead :420, 500
Interview Grill
Q: Why HNSW and not IVF-PQ for 5M vectors?
A: At 5M docs, HNSW fits comfortably in OpenSearch's JVM heap (~40GB cluster). IVF-PQ reduces memory further but requires training a codebook and degrades recall. We chose HNSW + ef_search=128 for recall@10 > 0.95 without tuning complexity.
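That memory claim can be sanity-checked with back-of-envelope arithmetic: raw float32 vectors alone cost 5M × 1024 × 4 bytes, and HNSW adds roughly 2 × m neighbour links (4-byte ids) per vector at the base layer. These are rough estimates, not measured figures from the cluster.

```python
docs, dims, bytes_per_float, m = 5_000_000, 1024, 4, 16

vectors_gb = docs * dims * bytes_per_float / 1e9   # raw float32 vectors
links_gb = docs * 2 * m * 4 / 1e9                  # approx HNSW neighbour lists
print(round(vectors_gb, 2), round(links_gb, 2))
```

Roughly 20.5GB of vectors plus well under 1GB of graph links comfortably fits a ~40GB heap budget, which is the point of the answer above.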
Q: How do you handle a manga that has been re-titled for different markets?
A: All aliases (JP, EN, Romaji, regional titles) are stored in an aliases[] field and included in the BM25 text field. The embedding is built from the primary title + synopsis. Alias matches score via BM25; semantic matches via dense vector.
Q: What happens when OpenSearch is degraded?
A: The MCP server has a circuit breaker (Resilience4j pattern). On three consecutive timeouts, it falls back to DynamoDB's GSI-title-index for exact-match lookup and returns a degraded response with "source": "fallback" in the metadata.
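The circuit breaker described in this answer can be sketched in a few lines. The three-failure threshold comes from the answer above; the fallback callable stands in for the DynamoDB exact-match lookup, and the class shape itself is illustrative.

```python
class CircuitBreaker:
    """Open after `threshold` consecutive failures; route to fallback while open."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    @property
    def open(self) -> bool:
        return self.failures >= self.threshold

    def call(self, primary, fallback, *args):
        if self.open:
            return {**fallback(*args), "source": "fallback"}
        try:
            result = primary(*args)
            self.failures = 0          # any success closes the breaker again
            return result
        except TimeoutError:
            self.failures += 1
            if self.open:              # this failure tripped the breaker
                return {**fallback(*args), "source": "fallback"}
            raise
```

A production breaker would also add a half-open state with a cool-down timer before probing OpenSearch again; that is omitted here for brevity.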
Q: How do you prevent the reranker from being the bottleneck?
A: BGE-reranker runs on a SageMaker real-time endpoint. We rerank only top-20 candidates, not the full result set. P99 reranker latency is 100ms. If it exceeds 200ms (CloudWatch alarm), we skip reranking and return BM25+dense-fused scores directly.