
Catalog Search MCP — Manga Title & Metadata Retrieval

Purpose

The Catalog Search MCP is the chatbot's primary discovery surface. It lets Claude search a catalog of 5M+ manga titles by title, genre, author, publisher, demographic, and content tags with sub-second latency.


Exposed Tools

| Tool | Input Schema | Output | Use Case |
| --- | --- | --- | --- |
| search_manga | query, filters, limit | MangaList | Free-text search |
| get_manga_details | manga_id | MangaDetail | Full metadata for a specific title |
| filter_by_genre | genres[], demographics[], content_rating | MangaList | Genre-scoped browse |
| get_series_volumes | series_id | VolumeList | All volumes in a series |
| search_by_author | author_name, limit | MangaList | Author bibliography |

RAG Pipeline

flowchart TD
    A([Tool Call: search_manga\nquery='dark fantasy action']) --> B[Query Preprocessor\nLanguage detect · Romanisation · Synonym expand]
    B --> C[Embed Query\nTitan Embed v2 · 1024-dim]
    C --> D{Hybrid Retrieval}
    D --> D1[Dense ANN Search\nOpenSearch k-NN · FAISS HNSW]
    D --> D2[BM25 Sparse Search\nTitle · Author · Tags]
    D1 --> E[RRF Score Fusion]
    D2 --> E
    E --> F[Metadata Filter Post-processing\ncontent_rating · demographic · language]
    F --> G[Cross-Encoder Rerank\nBGE-reranker-v2-m3]
    G --> H[Top-5 Results\nStructured MangaList]
    H --> I([Tool Result → Claude])

    style A fill:#4A90D9,color:#fff
    style I fill:#27AE60,color:#fff
    style D fill:#8E44AD,color:#fff
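The RRF fusion step in the diagram can be sketched as a small pure function. The function name and the k=60 smoothing constant are illustrative (k=60 is the value from the original RRF paper, not necessarily what production uses):

```python
def rrf_fuse(dense_ranked: list[str], sparse_ranked: list[str], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over result lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranked in (dense_ranked, sparse_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; docs present in both lists rise to the top
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked moderately in both the dense and sparse lists will outscore one ranked first in only a single list, which is exactly why RRF is a good fit for hybrid retrieval.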

OpenSearch Index Schema

{
  "mappings": {
    "properties": {
      "manga_id":        { "type": "keyword" },
      "title_ja":        { "type": "text", "analyzer": "kuromoji" },
      "title_en":        { "type": "text", "analyzer": "english" },
      "title_romaji":    { "type": "text", "analyzer": "standard" },
      "author":          { "type": "text" },
      "genres":          { "type": "keyword" },
      "demographic":     { "type": "keyword" },
      "content_rating":  { "type": "keyword" },
      "synopsis":        { "type": "text", "analyzer": "english" },
      "volume_count":    { "type": "integer" },
      "status":          { "type": "keyword" },
      "publisher":       { "type": "keyword" },
      "embedding":       { "type": "knn_vector", "dimension": 1024,
                           "method": { "name": "hnsw", "engine": "faiss",
                                       "parameters": { "m": 16, "ef_construction": 256 } } }
    }
  }
}

Embedding Strategy

The catalog MCP embeds a composite field, not just the title:

def build_embed_text(manga: Manga) -> str:
    return (
        f"{manga.title_en} {manga.title_romaji}. "
        f"By {manga.author}. "
        f"Genres: {', '.join(manga.genres)}. "
        f"Demographic: {manga.demographic}. "
        f"{manga.synopsis[:300]}"
    )

Why composite? A user searching "samurai revenge story" won't match "Vagabond" by title alone. Embedding the synopsis + tags captures semantic intent.
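A minimal sketch of the embed step against Bedrock's Titan Embed v2. The inputText/dimensions request fields and the embedding response field follow the Titan Text Embeddings V2 API, but treat the model id string as an assumption to verify against your region:

```python
import json

def build_titan_request(text: str, dimensions: int = 1024) -> str:
    """Serialize the Titan Embed v2 request body (1024-dim to match the index)."""
    return json.dumps({"inputText": text, "dimensions": dimensions})

def embed(text: str, bedrock_client) -> list[float]:
    """Invoke Titan Embed v2 via the bedrock-runtime client and return the vector."""
    response = bedrock_client.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",  # assumed model id; confirm per region
        body=build_titan_request(text),
    )
    return json.loads(response["body"].read())["embedding"]
```

The composite text from build_embed_text above is what gets passed in here at index time; at query time the raw user query goes through the same function so both live in the same vector space.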


Multilingual Query Handling

flowchart LR
    Q([Raw Query]) --> LD{Language\nDetect}
    LD -->|Japanese| RO[Romanise\nkuromoji → romaji]
    LD -->|English| EN[Synonym expand\nBerserk→dark_fantasy]
    LD -->|Korean| KO[Transliterate\n한글 → romaji]
    RO --> EB[Embed unified text]
    EN --> EB
    KO --> EB
    EB --> OS[(OpenSearch\nmultilingual index)]

    style Q fill:#4A90D9,color:#fff
    style OS fill:#E67E22,color:#fff
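For short queries, the Language Detect node can be approximated with Unicode script-range heuristics; a sketch (a real deployment would back this with a proper language-ID model, and note that the CJK ideograph range is shared with Chinese, which this crude version labels as Japanese):

```python
def detect_lang(query: str) -> str:
    """Crude script-based detection: Hangul -> ko, kana/kanji -> ja, else en."""
    for ch in query:
        cp = ord(ch)
        if 0xAC00 <= cp <= 0xD7AF:  # Hangul syllables
            return "ko"
        if 0x3040 <= cp <= 0x30FF or 0x4E00 <= cp <= 0x9FFF:  # kana + CJK ideographs
            return "ja"
    return "en"
```

This is also where the "short query" failure mode from the mitigations table bites: a two-character romaji query carries no script signal, so the fallback is to run all three analyzers and union the results.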

MCP Server Implementation (Python)

from mcp.server import Server
from mcp.types import Tool, TextContent
import boto3

# OpenSearchClient, embed, rerank, manga_to_dict, and CATALOG_OS_ENDPOINT are
# project helpers/config assumed to be defined elsewhere in the server package.
app = Server("catalog-mcp")
bedrock = boto3.client("bedrock-runtime", region_name="ap-northeast-1")
opensearch = OpenSearchClient(endpoint=CATALOG_OS_ENDPOINT)

@app.tool()
async def search_manga(query: str, filters: dict | None = None, limit: int = 5) -> dict:
    """
    Search the MangaAssist catalog by free-text query.
    Supports English, Japanese (romaji), and Korean queries.
    Returns a ranked manga list with title, author, genres, cover_url, and manga_id.
    """
    filters = filters or {}  # avoid a shared mutable default argument

    # 1. Embed
    embedding = embed(query, bedrock)

    # 2. Hybrid search
    results = opensearch.hybrid_search(
        index="manga-catalog",
        embedding=embedding,
        bm25_query=query,
        filters=filters,
        k=20,
    )

    # 3. Rerank
    reranked = rerank(query, results, top_n=limit)

    # 4. Format
    return {
        "results": [manga_to_dict(r) for r in reranked],
        "total_found": len(results),
        "query_interpreted": query,
    }

Cache Layer

flowchart LR
    TC([Tool Call]) --> CH{ElastiCache\nCache hit?}
    CH -->|HIT TTL=300s| CR([Cached Result])
    CH -->|MISS| ES[OpenSearch\nRAG pipeline]
    ES --> WC[Write to cache\nkey=hash query+filters]
    WC --> FR([Fresh Result])

    style TC fill:#4A90D9,color:#fff
    style CR fill:#27AE60,color:#fff
    style FR fill:#27AE60,color:#fff

Cache key: sha256(f"{query}|{sorted_filters}|{limit}") — deterministic across users for popular searches.
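The cache key can be sketched so that filter ordering cannot produce distinct keys for the same logical search; serializing filters with sorted keys is the one detail the one-liner above glosses over:

```python
import hashlib
import json

def cache_key(query: str, filters: dict, limit: int) -> str:
    """Deterministic key: identical query + filters + limit always hash identically."""
    sorted_filters = json.dumps(filters, sort_keys=True)  # order-insensitive
    raw = f"{query}|{sorted_filters}|{limit}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Because the key has no user component, popular searches ("one piece", "dark fantasy") hit the same 300s-TTL entry across all users.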


Failure Modes & Mitigations

| Failure | Root Cause | Mitigation |
| --- | --- | --- |
| Zero results | Over-narrow filters | Auto-broaden: drop least-restrictive filter, retry |
| Stale embedding | Schema change without re-index | Versioned index behind an alias; blue/green re-index |
| Language misdetect | Short query (<3 chars) | Fallback: run all three language analyzers, union results |
| HNSW recall drop | m/ef tuned down for memory savings | Monitor recall@10 in CloudWatch; auto-alert if <0.85 |
| Cold start | ECS task scale-out | Keep minimum 2 tasks warm; embedding service on provisioned concurrency |
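The auto-broaden mitigation, sketched as a loop that drops one filter at a time and retries. The drop order shown ("least restrictive first") is an assumed heuristic, not a documented one:

```python
def search_with_broadening(search_fn, query: str, filters: dict) -> list:
    """Retry with progressively fewer filters until results appear.

    search_fn(query, filters) -> list stands in for the real hybrid search;
    the drop order below is a hypothetical least-restrictive-first heuristic.
    """
    drop_order = ("content_rating", "demographic", "genres")
    remaining = dict(filters)  # never mutate the caller's filters
    results = search_fn(query, remaining)
    for field in drop_order:
        if results:
            break
        if field in remaining:
            del remaining[field]
            results = search_fn(query, remaining)
    return results
```

The loop bounds retries at one search per dropped filter, so a zero-result query costs at most four OpenSearch round trips.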

Latency Breakdown (P99)

gantt
    title Catalog MCP P99 Latency Breakdown (500ms spent of 800ms budget)
    dateFormat  X
    axisFormat  %Lms

    section Pipeline
    Query preprocess    :0, 20
    Titan Embed v2      :20, 70
    OpenSearch hybrid   :70, 270
    RRF fusion          :270, 300
    BGE rerank          :300, 400
    Format + serialize  :400, 420
    Network overhead    :420, 500

Interview Grill

Q: Why HNSW and not IVF-PQ for 5M vectors? A: At 5M docs, the HNSW graphs fit comfortably in the cluster's off-heap native memory (~40GB cluster); OpenSearch k-NN keeps faiss structures outside the JVM heap. IVF-PQ reduces memory further but requires training a codebook and degrades recall. We chose HNSW + ef_search=128 for recall@10 > 0.95 without the extra tuning complexity.
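For reference, ef_search is an index-level setting in the OpenSearch k-NN plugin (set alongside index.knn: true); a sketch of the settings payload, with the caveat that you should verify the setting applies to the faiss engine in your OpenSearch version:

```python
def knn_index_settings(ef_search: int = 128) -> dict:
    """Index settings enabling k-NN and setting the query-time HNSW beam width."""
    return {
        "settings": {
            "index.knn": True,
            "index.knn.algo_param.ef_search": ef_search,  # larger = better recall, slower
        }
    }
```

Raising ef_search trades query latency for recall, which is why the recall@10 CloudWatch alarm and the latency budget above have to be tuned together.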

Q: How do you handle a manga that has been re-titled for different markets? A: All aliases (JP, EN, Romaji, regional titles) are stored in an aliases[] field and included in the BM25 text field. The embedding is built from the primary title + synopsis. Alias matches score via BM25; semantic matches via dense vector.

Q: What happens when OpenSearch is degraded? A: The MCP server has a circuit breaker (Resilience4j pattern). On three consecutive timeouts, it falls back to DynamoDB's GSI-title-index for exact-match lookup and returns a degraded response with "source": "fallback" in the metadata.
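The circuit breaker described in that answer, as a minimal sketch: the three-consecutive-timeouts threshold comes from the text, while the class shape is illustrative and the DynamoDB fallback is stubbed (a production breaker would also half-open after a cooldown to probe recovery):

```python
class CatalogCircuitBreaker:
    """Open after N consecutive timeouts; route to the fallback while open."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    @property
    def open(self) -> bool:
        return self.failures >= self.threshold

    def call(self, primary, fallback, *args) -> dict:
        if self.open:
            # Breaker open: skip OpenSearch entirely, serve degraded result
            return {**fallback(*args), "source": "fallback"}
        try:
            result = primary(*args)
        except TimeoutError:
            self.failures += 1
            return {**fallback(*args), "source": "fallback"}
        self.failures = 0  # any success closes the breaker
        return {**result, "source": "primary"}
```

Tagging every response with "source" lets Claude (and monitoring) distinguish full hybrid-search answers from exact-match fallback answers.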

Q: How do you prevent the reranker from being the bottleneck? A: BGE-reranker runs on a SageMaker real-time endpoint. We rerank only top-20 candidates, not the full result set. P99 reranker latency is 100ms. If it exceeds 200ms (CloudWatch alarm), we skip reranking and return BM25+dense-fused scores directly.