
Cross-Title Link MCP — Graph-Based "More Like This" Recommendations

Purpose

Answers relational queries: "If you liked Berserk, you'll like…", "Other works by this author", "Read the series in order", "Titles in the same universe". Uses a Neptune graph database for traversal-based discovery combined with OpenSearch for semantic similarity — a graph + RAG hybrid.


Exposed Tools

| Tool | Input | Output | Use Case |
|---|---|---|---|
| get_similar_titles | manga_id, limit, basis? | SimilarList | Semantic/graph similarity |
| get_author_other_works | author_name | AuthorBibliography | Same creator's catalog |
| get_series_order | series_id | ReadingOrder | Correct reading sequence |
| get_shared_universe | manga_id | UniverseMap | Cross-title same-universe |
| get_sequel_prequel | manga_id | SeriesChain | Sequels, prequels, spin-offs |
| compare_titles | manga_ids[] | ComparisonSummary | Side-by-side title comparison |

Graph + RAG Architecture

flowchart TD
    TC([Tool Call: get_similar_titles\nmanga_id='BERSERK' basis='theme']) --> GP[Graph Path\nNeptune traversal]
    TC --> SP[Semantic Path\nOpenSearch kNN]

    GP --> NT[Neptune\nWHERE genre IN shared_genres\n2-hop author/publisher links]
    NT --> GC[Graph Candidates\nwith relationship types]

    SP --> EB[Embed manga synopsis\nTitan Embed v2]
    EB --> OS[(OpenSearch\nCatalog kNN)]
    OS --> SC[Semantic Candidates]

    GC --> FM[Fusion Mixer\ngraph_score × semantic_score]
    SC --> FM
    FM --> RK[Cross-Encoder Rerank]
    RK --> DE[Diversity Filter\nno 2 same author in top-5]
    DE --> TR([Tool Result → Claude])

    style TC fill:#4A90D9,color:#fff
    style TR fill:#27AE60,color:#fff
    style FM fill:#8E44AD,color:#fff
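The fusion and diversity stages in the diagram can be sketched roughly as follows. This is an illustration under stated assumptions, not the service's implementation: the `Candidate` type, the function names `fuse` and `diversify`, and the `floor` value used when a title appears in only one path are all hypothetical. The diagram itself only specifies a product of the two scores and a cap on same-author titles in the top 5.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    manga_id: str
    author: str
    score: float


def fuse(graph: dict[str, float], semantic: dict[str, float],
         floor: float = 0.05) -> dict[str, float]:
    """Multiply graph and semantic scores per title.

    `floor` is an assumed stand-in for a score missing from one path, so a
    title found by only one path is not zeroed out by the product.
    """
    ids = graph.keys() | semantic.keys()
    return {i: graph.get(i, floor) * semantic.get(i, floor) for i in ids}


def diversify(ranked: list[Candidate], top_n: int = 5,
              max_per_author: int = 1) -> list[Candidate]:
    """Cap how many titles per author may occupy the first `top_n` slots.

    Titles that exceed the cap are pushed below the head, keeping their
    original relative order.
    """
    head: list[Candidate] = []
    deferred: list[Candidate] = []
    counts: dict[str, int] = {}
    for cand in ranked:
        if len(head) < top_n and counts.get(cand.author, 0) < max_per_author:
            head.append(cand)
            counts[cand.author] = counts.get(cand.author, 0) + 1
        else:
            deferred.append(cand)
    return head + deferred
```

With `max_per_author=1`, a second title by the same author is moved below the top-5 window rather than dropped, so it can still surface when the caller asks for a longer list.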

Neptune Graph Schema

graph LR
    M1[Manga Node\nBerserk] -->|SAME_GENRE| M2[Manga Node\nVinland Saga]
    M1 -->|SAME_AUTHOR| M3[Manga Node\nKentaro Miura Works]
    M1 -->|CO_PURCHASED_WITH| M4[Manga Node\nVagabond]
    M1 -->|SAME_UNIVERSE| M5[Manga Node\nYoung Animal Series]
    A1[Author Node\nKentaro Miura] -->|CREATED| M1
    P1[Publisher Node\nHakusensha] -->|PUBLISHED| M1
    G1[Genre Node\nDark Fantasy] -->|TAGGED| M1
    G1 -->|TAGGED| M2
    U1[User Node\nAgg. readers] -->|ALSO_READ| M4

    style M1 fill:#8E44AD,color:#fff
    style A1 fill:#E67E22,color:#fff
    style G1 fill:#4A90D9,color:#fff

Edge Types and Weights

| Edge Type | Source | Weight Basis |
|---|---|---|
| SAME_GENRE | Catalog metadata | Jaccard similarity of genre tags |
| CO_PURCHASED_WITH | Order history | Purchase co-occurrence count (normalised) |
| ALSO_READ | Reading history | Reading co-occurrence count |
| SAME_AUTHOR | Catalog metadata | Binary (0 or 1) |
| SAME_PUBLISHER | Catalog metadata | Binary |
| SAME_UNIVERSE | Editorial metadata | Binary |
| SEQUEL_OF | Editorial metadata | Directional, sequence number |
| SPIN_OFF_OF | Editorial metadata | Directional |
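As a small worked example of the SAME_GENRE weight basis, the sketch below computes Jaccard similarity over two tag sets. The tag sets are made up for illustration and are not real catalog metadata; the function name `jaccard` is likewise just a label for this sketch.

```python
def jaccard(tags_a: set[str], tags_b: set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; 0.0 when both sets are empty."""
    union = tags_a | tags_b
    if not union:
        return 0.0
    return len(tags_a & tags_b) / len(union)


# Illustrative tag sets, not real catalog metadata.
berserk = {"dark fantasy", "action", "seinen"}
vinland = {"historical", "action", "seinen"}

# 2 shared tags out of 4 distinct tags
weight = jaccard(berserk, vinland)
```

Here `weight` is 0.5, which would clear the 0.3 edge-weight threshold used in the traversal query.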

Graph Traversal Query (Gremlin)

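// "Find manga directly linked to Berserk via genre, co-purchase, or co-read edges"
g.V().has('manga_id', 'BERSERK').as('src')
  .outE('SAME_GENRE', 'CO_PURCHASED_WITH', 'ALSO_READ')
  .has('weight', gt(0.3))  // minimum edge weight threshold
  .inV()
  .where(neq('src'))       // exclude the source vertex (neq compares against the 'src' step label)
  .dedup()
  .order().by(
    // On TinkerPop 3.5+, sum() emits nothing for a vertex with no such edges, so default to 0
    __.coalesce(__.inE('CO_PURCHASED_WITH').values('weight').sum(), __.constant(0)), desc
  )
  .limit(50)
  .project('manga_id', 'title', 'graph_score')
    .by('manga_id')
    .by('title_en')
    // graph_score sums every incoming CO_PURCHASED_WITH edge on the candidate, not only the one from Berserk
    .by(__.coalesce(__.inE('CO_PURCHASED_WITH').values('weight').sum(), __.constant(0)))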

Reading Order Resolution

flowchart LR
    TC2([get_series_order\nseries_id='JOJO']) --> NQ[Neptune Query\nSEQUEL_OF chain traversal]
    NQ --> SO[Sorted Parts\nby sequence_number]
    SO --> VM[Volume Map\nPart → Volume range]
    VM --> RM[Reading Mode\nchronological vs publication]
    RM --> TR2([ReadingOrder result\nwith spin-off branching])

    style TC2 fill:#4A90D9,color:#fff
    style TR2 fill:#27AE60,color:#fff

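JoJo's Bizarre Adventure has 9 parts. Each part after the first is linked to its predecessor by a SEQUEL_OF edge, so the series forms a single chain. Spin-offs such as "Thus Spoke Kishibe Rohan" are attached to Part 4 by a SPIN_OFF_OF edge and appear in the reading order with a recommended_after field.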
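A minimal sketch of how the sorted chain and the spin-off branching could be combined, assuming the Neptune results have already been flattened into two plain dictionaries. The `Entry` type, the `reading_order` function, and the input shapes are hypothetical; only the `recommended_after` field name comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    title: str
    recommended_after: Optional[str] = None  # set only for spin-offs


def reading_order(parts: dict[str, int], spin_offs: dict[str, str]) -> list[Entry]:
    """Build a reading order.

    parts:     title -> sequence_number (from the SEQUEL_OF chain)
    spin_offs: spin-off title -> parent part title (from SPIN_OFF_OF edges)
    """
    order: list[Entry] = []
    for title in sorted(parts, key=parts.get):
        order.append(Entry(title))
        # Place each spin-off directly after the part it branches from.
        for spin, parent in spin_offs.items():
            if parent == title:
                order.append(Entry(spin, recommended_after=title))
    return order


# Illustrative input covering three consecutive parts.
parts = {"Part 5": 5, "Part 3": 3, "Part 4": 4}
spin_offs = {"Thus Spoke Kishibe Rohan": "Part 4"}
result = reading_order(parts, spin_offs)
```

This covers only a single ordering; the chronological versus publication reading modes shown in the diagram would need a second sort key.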


Semantic Similarity Path

For queries where graph edges are sparse (new titles, indie manga), the MCP falls back to pure semantic similarity using the manga embedding stored in OpenSearch:

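from dataclasses import dataclass

@dataclass
class MangaScore:
    # Result record; field names are illustrative
    manga_id: str
    score: float

async def semantic_similar(manga_id: str, limit: int) -> list[MangaScore]:
    # `opensearch` is the service's async client wrapper, defined elsewhere
    # Fetch target manga's pre-computed embedding from OpenSearch
    target = await opensearch.get_document("manga-catalog", manga_id)
    target_emb = target["_source"]["embedding"]

    # kNN search using that embedding as the query vector
    results = await opensearch.knn_search(
        index="manga-catalog",
        vector=target_emb,
        k=limit + 10,   # over-fetch for deduplication
        exclude_ids=[manga_id],
    )
    # Returns up to limit + 10 hits; nothing here trims them back to `limit`
    return [MangaScore(r["_id"], r["_score"]) for r in results]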

Comparison Tool: Side-by-Side

flowchart LR
    TC3([compare_titles\nBerserk vs Vagabond]) --> FA[Fetch both manga\nfrom Catalog MCP]
    FA --> GS[Graph Similarity\nShared edges in Neptune]
    FA --> SS[Semantic Similarity\nCosine similarity between embeddings]
    FA --> RV[Review Scores\nfrom Review MCP]
    GS --> CB[Comparison Builder]
    SS --> CB
    RV --> CB
    CB --> TR3([ComparisonSummary\nStructured side-by-side])

    style TC3 fill:#4A90D9,color:#fff
    style TR3 fill:#27AE60,color:#fff
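The two similarity branches of the comparison flow can be sketched as below. This assumes embeddings are plain float lists and that each title's Neptune edges have been fetched as a set of `(edge_type, neighbour_id)` pairs; both representations, and the function names, are assumptions made for illustration.

```python
import math

Edge = tuple[str, str]  # (edge_type, neighbour_id), an assumed representation


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def shared_edges(edges_a: set[Edge], edges_b: set[Edge]) -> set[Edge]:
    """Edges that both titles have to the same neighbour with the same type."""
    return edges_a & edges_b


# Illustrative data only.
berserk_edges = {("CO_PURCHASED_WITH", "VINLAND_SAGA"), ("SAME_GENRE", "CLAYMORE")}
vagabond_edges = {("CO_PURCHASED_WITH", "VINLAND_SAGA"), ("SAME_GENRE", "KINGDOM")}
common = shared_edges(berserk_edges, vagabond_edges)
```

The Comparison Builder would then merge `common`, the cosine score, and the review scores into the ComparisonSummary.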

Graph Freshness: Keeping Neptune Current

sequenceDiagram
    participant OrdersRDS
    participant GLambda as Graph Update Lambda
    participant Neptune
    participant EventBridge

    Note over OrdersRDS,Neptune: Nightly batch — co-purchase graph refresh
    EventBridge->>GLambda: Trigger at 02:00 JST
    GLambda->>OrdersRDS: SELECT manga pairs co-purchased in last 30 days
    GLambda->>Neptune: Upsert CO_PURCHASED_WITH edges with new weights
    GLambda->>Neptune: Decay old edges (weight × 0.95 per day)

    Note over EventBridge,Neptune: Real-time — new catalog additions
    EventBridge->>GLambda: New manga published event
    GLambda->>Neptune: Add manga vertex + SAME_GENRE edges from metadata
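To make the decay step concrete: a factor of 0.95 per day compounds to 0.95^n after n days, which gives a half-life of roughly 13.5 days. The sketch below works through that arithmetic. It assumes an edge receives no fresh co-purchases during the period, and the function names are hypothetical.

```python
import math

DAILY_DECAY = 0.95  # from the refresh diagram: weight × 0.95 per day


def decayed_weight(weight: float, days: int) -> float:
    """Weight after `days` nightly decay passes with no refresh in between."""
    return weight * DAILY_DECAY ** days


def days_until_below(weight: float, threshold: float) -> int:
    """Smallest whole number of days after which the decayed weight is < threshold."""
    if weight < threshold:
        return 0
    return math.floor(math.log(threshold / weight) / math.log(DAILY_DECAY)) + 1
```

For example, an edge that starts at weight 1.0 and is never refreshed drops below the 0.3 traversal threshold after `days_until_below(1.0, 0.3)` days, which works out to 24.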

Failure Modes

| Failure | Symptom | Mitigation |
|---|---|---|
| Neptune query timeout | Similar titles slow on dense hub nodes (One Piece) | Max traversal depth = 2; top-50 limit per edge type; query timeout = 500ms with fallback to semantic-only |
| Graph staleness | New popular title not yet in co-purchase graph | Semantic-only path handles new titles; co-purchase edges populate within 24h |
| Circular traversal | Sequel chains with errors loop indefinitely | Gremlin dedup() + max 20-hop limit on all traversals |
| Recommendation monoculture | All recommendations from same publisher | Diversity filter: max 2 same-publisher titles in top-5 results |
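The timeout mitigation could look something like the following sketch, which wraps the graph path in `asyncio.wait_for` and falls back to the semantic path when the budget expires. The callables `graph_query` and `semantic_query` are hypothetical zero-argument coroutine functions, and the sketch simplifies the real flow, where both paths feed the fusion step.

```python
import asyncio

GRAPH_TIMEOUT_S = 0.5  # the 500 ms budget from the failure-mode entry above


async def similar_with_fallback(graph_query, semantic_query,
                                timeout_s: float = GRAPH_TIMEOUT_S):
    """Run the graph path under a timeout; use the semantic path if it expires.

    Returns (results, path_used) so callers can log which path answered.
    """
    try:
        results = await asyncio.wait_for(graph_query(), timeout=timeout_s)
        return results, "graph"
    except asyncio.TimeoutError:
        results = await semantic_query()
        return results, "semantic-only"
```

`asyncio.wait_for` cancels the pending graph coroutine on timeout, so a slow traversal on a hub node does not keep running in the background on the client side.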

Interview Grill

Q: Why Neptune instead of storing graph data in DynamoDB with adjacency lists? A: Multi-hop traversal is where Neptune wins. "Find all manga with >0.3 co-purchase weight that also share ≥2 genres with a specific title" is a 2-hop join in Gremlin. In DynamoDB the same query becomes a sequence of batch-gets with in-memory set intersection in application code. Neptune does this in a single declarative query with optimised graph execution plans.

Q: How do you handle the "cold graph" problem for a new manga with no purchase co-occurrence? A: New titles ≤30 days old fall through to the semantic similarity path exclusively. As they accumulate co-purchase history, Neptune edges form and the graph path starts contributing to the fusion score. The content-based start is intentional — semantic similarity is often more accurate than thin co-purchase data anyway.

Q: What prevents a "rich get richer" effect where popular titles always dominate similar-title results? A: Edge weights are normalised by the source manga's total edge weight sum — making it a relative, not absolute, score. A niche title with 10 very strong co-purchase links can outrank a popular title with 1000 weaker links.

Q: How accurate is the "same universe" detection? A: Universe membership is editorially curated, not algorithmic. The content team maintains a Neptune graph of universe nodes (Young Animal universe, Shonen Jump universe, etc.) linked to manga vertices. This is the only edge type that is not updated automatically, because it is too nuanced for a classifier.