Cross-Title Link MCP — Graph-Based "More Like This" Recommendations
Purpose
Answers relational queries: "If you liked Berserk, you'll like…", "Other works by this author", "Read the series in order", "Titles in the same universe". Uses a Neptune graph database for traversal-based discovery combined with OpenSearch for semantic similarity — a graph + RAG hybrid.
Exposed Tools
| Tool | Input | Output | Use Case |
|---|---|---|---|
| get_similar_titles | manga_id, limit, basis? | SimilarList | Semantic/graph similarity |
| get_author_other_works | author_name | AuthorBibliography | Same creator's catalog |
| get_series_order | series_id | ReadingOrder | Correct reading sequence |
| get_shared_universe | manga_id | UniverseMap | Cross-title same-universe |
| get_sequel_prequel | manga_id | SeriesChain | Sequels, prequels, spin-offs |
| compare_titles | manga_ids[] | ComparisonSummary | Side-by-side title comparison |
Graph + RAG Architecture
flowchart TD
TC([Tool Call: get_similar_titles\nmanga_id='BERSERK' basis='theme']) --> GP[Graph Path\nNeptune traversal]
TC --> SP[Semantic Path\nOpenSearch kNN]
GP --> NT[Neptune\nWHERE genre IN shared_genres\n2-hop author/publisher links]
NT --> GC[Graph Candidates\nwith relationship types]
SP --> EB[Embed manga synopsis\nTitan Embed v2]
EB --> OS[(OpenSearch\nCatalog kNN)]
OS --> SC[Semantic Candidates]
GC --> FM[Fusion Mixer\ngraph_score × semantic_score]
SC --> FM
FM --> RK[Cross-Encoder Rerank]
RK --> DE[Diversity Filter\nno 2 same author in top-5]
DE --> TR([Tool Result → Claude])
style TC fill:#4A90D9,color:#fff
style TR fill:#27AE60,color:#fff
style FM fill:#8E44AD,color:#fff
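The fusion and diversity stages in the diagram can be sketched as follows. This is a minimal illustration, not the service's implementation: the `Candidate` type, the `floor` value used when a title appears in only one candidate list, and the reading of "no 2 same author in top-5" as "each author at most once in the first five slots" are all assumptions. The cross-encoder rerank step is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical result record; field names are illustrative."""
    manga_id: str
    author: str
    score: float


def fuse(graph: dict[str, float], semantic: dict[str, float],
         floor: float = 0.05) -> dict[str, float]:
    """Multiply graph_score by semantic_score for every candidate title.

    `floor` (an assumed value) replaces the missing score when a title was
    found by only one path, so it is damped rather than zeroed out.
    """
    ids = graph.keys() | semantic.keys()
    return {i: graph.get(i, floor) * semantic.get(i, floor) for i in ids}


def diversify(ranked: list[Candidate], top_n: int = 5) -> list[Candidate]:
    """Keep at most one title per author in the first `top_n` slots.

    Skipped titles are appended after the head in their original order,
    so they can still appear if there are too few distinct authors.
    """
    head: list[Candidate] = []
    deferred: list[Candidate] = []
    seen_authors: set[str] = set()
    for cand in ranked:
        if len(head) < top_n and cand.author not in seen_authors:
            head.append(cand)
            seen_authors.add(cand.author)
        else:
            deferred.append(cand)
    return head + deferred
```

A multiplicative mixer rewards titles that both paths agree on; the floor is what keeps graph-only or semantic-only candidates in play at all.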
Neptune Graph Schema
graph LR
M1[Manga Node\nBerserk] -->|SAME_GENRE| M2[Manga Node\nVinland Saga]
M1 -->|SAME_AUTHOR| M3[Manga Node\nKentaro Miura Works]
M1 -->|CO_PURCHASED_WITH| M4[Manga Node\nVagabond]
M1 -->|SAME_UNIVERSE| M5[Manga Node\nYoung Animal Series]
A1[Author Node\nKentaro Miura] -->|CREATED| M1
P1[Publisher Node\nHakusensha] -->|PUBLISHED| M1
G1[Genre Node\nDark Fantasy] -->|TAGGED| M1
G1 -->|TAGGED| M2
U1[User Node\nAgg. readers] -->|ALSO_READ| M4
style M1 fill:#8E44AD,color:#fff
style A1 fill:#E67E22,color:#fff
style G1 fill:#4A90D9,color:#fff
Edge Types and Weights
| Edge Type | Source | Weight Basis |
|---|---|---|
| SAME_GENRE | Catalog metadata | Jaccard similarity of genre tags |
| CO_PURCHASED_WITH | Order history | Purchase co-occurrence count (normalised) |
| ALSO_READ | Reading history | Reading co-occurrence count |
| SAME_AUTHOR | Catalog metadata | Binary (0 or 1) |
| SAME_PUBLISHER | Catalog metadata | Binary |
| SAME_UNIVERSE | Editorial metadata | Binary |
| SEQUEL_OF | Editorial metadata | Directional, sequence number |
| SPIN_OFF_OF | Editorial metadata | Directional |
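As an illustration of the SAME_GENRE weight basis, Jaccard similarity over two genre-tag sets can be computed as below. The tag sets in the usage note are made up for the example, and treating two empty tag sets as similarity 0.0 is an assumption.

```python
def jaccard(tags_a: set[str], tags_b: set[str]) -> float:
    """Return |A ∩ B| / |A ∪ B| for two genre-tag sets.

    Two empty sets are scored 0.0 here (an assumption) so that titles
    without metadata never produce a SAME_GENRE edge.
    """
    union = tags_a | tags_b
    if not union:
        return 0.0
    return len(tags_a & tags_b) / len(union)
```

For example, `jaccard({"dark fantasy", "seinen", "action"}, {"seinen", "action", "historical"})` gives 2 shared tags out of 4 distinct tags, so 0.5.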
Graph Traversal Query (Gremlin)
// Find manga similar to Berserk via direct genre, co-purchase, and co-read links
g.V().has('manga_id', 'BERSERK').as('src')
  .outE('SAME_GENRE', 'CO_PURCHASED_WITH', 'ALSO_READ')
  .has('weight', gt(0.3))            // minimum edge weight threshold
  .inV()
  .where(neq('src'))                 // exclude the source vertex itself
  .dedup()
  .order().by(
    // coalesce() supplies 0 for titles with no co-purchase edges
    coalesce(__.inE('CO_PURCHASED_WITH').values('weight').sum(), constant(0)), desc
  )
  .limit(50)
  .project('manga_id', 'title', 'graph_score')
    .by('manga_id')
    .by('title_en')
    .by(coalesce(__.inE('CO_PURCHASED_WITH').values('weight').sum(), constant(0)))
Reading Order Resolution
flowchart LR
TC2([get_series_order\nseries_id='JOJO']) --> NQ[Neptune Query\nSEQUEL_OF chain traversal]
NQ --> SO[Sorted Parts\nby sequence_number]
SO --> VM[Volume Map\nPart → Volume range]
VM --> RM[Reading Mode\nchronological vs publication]
RM --> TR2([ReadingOrder result\nwith spin-off branching])
style TC2 fill:#4A90D9,color:#fff
style TR2 fill:#27AE60,color:#fff
JoJo's Bizarre Adventure has 9 parts, each linked to its predecessor by a SEQUEL_OF edge, so the parts form a single chain. Spin-offs like "Thus Spoke Kishibe Rohan" are connected to Part 4 by a SPIN_OFF_OF edge and are included in the reading order with a recommended_after field.
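A rough sketch of how the sorted parts and spin-off branches could be merged into one list is shown below. The `Entry` type and the shape of the two input mappings are assumptions made for illustration; only the `recommended_after` field name comes from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """One row of a hypothetical ReadingOrder result."""
    title: str
    recommended_after: Optional[str] = None  # set only for spin-offs


def build_reading_order(
    parts: dict[str, int],      # part title -> sequence_number
    spin_offs: dict[str, str],  # spin-off title -> parent part title
) -> list[Entry]:
    """List main parts by sequence number, each followed by its spin-offs."""
    order: list[Entry] = []
    for part in sorted(parts, key=parts.get):
        order.append(Entry(part))
        for title, parent in sorted(spin_offs.items()):
            if parent == part:
                order.append(Entry(title, recommended_after=part))
    return order


order = build_reading_order(
    parts={"Golden Wind": 5, "Stardust Crusaders": 3, "Diamond Is Unbreakable": 4},
    spin_offs={"Thus Spoke Kishibe Rohan": "Diamond Is Unbreakable"},
)
```

Here the spin-off lands directly after Part 4 ("Diamond Is Unbreakable") and before Part 5, with `recommended_after` pointing at its parent part.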
Semantic Similarity Path
For queries where graph edges are sparse (new titles, indie manga), the MCP falls back to pure semantic similarity using the manga embedding stored in OpenSearch:
# `opensearch` and `MangaScore` are assumed to be defined elsewhere in the service.
async def semantic_similar(manga_id: str, limit: int) -> list[MangaScore]:
    # Fetch the target manga's pre-computed embedding from OpenSearch
    target = await opensearch.get_document("manga-catalog", manga_id)
    target_emb = target["_source"]["embedding"]
    # kNN search using that embedding as the query vector
    results = await opensearch.knn_search(
        index="manga-catalog",
        vector=target_emb,
        k=limit + 10,  # over-fetch for deduplication
        exclude_ids=[manga_id],
    )
    return [MangaScore(r["_id"], r["_score"]) for r in results]
Comparison Tool: Side-by-Side
flowchart LR
TC3([compare_titles\nBerserk vs Vagabond]) --> FA[Fetch both manga\nfrom Catalog MCP]
FA --> GS[Graph Similarity\nShared edges in Neptune]
FA --> SS[Semantic Similarity\nCosine distance between embeddings]
FA --> RV[Review Scores\nfrom Review MCP]
GS --> CB[Comparison Builder]
SS --> CB
RV --> CB
CB --> TR3([ComparisonSummary\nStructured side-by-side])
style TC3 fill:#4A90D9,color:#fff
style TR3 fill:#27AE60,color:#fff
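The two similarity branches of the comparison flow might look like the sketch below. It assumes embeddings arrive as plain float lists and that "shared edges" means both titles link to the same neighbour through the same edge type; neither detail is confirmed by the design above.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embeddings (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Distance form used in the diagram: 1 - cosine similarity."""
    return 1.0 - cosine_similarity(a, b)


def shared_edges(
    edges_a: set[tuple[str, str]],
    edges_b: set[tuple[str, str]],
) -> set[tuple[str, str]]:
    """Edges are (edge_type, neighbour_id) pairs; return those both titles have."""
    return edges_a & edges_b
```

For instance, two titles that each carry `("TAGGED", "dark-fantasy")` share that edge, while a `SAME_AUTHOR` link held by only one of them is dropped from the intersection.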
Graph Freshness: Keeping Neptune Current
sequenceDiagram
participant OrdersRDS
participant GLambda as Graph Update Lambda
participant Neptune
participant EventBridge
Note over OrdersRDS,Neptune: Nightly batch — co-purchase graph refresh
EventBridge->>GLambda: Trigger at 02:00 JST
GLambda->>OrdersRDS: SELECT manga pairs co-purchased in last 30 days
GLambda->>Neptune: Upsert CO_PURCHASED_WITH edges with new weights
GLambda->>Neptune: Decay old edges (weight × 0.95 per day)
Note over EventBridge,Neptune: Real-time — new catalog additions
EventBridge->>GLambda: New manga published event
GLambda->>Neptune: Add manga vertex + SAME_GENRE edges from metadata
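The nightly upsert-then-decay step can be modelled in memory as below. This is a sketch under assumptions: edges are keyed by a pair of manga IDs, "old edges" is read as "edges with no fresh data in tonight's batch", and the helper names are hypothetical. The real Lambda would issue Neptune writes instead of returning a dict.

```python
import math

Edge = tuple[str, str]  # (manga_id_a, manga_id_b)


def nightly_refresh(existing: dict[Edge, float], fresh: dict[Edge, float],
                    decay: float = 0.95) -> dict[Edge, float]:
    """Decay edges that got no fresh data by one day, then upsert fresh weights."""
    updated = {pair: w * decay for pair, w in existing.items() if pair not in fresh}
    updated.update(fresh)
    return updated


def days_until_below(weight: float, threshold: float, decay: float = 0.95) -> int:
    """Smallest number of daily decays that takes `weight` strictly below `threshold`."""
    if weight < threshold:
        return 0
    return math.floor(math.log(threshold / weight) / math.log(decay)) + 1
```

With the 0.95 daily factor, `days_until_below(1.0, 0.3)` is 24, so an edge at full weight that stops being refreshed drops under the 0.3 traversal threshold in a little over three weeks.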
Failure Modes
| Failure | Symptom | Mitigation |
|---|---|---|
| Neptune query timeout | Similar titles slow on dense hub nodes (One Piece) | Max traversal depth = 2; top-50 limit per edge type; query timeout = 500ms with fallback to semantic-only |
| Graph staleness | New popular title not yet in co-purchase graph | Semantic-only path handles new titles; co-purchase edges populate within 24h |
| Circular traversal | Sequel chains with errors loop indefinitely | Gremlin dedup() + max 20-hop limit on all traversals |
| Recommendation monoculture | All recommendations from same publisher | Diversity filter: max 2 same-publisher titles in top-5 results |
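The timeout mitigation in the first row could be wired up roughly as follows, using `asyncio.wait_for` to bound the graph path at 500 ms. The function name, the callables passed in, and the returned tuple shape are assumptions for illustration only.

```python
import asyncio
from typing import Awaitable, Callable


async def similar_with_fallback(
    graph_path: Callable[[], Awaitable[list]],
    semantic_path: Callable[[], Awaitable[list]],
    timeout_s: float = 0.5,
) -> tuple[str, list, list]:
    """Run both paths concurrently; drop the graph path if it exceeds the timeout."""
    # Start the semantic path first so it runs while the graph query is pending.
    semantic_task = asyncio.create_task(semantic_path())
    try:
        graph = await asyncio.wait_for(graph_path(), timeout=timeout_s)
    except asyncio.TimeoutError:
        graph = None  # graph path timed out: fall back to semantic-only
    semantic = await semantic_task
    mode = "hybrid" if graph is not None else "semantic_only"
    return mode, (graph if graph is not None else []), semantic
```

Returning the mode alongside the results lets the caller log how often dense hub nodes force the semantic-only path.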
Interview Grill
Q: Why Neptune instead of storing graph data in DynamoDB with adjacency lists? A: Multi-hop traversal is where Neptune wins. "Find all manga with >0.3 co-purchase weight that also share ≥2 genres with a specific title" is a 2-hop traversal in Gremlin. In DynamoDB the same question becomes a sequence of batch-gets with in-memory set intersection in application code. Neptune answers it in a single declarative query with optimised graph execution plans.
Q: How do you handle the "cold graph" problem for a new manga with no purchase co-occurrence? A: New titles ≤30 days old fall through to the semantic similarity path exclusively. As they accumulate co-purchase history, Neptune edges form and the graph path starts contributing to the fusion score. The content-based start is intentional — semantic similarity is often more accurate than thin co-purchase data anyway.
Q: What prevents a "rich get richer" effect where popular titles always dominate similar-title results? A: Edge weights are normalised by the source manga's total edge weight sum — making it a relative, not absolute, score. A niche title with 10 very strong co-purchase links can outrank a popular title with 1000 weaker links.
Q: How accurate is the "same universe" detection? A: Universe membership is editorial-curated, not algorithmic. The content team maintains a Neptune graph of universe nodes (Young Animal universe, Shonen Jump universe, etc.) linked to manga vertices. This is the only edge type that doesn't have automated updating — it's too nuanced for a classifier.
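To make the relative scoring from the "rich get richer" answer concrete, here is a small sketch of normalising each edge by its source title's total edge weight. The function name, the nested-dict layout, and the titles and raw weights in the example are all hypothetical.

```python
def normalise_outgoing(
    edges: dict[str, dict[str, float]],
) -> dict[str, dict[str, float]]:
    """Divide every edge weight by the total weight leaving its source title."""
    result: dict[str, dict[str, float]] = {}
    for source, targets in edges.items():
        total = sum(targets.values())
        result[source] = {
            target: (weight / total if total else 0.0)
            for target, weight in targets.items()
        }
    return result


# A popular title with 1000 links of raw weight 5, one of them to TARGET,
# versus a niche title with 10 links of raw weight 3, one of them to TARGET.
popular = {f"T{i}": 5.0 for i in range(999)}
popular["TARGET"] = 5.0
niche = {f"N{i}": 3.0 for i in range(9)}
niche["TARGET"] = 3.0

shares = normalise_outgoing({"POPULAR": popular, "NICHE": niche})
```

On raw weights the popular title's link to TARGET is stronger (5 vs 3), but after normalisation the niche title's share is 0.1 against 0.001, so the niche title ranks higher.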