Order & Inventory MCP — Real-Time Orders and Stock
Purpose
Handles all transactional queries: order status, stock availability, shipping ETAs, and price lookups. Unlike purely RAG-based MCPs, this server uses a hybrid pattern — live DB reads for real-time data, with lightweight RAG over FAQ/policy docs for order-related support questions.
Exposed Tools
| Tool | Input | Output | Use Case |
|---|---|---|---|
get_order_status |
user_id, order_id? |
OrderStatusList |
"Where is my order?" |
check_stock |
manga_id, volume_number? |
StockInfo |
"Is vol 3 available?" |
get_price |
manga_id, volume_number? |
PriceInfo |
"How much is Berserk vol 1?" |
get_shipping_estimate |
manga_id[], delivery_address |
ShippingEstimate |
Delivery time + cost |
list_recent_orders |
user_id, limit |
OrderList |
Order history |
Hybrid Architecture: Live API + RAG
flowchart TD
TC([Tool Call]) --> RT{Data type?}
RT -->|Real-time\norder status / stock| LP[Live Path]
RT -->|Policy / FAQ\norder-related question| RP[RAG Path]
LP --> EC{ElastiCache\nCache hit?}
EC -->|HIT TTL=30s| CR([Cached Result])
EC -->|MISS| RDS[(RDS Aurora\norder_items · inventory)]
RDS --> WC[Write cache TTL=30s]
WC --> LR([Live Result])
RP --> EB[Embed Question\nTitan v2]
EB --> OS[(OpenSearch\nPolicy Doc Index)]
OS --> RK[Rerank]
RK --> PR([Policy Result])
style TC fill:#4A90D9,color:#fff
style CR fill:#27AE60,color:#fff
style LR fill:#27AE60,color:#fff
style PR fill:#27AE60,color:#fff
style RT fill:#8E44AD,color:#fff
Data Sources
RDS Aurora (Source of Truth)
-- Orders table
SELECT o.order_id, o.status, o.created_at,
oi.manga_id, oi.volume_number, oi.quantity, oi.unit_price,
s.carrier, s.tracking_number, s.estimated_delivery
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
LEFT JOIN shipments s ON o.order_id = s.order_id
WHERE o.user_id = :user_id
ORDER BY o.created_at DESC
LIMIT 10;
ElastiCache (Redis) — Cache Strategy
CACHE_TTL = {
"order_status": 30, # seconds — orders change infrequently once shipped
"stock_level": 10, # seconds — inventory is high-frequency
"price": 300, # seconds — prices stable within 5 min window
"shipping_eta": 120, # seconds — carrier estimates refresh slowly
}
OpenSearch — Policy Doc Index (RAG component)
- Documents: return policies, cancellation policies, pre-order terms, digital vs physical FAQs
- Chunked at 512 tokens with 64-token overlap
- Indexed with both BM25 and dense vectors
Stock Check Flow
sequenceDiagram
participant Claude
participant OrderMCP
participant Redis
participant Aurora
participant Kinesis
Claude->>OrderMCP: check_stock(manga_id="BER42", volume=42)
OrderMCP->>Redis: GET stock:BER42:42
alt Cache HIT
Redis-->>OrderMCP: {quantity: 14, warehouse: "TKY-1"}
OrderMCP-->>Claude: In stock · 14 units · ships from Tokyo
else Cache MISS
OrderMCP->>Aurora: SELECT quantity FROM inventory WHERE ...
Aurora-->>OrderMCP: {quantity: 14}
OrderMCP->>Redis: SET stock:BER42:42 TTL=10s
OrderMCP->>Kinesis: Publish stock_check_event (analytics)
OrderMCP-->>Claude: In stock · 14 units · ships from Tokyo
end
Order Status State Machine
stateDiagram-v2
[*] --> Placed
Placed --> PaymentPending
PaymentPending --> Confirmed : Payment success
PaymentPending --> Cancelled : Payment failed / timeout
Confirmed --> Packed
Packed --> Shipped
Shipped --> OutForDelivery
OutForDelivery --> Delivered
Shipped --> DeliveryFailed
DeliveryFailed --> ReturnInitiated
Delivered --> [*]
Cancelled --> [*]
ReturnInitiated --> Refunded
Refunded --> [*]
The MCP maps each state to a user-friendly message and estimated next step.
Price Integrity: Preventing Hallucination
A critical risk: Claude could confabulate a price if the MCP returns nothing. Mitigation — the MCP always returns a structured price object, never empty:
@app.tool()
async def get_price(manga_id: str, volume_number: int | None = None) -> dict:
"""
Returns current retail price and discount status for a manga volume.
NEVER infer or estimate prices — always return from this tool.
"""
price = await aurora.fetch_price(manga_id, volume_number)
if price is None:
return {
"status": "not_found",
"message": f"Price unavailable for manga_id={manga_id}. Do not estimate.",
"manga_id": manga_id,
}
return {
"status": "found",
"price_jpy": price.amount,
"currency": "JPY",
"discount_active": price.discount_pct > 0,
"discount_pct": price.discount_pct,
"valid_until": price.valid_until.isoformat(),
}
The "Do not estimate." instruction in the not_found message is deliberate — it surfaces in the tool result that Claude reads, reinforcing the no-hallucination constraint for prices.
RAG Path: Order-Related Policy Lookup
When a question is about policy (returns, pre-orders, cancellations), the MCP uses RAG:
flowchart LR
Q([e.g. 'Can I cancel a\npre-order?']) --> EB[Embed\nTitan v2]
EB --> OS[(OpenSearch\nPolicy Docs)]
OS --> HY[Hybrid Retrieval\nBM25 + kNN]
HY --> RK[Rerank]
RK --> CH[Context chunk\nwith policy text + version]
CH --> TR([Tool Result\nPolicy answer + source URL])
style Q fill:#4A90D9,color:#fff
style TR fill:#27AE60,color:#fff
Policy documents are versioned and stored in S3. The OpenSearch index carries a doc_version field. Stale chunks (version mismatch) are excluded from retrieval.
Failure Modes & Mitigations
| Failure | Symptom | Mitigation |
|---|---|---|
| RDS connection pool exhaustion | Timeout on order_status | Read replica for chatbot queries; pool size = 50 |
| Cache stampede on stock | All users query same hot item post-announcement | Cache-aside with jitter + probabilistic early expiry |
| Price data stale after promotion starts | Wrong price shown | Promotion start triggers cache invalidation via EventBridge |
| Tracking API timeout (carrier) | Shipping ETA unavailable | Return last-known ETA with "freshness": "stale" flag |
| Policy doc version mismatch | Outdated return policy in context | Versioned index alias; old chunks auto-excluded after 24h |
Interview Grill
Q: Why cache stock at only 10-second TTL? A: Inventory is a shared mutable resource. At 10M users, a popular volume going out of stock could trigger thousands of simultaneous checks. 10s TTL means at worst we show stock counts from 10s ago — acceptable for a chatbot (not a checkout system). At checkout, we bypass cache entirely.
Q: How do you prevent the LLM from hallucinating order details?
A: Tool schema is strictly typed. The LLM is instructed: "Only report order information returned by get_order_status. Never infer." The tool result also includes "source": "aurora_live" so Claude knows it's authoritative. Tool descriptions use the word "NEVER infer."
Q: What if a user asks for an order that belongs to a different user?
A: The user_id in the tool call is extracted from the JWT token, not trusted from the user's message. The MCP ignores any user_id in the query string and always uses the authenticated identity from the API Gateway authoriser context.
Q: Why not use DynamoDB instead of RDS for orders? A: Orders require multi-item ACID transactions (order + payment + inventory decrement). RDS Aurora Multi-AZ gives us that. DynamoDB transactions exist but are limited to 25 items and require careful schema design. The read replica satisfies chatbot read latency without competing with write-heavy order processing.