LOCAL PREVIEW View on GitHub

Order & Inventory MCP — Real-Time Orders and Stock

Purpose

Handles all transactional queries: order status, stock availability, shipping ETAs, and price lookups. Unlike purely RAG-based MCPs, this server uses a hybrid pattern — live DB reads for real-time data, with lightweight RAG over FAQ/policy docs for order-related support questions.


Exposed Tools

Tool Input Output Use Case
get_order_status user_id, order_id? OrderStatusList "Where is my order?"
check_stock manga_id, volume_number? StockInfo "Is vol 3 available?"
get_price manga_id, volume_number? PriceInfo "How much is Berserk vol 1?"
get_shipping_estimate manga_id[], delivery_address ShippingEstimate Delivery time + cost
list_recent_orders user_id, limit OrderList Order history

Hybrid Architecture: Live API + RAG

flowchart TD
    TC([Tool Call]) --> RT{Data type?}

    RT -->|Real-time\norder status / stock| LP[Live Path]
    RT -->|Policy / FAQ\norder-related question| RP[RAG Path]

    LP --> EC{ElastiCache\nCache hit?}
    EC -->|HIT TTL=30s| CR([Cached Result])
    EC -->|MISS| RDS[(RDS Aurora\norder_items · inventory)]
    RDS --> WC[Write cache TTL=30s]
    WC --> LR([Live Result])

    RP --> EB[Embed Question\nTitan v2]
    EB --> OS[(OpenSearch\nPolicy Doc Index)]
    OS --> RK[Rerank]
    RK --> PR([Policy Result])

    style TC fill:#4A90D9,color:#fff
    style CR fill:#27AE60,color:#fff
    style LR fill:#27AE60,color:#fff
    style PR fill:#27AE60,color:#fff
    style RT fill:#8E44AD,color:#fff

Data Sources

RDS Aurora (Source of Truth)

-- Orders table
SELECT o.order_id, o.status, o.created_at,
       oi.manga_id, oi.volume_number, oi.quantity, oi.unit_price,
       s.carrier, s.tracking_number, s.estimated_delivery
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
LEFT JOIN shipments s ON o.order_id = s.order_id
WHERE o.user_id = :user_id
ORDER BY o.created_at DESC
LIMIT 10;

ElastiCache (Redis) — Cache Strategy

CACHE_TTL = {
    "order_status":  30,   # seconds — orders change infrequently once shipped
    "stock_level":   10,   # seconds — inventory is high-frequency
    "price":         300,  # seconds — prices stable within 5 min window
    "shipping_eta":  120,  # seconds — carrier estimates refresh slowly
}

OpenSearch — Policy Doc Index (RAG component)

  • Documents: return policies, cancellation policies, pre-order terms, digital vs physical FAQs
  • Chunked at 512 tokens with 64-token overlap
  • Indexed with both BM25 and dense vectors

Stock Check Flow

sequenceDiagram
    participant Claude
    participant OrderMCP
    participant Redis
    participant Aurora
    participant Kinesis

    Claude->>OrderMCP: check_stock(manga_id="BER42", volume=42)
    OrderMCP->>Redis: GET stock:BER42:42
    alt Cache HIT
        Redis-->>OrderMCP: {quantity: 14, warehouse: "TKY-1"}
        OrderMCP-->>Claude: In stock · 14 units · ships from Tokyo
    else Cache MISS
        OrderMCP->>Aurora: SELECT quantity FROM inventory WHERE ...
        Aurora-->>OrderMCP: {quantity: 14}
        OrderMCP->>Redis: SET stock:BER42:42 TTL=10s
        OrderMCP->>Kinesis: Publish stock_check_event (analytics)
        OrderMCP-->>Claude: In stock · 14 units · ships from Tokyo
    end

Order Status State Machine

stateDiagram-v2
    [*] --> Placed
    Placed --> PaymentPending
    PaymentPending --> Confirmed : Payment success
    PaymentPending --> Cancelled : Payment failed / timeout
    Confirmed --> Packed
    Packed --> Shipped
    Shipped --> OutForDelivery
    OutForDelivery --> Delivered
    Shipped --> DeliveryFailed
    DeliveryFailed --> ReturnInitiated
    Delivered --> [*]
    Cancelled --> [*]
    ReturnInitiated --> Refunded
    Refunded --> [*]

The MCP maps each state to a user-friendly message and estimated next step.


Price Integrity: Preventing Hallucination

A critical risk: Claude could confabulate a price if the MCP returns nothing. Mitigation — the MCP always returns a structured price object, never empty:

@app.tool()
async def get_price(manga_id: str, volume_number: int | None = None) -> dict:
    """
    Returns current retail price and discount status for a manga volume.
    NEVER infer or estimate prices — always return from this tool.
    """
    price = await aurora.fetch_price(manga_id, volume_number)
    if price is None:
        return {
            "status": "not_found",
            "message": f"Price unavailable for manga_id={manga_id}. Do not estimate.",
            "manga_id": manga_id,
        }
    return {
        "status": "found",
        "price_jpy": price.amount,
        "currency": "JPY",
        "discount_active": price.discount_pct > 0,
        "discount_pct": price.discount_pct,
        "valid_until": price.valid_until.isoformat(),
    }

The "Do not estimate." instruction in the not_found message is deliberate — it surfaces in the tool result that Claude reads, reinforcing the no-hallucination constraint for prices.


When a question is about policy (returns, pre-orders, cancellations), the MCP uses RAG:

flowchart LR
    Q([e.g. 'Can I cancel a\npre-order?']) --> EB[Embed\nTitan v2]
    EB --> OS[(OpenSearch\nPolicy Docs)]
    OS --> HY[Hybrid Retrieval\nBM25 + kNN]
    HY --> RK[Rerank]
    RK --> CH[Context chunk\nwith policy text + version]
    CH --> TR([Tool Result\nPolicy answer + source URL])

    style Q fill:#4A90D9,color:#fff
    style TR fill:#27AE60,color:#fff

Policy documents are versioned and stored in S3. The OpenSearch index carries a doc_version field. Stale chunks (version mismatch) are excluded from retrieval.


Failure Modes & Mitigations

Failure Symptom Mitigation
RDS connection pool exhaustion Timeout on order_status Read replica for chatbot queries; pool size = 50
Cache stampede on stock All users query same hot item post-announcement Cache-aside with jitter + probabilistic early expiry
Price data stale after promotion starts Wrong price shown Promotion start triggers cache invalidation via EventBridge
Tracking API timeout (carrier) Shipping ETA unavailable Return last-known ETA with "freshness": "stale" flag
Policy doc version mismatch Outdated return policy in context Versioned index alias; old chunks auto-excluded after 24h

Interview Grill

Q: Why cache stock at only 10-second TTL? A: Inventory is a shared mutable resource. At 10M users, a popular volume going out of stock could trigger thousands of simultaneous checks. 10s TTL means at worst we show stock counts from 10s ago — acceptable for a chatbot (not a checkout system). At checkout, we bypass cache entirely.

Q: How do you prevent the LLM from hallucinating order details? A: Tool schema is strictly typed. The LLM is instructed: "Only report order information returned by get_order_status. Never infer." The tool result also includes "source": "aurora_live" so Claude knows it's authoritative. Tool descriptions use the word "NEVER infer."

Q: What if a user asks for an order that belongs to a different user? A: The user_id in the tool call is extracted from the JWT token, not trusted from the user's message. The MCP ignores any user_id in the query string and always uses the authenticated identity from the API Gateway authoriser context.

Q: Why not use DynamoDB instead of RDS for orders? A: Orders require multi-item ACID transactions (order + payment + inventory decrement). RDS Aurora Multi-AZ gives us that. DynamoDB transactions exist but are limited to 25 items and require careful schema design. The read replica satisfies chatbot read latency without competing with write-heavy order processing.