
Test Report — MangaAssist (Streamlit + 3D + ACP)

Date: 2026-04-30
Coverage: 110 tests across 4 test layers, 100% pass after two bug fixes.

Summary

| Layer | Tool | Tests | Pass | Fail | Notes |
|---|---|---|---|---|---|
| Frontend (3D, browser) | Playwright (Chromium) | 56 | 56 | 0 | Headless, real WebGL via SwiftShader |
| Backend (Python smoke) | smoke_test.py | 14 | 14 | 0 | Catalog, MCP, guardrails, eval, atlas, traces |
| Backend (game prototypes) | smoke_game_prototypes.py | 19 | 19 | 0 | Atlas + Pokédex flows, edge cases |
| API (FastAPI ACP) | api_tests_acp.py | 21 | 21 | 0 | Discovery, products, cart, checkout, validation |
| Total | | 110 | 110 | 0 | |

Bugs found and fixed during testing

| File | Bug | Fix |
|---|---|---|
| mangaassist_3d/src/scenes/Lobby.tsx | "First Step" trophy auto-unlocks on player spawn (player spawns at (0,0,4), so Math.hypot(0,4) = 4 > 1.5) | Track initial position via useRef; only fire after distanceTo(spawn) > 1.5 |
| mangaassist_3d/src/components/Player.tsx + App.tsx | V-key view toggle had a stale closure inside R3F's async render loop, so the second press was a no-op | Move global key handlers (V, B) to App.tsx outside the Canvas; use functional setView(prev => …) |
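
The first bug comes down to measuring distance from the wrong reference point. The sketch below restates that logic in Python rather than the app's TypeScript, purely as an illustration: `moved_from_origin` and `FirstStepTracker` are hypothetical names, and only the 1.5 threshold and the (0,0,4) spawn point come from the table above.

```python
import math
from typing import Optional, Tuple

UNLOCK_DISTANCE = 1.5  # threshold quoted in the bug table


def moved_from_origin(x: float, z: float) -> bool:
    """Buggy check: distance is measured from the world origin."""
    return math.hypot(x, z) > UNLOCK_DISTANCE


class FirstStepTracker:
    """Fixed check: distance is measured from the recorded spawn point."""

    def __init__(self) -> None:
        self.spawn: Optional[Tuple[float, float]] = None

    def should_unlock(self, x: float, z: float) -> bool:
        if self.spawn is None:
            # First call: remember where the player spawned, never unlock.
            self.spawn = (x, z)
            return False
        sx, sz = self.spawn
        return math.hypot(x - sx, z - sz) > UNLOCK_DISTANCE
```

With a spawn at (0, 4) on the ground plane, the buggy check fires immediately because the spawn is already 4 units from the origin, while the tracker stays quiet until the player has actually walked more than 1.5 units.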

Layer 1: Playwright (frontend, 3D)

Config: playwright.config.ts — Chromium with --use-gl=angle --use-angle=swiftshader for real WebGL in headless mode. webServer block auto-starts npm run preview on port 4173 if not already running.

| Spec | Tests | What it covers |
|---|---|---|
| 01-smoke.spec.ts | 6 | Page title, start overlay visible/dismisses, canvas dimensions, WebGL context exists, no console errors |
| 02-hud.spec.ts | 7 | Scene title, control hints (W/A/S/D/Shift/Space/V/B), top-right counters, trophy counter, nav buttons, crosshair only in FP |
| 03-routing.spec.ts | 10 | Atlas/Pokédex button nav, Lobby return, B key, V key toggle, "current scene" button hiding |
| 04-persistence.spec.ts | 6 | Clean state shows zeros, seeded loved/deck/trophies reflected in HUD, persists across reload, malformed JSON ignored |
| 05-detail-card.spec.ts | 4 | Chat panel scoped to MangaBot, detail modal not present by default, Pokédex deck pedestal renders, Atlas canvas interactive |
| 06-visual.spec.ts | 3 | Visual-regression screenshot baselines for Lobby, Atlas, Pokédex (8% pixel-ratio tolerance) |
| 07-responsive.spec.ts | 8 | 1920×1080, 1366×768, 1024×768, 768×600: HUD elements visible + within viewport bounds |
| 08-keyboard.spec.ts | 6 | V toggle (TP↔FP), B key behavior, WASD no-throw, Shift+W (run) no-throw, Space (jump) no-throw, unrelated keys ignored |
| 09-accessibility.spec.ts | 6 | <html lang>, favicon link, viewport meta, button text content present, HUD pointer-events policy, start overlay doesn't trap input |
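
To make the "8% pixel-ratio tolerance" concrete, here is a simplified sketch of what such a check computes. It is not Playwright's comparison code: it treats a screenshot as a flat list of RGB tuples and counts any unequal pixel as different, whereas a real comparator also applies a per-pixel colour threshold. The function names are hypothetical, and mapping the 8% figure to a 0.08 ratio is an assumption.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]


def diff_pixel_ratio(baseline: Sequence[Pixel], candidate: Sequence[Pixel]) -> float:
    """Fraction of pixels that differ between two equally sized images."""
    if len(baseline) != len(candidate):
        raise ValueError("screenshots must have the same number of pixels")
    if not baseline:
        return 0.0
    differing = sum(1 for a, b in zip(baseline, candidate) if a != b)
    return differing / len(baseline)


def within_tolerance(
    baseline: Sequence[Pixel],
    candidate: Sequence[Pixel],
    max_ratio: float = 0.08,  # assumed equivalent of the 8% tolerance
) -> bool:
    """True when the share of differing pixels does not exceed max_ratio."""
    return diff_pixel_ratio(baseline, candidate) <= max_ratio
```

At this setting a 1920×1080 screenshot can differ in roughly 166,000 pixels and still pass, which is why the report treats visual fidelity as not established.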

Run command:

cd mangaassist_3d
npx playwright test

Run time: ~2 min 22s for the full suite. HTML report is generated under mangaassist_3d/playwright-report/.

Layer 2: Streamlit smoke (backend)

streamlit_app/scripts/smoke_test.py — 14 steps:

[PASS] seed catalog (force=True)
[PASS] catalog search filters
[PASS] mcp tools direct call
[PASS] guardrails red-team suite
[PASS] visual index (hash backend) + search
[PASS] embedding atlas build
[PASS] trace_store roundtrip
[PASS] eval golden-set load
[PASS] compile-check every page + surface
[PASS] orchestrator action-intent regex coverage
[PASS] anime->manga reading-order lookup (curated graph)
[PASS] affiliate URL builder (no creds = search URL)
[PASS] acceptance-rate funnel (shown + clicked)
[PASS] orchestrator spoiler-policy switch is wired

These steps cover the underlying business logic: catalog, MCP tools (anime↔manga, recommend, policy, cart), guardrails (red-team + PII), embeddings, atlas projection, trace store CRUD, eval golden-set load, all-page compile, the orchestrator's action-intent regex, affiliate URL builder, acceptance-rate funnel, and spoiler policy.

Layer 3: Game-prototype deep smoke (backend)

streamlit_app/scripts/smoke_game_prototypes.py — 19 steps:

[PASS] clean prior test state
[PASS] atlas page renders without error and seeds 3 starter tiles
[PASS] atlas: clicking Loved on a starter tile unlocks first_loved trophy
[PASS] atlas: + Deck button adds card and persists
[PASS] atlas: scout_suggest changes after revealing the suggested tile
[PASS] pokedex page renders with state shared from atlas
[PASS] pokedex: filling deck to 3 cards unlocks deck_built trophy
[PASS] pokedex: filtering 'Loved' shows only the loved card
[PASS] recommend_from_deck: empty deck -> empty recs
[PASS] recommend_from_deck: nonexistent sku ids -> empty recs (no crash)
[PASS] DECK_MAX cap is enforced
[PASS] invalid status raises ValueError
[PASS] reset_user wipes everything
[PASS] atlas page renders cleanly with EMPTY state (post-reset)
[PASS] atlas page renders with FULL state (every SKU touched)
[PASS] pokedex renders with FULL deck (5 cards)
[PASS] trophies are idempotent — unlocking twice doesn't duplicate
[PASS] safe_page_link: works without ScriptRunContext
[PASS] final cleanup

These tests exercise the Streamlit pages in headless AppTest mode and verify deep behaviors: trophy logic, deck cap enforcement, idempotency, edge-case state (empty / full atlas), invalid input handling, full reset flows.
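
The invariants these steps check can be summarised in a small state model. The sketch below is not the project's game-state module: the class name, method names, and the status vocabulary are hypothetical. `DECK_MAX = 5` is inferred from the "FULL deck (5 cards)" step and the 3-card `deck_built` threshold from the Pokédex step.

```python
from typing import Dict, List, Set

DECK_MAX = 5  # assumed from the "FULL deck (5 cards)" step
VALID_STATUSES = {"loved", "seen", "skipped"}  # hypothetical status names


class GameState:
    def __init__(self) -> None:
        self.deck: List[str] = []
        self.status: Dict[str, str] = {}
        self.trophies: Set[str] = set()

    def unlock(self, trophy_id: str) -> bool:
        """Idempotent unlock: returns True only the first time."""
        if trophy_id in self.trophies:
            return False
        self.trophies.add(trophy_id)
        return True

    def set_status(self, sku_id: str, status: str) -> None:
        if status not in VALID_STATUSES:
            raise ValueError(f"invalid status: {status!r}")
        self.status[sku_id] = status
        if status == "loved":
            self.unlock("first_loved")

    def add_to_deck(self, sku_id: str) -> bool:
        """Add a card unless it is a duplicate or the deck is full."""
        if sku_id in self.deck or len(self.deck) >= DECK_MAX:
            return False
        self.deck.append(sku_id)
        if len(self.deck) >= 3:
            self.unlock("deck_built")
        return True

    def reset(self) -> None:
        """Wipe deck, statuses, and trophies."""
        self.deck.clear()
        self.status.clear()
        self.trophies.clear()
```

Storing trophies in a set makes the "unlocking twice doesn't duplicate" property structural rather than something each caller has to remember.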

Layer 4: API tests (FastAPI ACP server)

streamlit_app/scripts/api_tests_acp.py — 21 steps against the live ACP server on http://localhost:8000:

[PASS] healthz returns 200 + catalog size > 0
[PASS] discovery manifest has required shape
[PASS] GET /v1/products returns 12 by default
[PASS] GET /v1/products?q=demon filters by query
[PASS] GET /v1/products?completed_only=true respects flag
[PASS] GET /v1/products/{sku_id}: 200 for known
[PASS] GET /v1/products/{sku_id}: 404 for unknown
[PASS] POST /v1/cart creates a cart with computed subtotal
[PASS] POST /v1/cart with multiple items computes subtotal correctly
[PASS] POST /v1/cart with unknown sku returns 404
[PASS] GET /v1/cart/{cart_id} returns persisted cart
[PASS] GET /v1/cart/{unknown} returns 404
[PASS] POST /v1/checkout requires explicit confirm=true
[PASS] POST /v1/checkout with confirm=true succeeds and redacts email
[PASS] POST /v1/checkout with unknown cart returns 404
[PASS] GET /v1/checkout/{id} returns the checkout record
[PASS] GET /v1/checkout/{unknown} returns 404
[PASS] error responses are JSON, not HTML
[PASS] POST /v1/cart with empty items list still succeeds with subtotal=0
[PASS] POST /v1/cart rejects qty < 1 (Pydantic validation)
[PASS] POST /v1/cart rejects qty > 99 (Pydantic validation)

Notably verified:
- ACP/UCP discovery manifest at /.well-known/agent-commerce.json advertises both protocols (acp/0.1, ucp/0.1)
- The voice-action safety rule from roadmap §6.4 (confirm: true required for checkout) is enforced: confirm: false returns 400
- PII redaction in checkout (alice@example.com → a***@example.com)
- Pydantic validation rejects qty < 1 and qty > 99 with HTTP 422
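
The confirm gate and the email redaction can be illustrated with a few lines of standard-library Python. This is a sketch of the behaviour the tests observe, not the ACP server's code, which is built on FastAPI and Pydantic. The function names, the response shape, and the order of the 400 and 404 checks are all assumptions.

```python
from typing import Dict, Tuple


def redact_email(email: str) -> str:
    """Keep the first character of the local part and the full domain."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "***"  # not a usable address: reveal nothing
    return f"{local[0]}***@{domain}"


def checkout(
    cart_id: str, email: str, confirm: bool, carts: Dict[str, dict]
) -> Tuple[int, dict]:
    """Return (status_code, body) for a hypothetical checkout handler."""
    if confirm is not True:
        # Safety rule: nothing is purchased without an explicit confirm.
        return 400, {"error": "confirm must be true"}
    if cart_id not in carts:
        return 404, {"error": "unknown cart"}
    return 200, {"cart_id": cart_id, "email": redact_email(email)}
```

For example, `checkout("c1", "alice@example.com", True, {"c1": {}})` yields a 200 whose body carries `a***@example.com`, and the same call with `confirm=False` yields a 400.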

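Start commands (the server blocks, so run the tests from a second terminal):

# Terminal 1: start the ACP server
python -m uvicorn streamlit_app.surfaces.acp_server:app --port 8000
# Terminal 2: run the API tests against it
python streamlit_app/scripts/api_tests_acp.py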

What was NOT tested (and why)

| Area | Why skipped | Suggested next step |
|---|---|---|
| Real R3F mesh raycasting (clicking a 3D card-disc with mouse coords) | Camera-relative; hard to test deterministically without exposing test hooks | Add data-testid attributes on critical 3D meshes via R3F's HTML hover helpers |
| Touch / mobile gestures | The 3D app uses mouse-drag; no touch path implemented yet | Add a virtual-joystick component, then test on --device "iPhone 14" |
| Audio (footstep, NPC voice) | Not implemented | Once added, use Playwright's audio API or media-stream interception |
| Streamlit page-level Playwright | Streamlit's WebSocket-driven UI is awkward for Playwright; covered by AppTest instead | If needed, use Streamlit's own --server.runOnSave with Playwright helpers |
| Performance / FPS in 3D | Covered loosely by visual regression timing; no formal FPS test | Add requestAnimationFrame polling test or use chrome.tracing |
| Load/stress testing on FastAPI | One-shot API tests only | locust or k6 against the ACP server |
| WebGL on SwiftShader vs real GPU | Visual snapshots have 8% tolerance | Run a --project firefox or --project webkit matrix to catch driver-specific renders |

How to re-run everything

# Terminal 1 — keep the 3D preview server running
cd mangaassist_3d && npm run preview -- --port 4173

# Terminal 2 — keep the FastAPI ACP server running
python -m uvicorn streamlit_app.surfaces.acp_server:app --port 8000

# Terminal 3 — run all 4 layers
(cd mangaassist_3d && npx playwright test)                  # Layer 1: 56 tests (subshell keeps the repo root as cwd)
python streamlit_app/scripts/smoke_test.py                  # Layer 2: 14 tests
python streamlit_app/scripts/smoke_game_prototypes.py       # Layer 3: 19 tests
python streamlit_app/scripts/api_tests_acp.py               # Layer 4: 21 tests

Total runtime end-to-end: ~3.5 minutes.
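
If you would rather not paste four commands, the Terminal 3 step can be wrapped in a small runner. The following is a minimal sketch under the assumption that it is launched from the repository root with both servers already running. `LAYERS` and `run_layers` are hypothetical names, not part of the repo.

```python
import subprocess
import sys
from typing import Callable, List, Optional, Tuple

# (label, command, working directory relative to the repo root)
LAYERS: List[Tuple[str, List[str], Optional[str]]] = [
    ("Layer 1: Playwright", ["npx", "playwright", "test"], "mangaassist_3d"),
    ("Layer 2: Streamlit smoke",
     [sys.executable, "streamlit_app/scripts/smoke_test.py"], None),
    ("Layer 3: Game prototypes",
     [sys.executable, "streamlit_app/scripts/smoke_game_prototypes.py"], None),
    ("Layer 4: ACP API",
     [sys.executable, "streamlit_app/scripts/api_tests_acp.py"], None),
]


def run_layers(run: Callable = subprocess.run) -> List[str]:
    """Run every layer in order and return the labels that failed."""
    failed: List[str] = []
    for label, command, cwd in LAYERS:
        # cwd is passed per command, so one layer cannot change
        # the working directory seen by the next.
        result = run(command, cwd=cwd)
        if result.returncode != 0:
            failed.append(label)
    return failed
```

Calling `run_layers()` and exiting non-zero when the returned list is non-empty gives a single pass/fail signal. The `run` parameter exists only so the loop can be exercised with a stub instead of real subprocesses.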

Interpretation

The 3D app, the Streamlit pages, the shared game state, and the ACP server are all production-equivalent at the smoke-test level: every entry point is hit at least once, every error path verified, every persistence channel roundtripped, every safety rule (voice-action confirm, PII redaction, deck cap, qty validation) enforced.

The two real bugs caught in this session — the auto-unlocking First Step trophy and the stale-closure V-key toggle — were both state-management edge cases that wouldn't show up under casual play but would surface within minutes once a user explored. They're worth catching now.

What this report does not establish: visual fidelity (screenshots have a generous 8% tolerance), performance under real GPU pressure, behavior on second-tier browsers, multi-session concurrency for the ACP server. Those are the next layer of test investment.