
2. PII Protection & Data Privacy

MangaAssist sits in an awkward part of the risk surface: it is both a shopping assistant and a conversational system. Users voluntarily paste names, phone numbers, addresses, order IDs, gift card codes, and account details into natural-language chat. At the same time, the system itself can generate or echo sensitive data in responses. That means privacy cannot be treated as a storage-only problem or a prompt-only problem. It has to be enforced across ingestion, orchestration, model prompting, response filtering, logging, analytics, retention, and deletion.

This document goes deeper than the base privacy narrative and answers five practical engineering questions:

  1. Where can PII enter the system and where can it leak?
  2. How is PII detected with enough recall without breaking manga-domain queries?
  3. How is the confidence score calculated and used for routing?
  4. How do the storage and deletion flows prove privacy controls beyond the hot path?
  5. What follow-up questions should you expect in a design review or interview?

Why This Matters for MangaAssist

MangaAssist processes sensitive data across nearly every user journey:

| Data Type | Where It Appears | Why It Matters |
|---|---|---|
| Customer name | Account help, order lookup, shipping status | Identity exposure, unwanted profiling |
| Email address | Order support, account recovery questions | Phishing, spam targeting |
| Shipping address | Delivery estimate, returns, account help | Physical safety, direct privacy risk |
| Phone number | Tracking and customer support | SIM-swapping and unwanted contact |
| Amazon customer ID | Session context, account support | Account takeover vector |
| Order history | Recommendations, return support | Behavioral profiling |
| Payment references | Last-4 digits, gift card references | Fraud enablement |
| Browsing behavior | Personalization and ranking | Preference profiling |

The hard part is not only detection. The hard part is that the same string can be either sensitive or harmless depending on context:

  • Gojo Satoru can be a fictional character or a real person name
  • 123-4567 can be a postal fragment, phone fragment, or noise
  • B0XXXXXXXX looks like a 10-character ASIN and should not be redacted as PII
  • TBA123456789012 is not a person identifier, but it is still sensitive operational data

The privacy system therefore needs:

  • strong structured detection
  • domain-aware unstructured detection
  • policy-aware routing
  • data minimization at every persistence boundary
  • deletion that covers both primary and derived data

Design Goals

  1. Detect sensitive data before any non-essential persistence.
  2. Separate detection from authorization: something can still be PII even if a downstream service is allowed to see it.
  3. Keep the hot path deterministic and low latency.
  4. Preserve shopping utility by avoiding false positives on manga titles, character names, ASINs, and catalog terms.
  5. Make privacy controls auditable: what was detected, what was redacted, what was stored, what was deleted, and when.
  6. Treat guest sessions more conservatively because their data is harder to attribute later.

Threat Model and Trust Boundaries

Main Privacy Failure Modes

| Failure Mode | Example | Impact | First Control |
|---|---|---|---|
| User volunteers PII in chat | "My email is alex@example.com" | Raw PII reaches logs, history, analytics | Pre-logging PII scan |
| Model echoes sensitive data | FM repeats shipping address from context | Unauthorized disclosure in response | Post-generation PII response filter |
| Model hallucinates plausible PII | Fake email or phone number in prose | User sees fabricated personal data | Response-side regex + NER |
| False positive on manga entities | Gojo Satoru becomes [NAME_REDACTED] | Broken recommendations and trust loss | Allowlist + context scorer |
| Locale miss | JP or DE formats not caught | PII leak due to detector blind spot | Locale-aware regex + multilingual NER |
| Derived data not deleted | Training export still contains customer_id | GDPR/CCPA non-compliance | Data lineage map + deletion workflow |
| Guest data retention too long | Unattributable guest PII stored for 24h | Higher privacy risk, weak deletion path | Lower threshold + shorter TTL |

Trust Boundary View

flowchart TB
    subgraph Untrusted["Untrusted / User-Controlled"]
        User[User message]
        History[Conversation text]
        APIText[API payload text fields]
    end

    subgraph Controlled["Controlled Decision Layer"]
        Normalize[Normalizer + locale resolver]
        Detect[PII detection pipeline]
        Policy[Policy and authorization engine]
        Redact[Redaction and masking engine]
    end

    subgraph Sensitive["Sensitive Runtime Zone"]
        Orch[Orchestrator request context]
        FM[Foundation model prompt]
        Ship[Shipping API]
        Order[Order API]
    end

    subgraph Persistent["Persistent Stores"]
        Logs[Redacted logs]
        DDB[Redacted session history]
        Analytics[Anonymized analytics]
        S3[Archives and audit evidence]
    end

    User --> Normalize
    History --> Normalize
    APIText --> Normalize
    Normalize --> Detect
    Detect --> Policy
    Policy --> Redact
    Policy --> Orch
    Orch --> FM
    Orch --> Ship
    Orch --> Order
    Redact --> Logs
    Redact --> DDB
    Redact --> Analytics
    DDB --> S3

Key rule: raw user text is allowed to exist in memory for the minimum time needed to fulfill the request, but persistent systems should see only the redacted or policy-approved representation.


High-Level Design (HLD)

System Overview

flowchart LR
    User[Web / mobile client] --> Gateway[API Gateway + auth]
    Gateway --> Orch[Chat orchestrator]

    subgraph PII["PII Protection Layer"]
        Normalize[Text normalizer]
        Locale[Locale resolver]
        Regex[Regex scanner]
        NER[NER endpoint]
        Custom[Custom detectors]
        Merge[Overlap merger]
        Score[Confidence scorer]
        Policy[Authorization and action router]
        Redact[Redaction engine]
    end

    Orch --> Normalize
    Normalize --> Locale
    Locale --> Regex
    Locale --> NER
    Locale --> Custom
    Regex --> Merge
    NER --> Merge
    Custom --> Merge
    Merge --> Score
    Score --> Policy
    Policy --> Redact

    Policy -->|Ephemeral approved fields only| FM[Foundation model]
    Policy -->|Ephemeral approved fields only| Shipping[Shipping API]
    Policy -->|Ephemeral approved fields only| Orders[Order API]

    Redact --> DDB[DynamoDB session history]
    Redact --> Logs[CloudWatch application logs]
    Redact --> Analytics[Analytics stream]

    FM --> ResponsePII[Response PII filter]
    Shipping --> ResponsePII
    Orders --> ResponsePII
    ResponsePII --> UserResponse[User-visible response]

HLD Principles

  1. Detection runs before persistence.
  2. Authorization is explicit and separate from confidence.
  3. Raw PII is never forwarded by default.
  4. Response-side filtering exists because the FM can still generate sensitive data even if the input path is clean.
  5. Audit and deletion are first-class parts of the design, not cleanup tasks.

End-to-End Dataflow

Inbound Request Dataflow

sequenceDiagram
    participant User
    participant Gateway as API Gateway
    participant Orch as Orchestrator
    participant Norm as Normalizer
    participant Scan as PII Scanner
    participant Policy as Policy Engine
    participant DDB as DynamoDB
    participant Log as Logs
    participant FM as Foundation Model
    participant API as Order/Shipping API

    User->>Gateway: Chat message with possible PII
    Gateway->>Orch: Auth context + message
    Orch->>Norm: Normalize for detection
    Norm->>Scan: Locale-aware text + offset map
    Scan->>Policy: Findings with confidence
    Policy->>Log: Persist redacted copy only
    Policy->>DDB: Store redacted conversation history
    Policy->>FM: Send minimized prompt
    Policy->>API: Send approved PII fields only if required
    FM-->>Orch: Response candidate
    API-->>Orch: Structured data

Outbound Response Dataflow

flowchart LR
    Candidate[FM or API-composed response] --> RespPII[Response-side PII scan]
    RespPII --> Guardrails[Remaining guardrails]
    Guardrails --> Deliver[Deliver to user]
    RespPII --> Audit[PII near-miss audit event]

Important point: input-side privacy controls do not eliminate the need for output-side privacy controls. The model can hallucinate emails, phone numbers, or addresses that were never present in the input.
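
A minimal sketch of that response-side pass, reusing the detectors and the offset-map redaction helper defined in the deep dive below. The sensitive-type set is an assumption, chosen so catalog identifiers like ASINs survive; NER can be added where the latency budget allows:

RESPONSE_REDACT_TYPES = {
    "email", "obfuscated_email", "us_phone", "jp_phone",
    "ssn", "credit_card", "gift_card_code", "address",
}

def filter_response(candidate: str) -> str:
    normalized, offset_map = normalize_for_detection(candidate)
    findings = merge_overlapping_findings(
        scan_regex(normalized) + scan_custom(normalized)
    )
    for finding in findings:
        # Output path: no downstream component is authorized, so anything
        # in the sensitive set is redacted before delivery; catalog
        # identifiers such as ASINs pass through untouched.
        finding["action"] = (
            "redact" if finding["type"] in RESPONSE_REDACT_TYPES else "pass"
        )
    return apply_redactions(candidate, findings, offset_map)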

Hot Path vs Async Path

flowchart TD
    subgraph HotPath["Synchronous hot path"]
        A[Normalize] --> B[Detect]
        B --> C[Score]
        C --> D[Policy route]
        D --> E[Redact and store safe copy]
        D --> F[Pass ephemeral approved fields]
    end

    subgraph AsyncPath["Asynchronous follow-up"]
        G[Review queue]
        H[Analytics anonymization]
        I[Audit aggregation]
        J[Allowlist refresh]
        K[Deletion evidence generation]
    end

    C --> G
    E --> H
    E --> I
    D --> K
    J --> B

This split matters because privacy logic that must prevent leakage belongs in the synchronous path. Optimization, aggregation, and human review belong in the async path.


PII Detection Architecture Deep Dive

Layer 0: Text Normalization and Locale Resolution

Before detection, the system normalizes text for scanners without losing the ability to redact the original string correctly.

The implementation detail that matters is the offset map:

  • normalize Unicode for detectors
  • remove zero-width characters and repeated spacing
  • standardize separators where safe
  • preserve a mapping from normalized spans back to original spans

Without this, you can detect on normalized text but redact the wrong characters in the original message.

import unicodedata

# Zero-width and BOM code points commonly used to split tokens past detectors
_ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

def normalize_char(char: str) -> str:
    """NFKC-fold one character; zero-width characters are dropped entirely."""
    if char in _ZERO_WIDTH:
        return ""
    return unicodedata.normalize("NFKC", char)

def normalize_for_detection(text: str) -> tuple[str, list[int]]:
    """
    Returns normalized text and an offset map where offset_map[i]
    points to the original character index for normalized position i.
    """
    normalized = []
    offset_map = []

    for index, char in enumerate(text):
        transformed = normalize_char(char)  # may yield zero or more characters
        for out_char in transformed:
            normalized.append(out_char)
            offset_map.append(index)

    return "".join(normalized), offset_map

Locale resolution uses:

  • authenticated user marketplace or country
  • shipping country if available
  • UI locale
  • detector hints from the text itself

If locale is ambiguous, the pipeline runs the safe union of the relevant locale patterns.
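
A minimal sketch of that priority order. The session keys (marketplace_country, shipping_country, ui_locale) are assumptions about the session object, and text-derived hints are omitted for brevity:

def resolve_locale(session: dict, text: str) -> str:
    """Return the strongest available locale signal, or 'union' when ambiguous."""
    for key in ("marketplace_country", "shipping_country", "ui_locale"):
        value = session.get(key)
        if value:
            return value
    # No reliable signal: downstream scanners run the safe union of patterns.
    return "union"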

Layer 1: Regex-Based Pattern Detection

Regex is still the cheapest and most reliable detector for strongly structured fields.

import re

PII_PATTERNS = {
    "email": r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b",
    "obfuscated_email": r"\b[A-Za-z0-9._%+-]+\s*(?:@|\[at\]|\(at\))\s*[A-Za-z0-9.-]+\s*(?:\.|\[dot\]|\(dot\))\s*[A-Za-z]{2,}\b",
    "us_phone": r"\b(?:\+1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b",
    "jp_phone": r"\b0\d{1,4}[-.\s]?\d{1,4}[-.\s]?\d{4}\b",
    "ssn": r"\b\d{3}[-.\s]?\d{2}[-.\s]?\d{4}\b",
    "credit_card": r"\b(?:\d{4}[-.\s]?){3}\d{4}\b",
    "us_zip": r"\b\d{5}(?:-\d{4})?\b",
    "jp_postal": r"\b\d{3}-\d{4}\b",
    "amazon_order_id": r"\b\d{3}-\d{7}-\d{7}\b",
}

def scan_regex(text: str) -> list[dict]:
    findings = []
    for pii_type, pattern in PII_PATTERNS.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            findings.append({
                "type": pii_type,
                "value": match.group(),
                "start": match.start(),
                "end": match.end(),
                "base_confidence": 0.95,
                "source": "regex",
            })
    return findings

Regex strengths:

  • sub-millisecond
  • deterministic
  • auditable
  • ideal for hot-path filtering

Regex weaknesses:

  • poor for names and free-form addresses
  • brittle against novel obfuscation unless continuously refreshed
  • no semantic understanding

Layer 2: NER-Based Entity Detection

NER catches the messy cases regex cannot handle:

  • names
  • addresses
  • organizations
  • location mentions that become sensitive in account context

import json

import boto3

# Warm, reused SageMaker runtime client; creating one per request would
# add connection-setup latency to the hot path.
sagemaker_runtime = boto3.client("sagemaker-runtime")

NER_ENTITY_MAP = {
    "PERSON": "person_name",
    "ADDRESS": "address",
    "GPE": "location",
    "LOC": "location",
    "ORG": "organization",
    "JAPANESE_NAME": "person_name",
}

def scan_ner(text: str, locale: str) -> list[dict]:
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName="pii-ner-endpoint",
        Body=json.dumps({"text": text, "locale": locale}),
        ContentType="application/json",
    )
    entities = json.loads(response["Body"].read())
    findings = []

    for entity in entities:
        if entity["label"] not in NER_ENTITY_MAP:
            continue
        if entity["score"] <= 0.70:
            continue

        findings.append({
            "type": NER_ENTITY_MAP[entity["label"]],
            "value": entity["text"],
            "start": entity["start"],
            "end": entity["end"],
            "base_confidence": entity["score"],
            "source": "ner",
        })
    return findings

Domain-specific requirement: a generic NER model will over-flag manga character names as real people. Fine-tuning reduces this, but in production it is still not enough by itself. The right solution is layered:

  • NER fine-tuning on manga/e-commerce data
  • character allowlist
  • context-aware score adjustments
  • monitoring of mid-confidence person_name findings

Layer 3: Custom Detectors

Some sensitive strings are domain-specific and need business logic, not just generic NLP.

CUSTOM_PATTERNS = {
    "amazon_customer_id": r"\b[A-Z0-9]{13,14}\b",
    "tracking_number": r"\b(?:1Z[A-Z0-9]{16}|TBA\d{12,})\b",
    "gift_card_code": r"\b[A-Z0-9]{4}-[A-Z0-9]{6}-[A-Z0-9]{4}\b",
    "asin": r"\bB0[A-Z0-9]{8}\b",
}

ACCOUNT_CONTEXT_KEYWORDS = {"account", "customer id", "order", "my account", "profile"}

def scan_custom(text: str) -> list[dict]:
    findings = []
    text_lower = text.lower()

    for pii_type, pattern in CUSTOM_PATTERNS.items():
        for match in re.finditer(pattern, text):
            base_confidence = 0.90

            if pii_type == "amazon_customer_id":
                if not any(keyword in text_lower for keyword in ACCOUNT_CONTEXT_KEYWORDS):
                    base_confidence = 0.40

            findings.append({
                "type": pii_type,
                "value": match.group(),
                "start": match.start(),
                "end": match.end(),
                "base_confidence": base_confidence,
                "source": "custom",
            })

    return findings

This is where many privacy systems fail. They either redact too broadly and destroy utility, or they under-classify business-specific identifiers because they do not look like classic PII.

Layer 4: Overlap Merging and Canonical Findings

The same span may be detected by multiple scanners. If we do not merge, we create double-redaction bugs and inconsistent routing.

def merge_overlapping_findings(findings: list[dict]) -> list[dict]:
    findings = sorted(findings, key=lambda item: (item["start"], item["end"]))
    merged = []

    for finding in findings:
        # Spans are half-open [start, end); use >= so adjacent but
        # non-overlapping spans are not merged into one finding.
        if not merged or finding["start"] >= merged[-1]["end"]:
            merged.append({
                **finding,
                "sources": {finding["source"]},
            })
            continue

        previous = merged[-1]
        previous["end"] = max(previous["end"], finding["end"])
        previous["sources"].add(finding["source"])
        previous["base_confidence"] = max(previous["base_confidence"], finding["base_confidence"])

    return merged

Merge rule: keep the most conservative type and the highest base confidence, then let the scorer apply agreement bonuses.

Layer 5: Confidence Score Calculation

The score is not a single ML probability. It is a composite routing score:

final_confidence =
    clamp(
        base_confidence
        + detector_agreement_bonus
        + pii_context_bonus
        - non_pii_context_penalty
        + locale_support_bonus
        + manual_override
    )

Base Score Sources

| Source | Base Score |
|---|---|
| Regex exact match | 0.95 |
| NER detection | model score |
| Custom detector | usually 0.90, reduced to 0.40 for weak account context |

Scoring Adjustments

| Adjustment | Example | Delta |
|---|---|---|
| Multi-detector agreement | Regex and NER both support same span | +0.05 |
| Strong PII context | my name is, ship to, email me, account | +0.10 |
| Non-PII manga context | recommend, series, volumes, character | -0.20 for person_name findings |
| Locale support bonus | Locale and format strongly align | +0.03 to +0.05 |
| Character allowlist override | Gojo Satoru from catalog list | force to 0.10 |

Reference Implementation

PII_CONTEXT_BOOST = {"my name is", "ship to", "email me", "phone number", "account", "order"}
NON_PII_CONTEXT_HINTS = {"recommend", "series", "volumes", "character", "manga", "anime"}

def calculate_confidence(finding: dict, text: str, locale: str) -> float:
    score = finding["base_confidence"]
    text_lower = text.lower()

    if len(finding.get("sources", set())) >= 2:
        score += 0.05

    if any(keyword in text_lower for keyword in PII_CONTEXT_BOOST):
        score += 0.10

    if finding["type"] == "person_name" and any(keyword in text_lower for keyword in NON_PII_CONTEXT_HINTS):
        score -= 0.20

    if locale_supports_finding(locale, finding):
        score += 0.03

    if finding["type"] == "person_name" and is_character_name(finding["value"]):
        score = 0.10

    return max(0.0, min(score, 1.0))

Scoring Examples

| Input | Detector Path | Final Score | Why |
|---|---|---|---|
| alex@example.com | Regex | 0.95 | Exact pattern match |
| Ship to 123 Oak Street | NER address 0.84 + context | 0.94 | Delivery context boosts address confidence |
| Gojo Satoru in recommendation query | NER 0.86 - manga hint - allowlist | 0.10 | Known character name, not customer PII |
| 14-char account-like string without account context | Custom detector | 0.40 | Too ambiguous to redact inline |

Important Distinction: Score vs Policy

The score answers "How likely is this span to be sensitive?".

Policy answers "What are we allowed to do with it?".

Those are different questions.

Example: a shipping address can have a confidence of 0.94 and still be allowed to reach the shipping API ephemerally. The score remains high because it is still PII. Policy decides that only one downstream component is allowed to see it.

Layer 6: Routing and Actions

flowchart TD
    Findings[Canonical findings] --> Score[Calculate final confidence]
    Score --> Auth{Authorized for this downstream use?}

    Auth -->|No| Route1{Score range}
    Auth -->|Yes| Route2{Score range}

    Route1 -->|>= 0.9| Redact[Full redact]
    Route1 -->|0.7 - 0.89| Mask[Mask + review queue]
    Route1 -->|0.5 - 0.69| Monitor[Log only]
    Route1 -->|< 0.5| Pass1[Pass]

    Route2 -->|>= 0.9| Ephemeral[Ephemeral pass-through to approved service only]
    Route2 -->|0.7 - 0.89| Mask2[Mask for storage, pass minimal approved field]
    Route2 -->|< 0.7| Pass2[Pass to approved service]

Default Routing Table

| Final Score | Authenticated User | Guest User | Interpretation |
|---|---|---|---|
| >= 0.9 | Redact in persistence, allow ephemeral approved use only | Redact everywhere | Highest certainty |
| 0.7 - 0.89 | Mask for storage, review if ambiguous | Redact everywhere | Likely PII |
| 0.5 - 0.69 | Monitor only unless policy requires stronger handling | Redact for guest | Borderline |
| < 0.5 | Pass | Pass | Likely false positive |

Guest policy is stricter because unattributable guest PII is harder to govern later.
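
A minimal route_findings sketch that mirrors the two score ladders and the guest override above. The action strings and the guest_mode session flag are assumptions about the surrounding code:

def route_findings(findings: list[dict], session: dict, intent: str) -> list[dict]:
    guest = session.get("guest_mode", False)

    for finding in findings:
        score = finding["final_confidence"]

        if guest:
            # Guests: redact anything at or above the borderline band.
            finding["action"] = "redact" if score >= 0.5 else "pass"
        elif not finding.get("authorized", False):
            if score >= 0.9:
                finding["action"] = "redact"
            elif score >= 0.7:
                finding["action"] = "mask"      # plus async review queue
            elif score >= 0.5:
                finding["action"] = "monitor"   # log-only, no text rewrite
            else:
                finding["action"] = "pass"
        else:
            # Authorized: storage still sees the masked or redacted text;
            # only the approved downstream service receives the raw field,
            # and only ephemerally.
            if score >= 0.9:
                finding["action"] = "ephemeral_pass"
            elif score >= 0.7:
                finding["action"] = "mask"
            else:
                finding["action"] = "pass"

    return findings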


Low-Level Design (LLD)

Component Breakdown

classDiagram
    class PIIFinding {
        +str type
        +str value
        +int start
        +int end
        +str source
        +float base_confidence
        +float final_confidence
        +bool authorized
        +str action
    }

    class TextNormalizer {
        +normalize_for_detection(text)
    }

    class LocaleResolver {
        +resolve(session, text)
    }

    class RegexScanner {
        +scan(text, locale)
    }

    class NERClient {
        +scan(text, locale)
    }

    class CustomDetector {
        +scan(text, session)
    }

    class FindingMerger {
        +merge(findings)
    }

    class ConfidenceScorer {
        +score(finding, text, locale)
    }

    class PolicyEngine {
        +route(findings, session, intent)
    }

    class RedactionEngine {
        +apply(text, findings, offset_map)
    }

    class AuditPublisher {
        +publish(findings, decisions)
    }

    TextNormalizer --> LocaleResolver
    LocaleResolver --> RegexScanner
    LocaleResolver --> NERClient
    LocaleResolver --> CustomDetector
    RegexScanner --> FindingMerger
    NERClient --> FindingMerger
    CustomDetector --> FindingMerger
    FindingMerger --> ConfidenceScorer
    ConfidenceScorer --> PolicyEngine
    PolicyEngine --> RedactionEngine
    PolicyEngine --> AuditPublisher

Data Contracts

from dataclasses import dataclass, field

@dataclass
class PIIFinding:
    type: str
    value: str
    start: int
    end: int
    source: str
    base_confidence: float
    final_confidence: float = 0.0
    locale: str = "unknown"
    authorized: bool = False
    action: str = "pass"
    sources: set[str] = field(default_factory=set)

@dataclass
class PIIDecision:
    redacted_text: str
    approved_fields: dict
    findings: list[PIIFinding]
    audit_metadata: dict

Request-Side Orchestration

def process_inbound_message(
    raw_text: str,
    session: dict,
    intent: str,
    approved_fields_for_intent: set[str],
) -> PIIDecision:
    normalized_text, offset_map = normalize_for_detection(raw_text)
    locale = resolve_locale(session, normalized_text)

    findings = []
    findings.extend(scan_regex(normalized_text))
    findings.extend(scan_ner(normalized_text, locale))
    findings.extend(scan_custom(normalized_text))

    merged = merge_overlapping_findings(findings)

    for finding in merged:
        finding["final_confidence"] = calculate_confidence(finding, normalized_text, locale)
        finding["authorized"] = finding["type"] in approved_fields_for_intent

    routed = route_findings(merged, session, intent)
    redacted_text = apply_redactions(raw_text, routed, offset_map)

    store_redacted_history(session, redacted_text)
    emit_pii_audit_event(session, routed)

    return PIIDecision(
        redacted_text=redacted_text,
        approved_fields=extract_ephemeral_approved_fields(raw_text, routed, offset_map),
        findings=routed,
        audit_metadata=build_audit_metadata(routed),
    )
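
The apply_redactions helper called above is where the offset map earns its keep: findings carry normalized-text offsets, and the map translates them back to the original string. A minimal sketch, assuming routing has already set each finding's action (masking is collapsed into the same placeholder for brevity):

def apply_redactions(original_text: str, findings: list[dict],
                     offset_map: list[int]) -> str:
    # Substitute right-to-left so earlier character offsets stay valid
    # after each replacement; overlaps were already merged upstream.
    result = original_text
    for finding in sorted(findings, key=lambda f: f["start"], reverse=True):
        if finding.get("action") not in {"redact", "mask"}:
            continue
        orig_start = offset_map[finding["start"]]
        orig_end = offset_map[finding["end"] - 1] + 1  # end is exclusive
        placeholder = f"[{finding['type'].upper()}_REDACTED]"
        result = result[:orig_start] + placeholder + result[orig_end:]
    return result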

Storage Schema

Session History Record

| Field | Example | Notes |
|---|---|---|
| session_id | sess_123 | Partition key |
| turn_id | 17 | Sort key |
| actor | user or assistant | Who produced the turn |
| redacted_text | Ship to [ADDRESS_REDACTED] | Stored form |
| pii_types_detected | ["address"] | Audit-friendly metadata |
| pii_confidence_max | 0.94 | Highest confidence in turn |
| guest_mode | true | Drives TTL and policy |
| expires_at | epoch time | DynamoDB TTL |
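
A minimal persistence sketch against that schema, assuming a table named manga-assist-sessions and a session dict carrying session_id, next_turn_id, and guest_mode (all hypothetical names); the audit-metadata columns are omitted for brevity:

import time

import boto3

dynamodb = boto3.resource("dynamodb")
sessions_table = dynamodb.Table("manga-assist-sessions")  # hypothetical name

def store_redacted_history(session: dict, redacted_text: str) -> None:
    # Guest turns expire after 2h; authenticated turns after 24h.
    ttl_seconds = 2 * 3600 if session.get("guest_mode") else 24 * 3600
    sessions_table.put_item(Item={
        "session_id": session["session_id"],
        "turn_id": session["next_turn_id"],
        "actor": "user",
        "redacted_text": redacted_text,
        "guest_mode": bool(session.get("guest_mode")),
        "expires_at": int(time.time()) + ttl_seconds,  # DynamoDB TTL attribute
    })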

PII Audit Event

| Field | Example | Notes |
|---|---|---|
| request_id | req_abc | Trace correlation |
| session_id_hash | hash | PII-safe join key |
| pii_findings_count | 2 | Aggregate, not raw PII |
| pii_types | ["email", "address"] | Audit taxonomy |
| max_confidence | 0.95 | Detection signal |
| actions | ["redact", "ephemeral_pass"] | What happened |
| authorized_fields | ["address"] | Approved downstream use |
| model_version | pii-ner-v5 | Reproducibility |

Latency Budget by Module

| Module | Target Latency | Notes |
|---|---|---|
| Normalizer + locale resolver | <1ms | Pure in-process |
| Regex scanner | <1ms | Cheap deterministic rules |
| Custom detectors | <1ms | Mostly regex + context |
| NER endpoint | 3-8ms | Cached real-time endpoint |
| Merger + scorer + routing | <1ms | In-process |
| Redaction | <1ms | Offset-map aware substring replacement |
| Total request-side PII stage | 5-12ms | Fits inside chat latency budget |

Implementation Components and Tools

| Component / Tool | Why It Exists | Typical Use in This Design |
|---|---|---|
| Python re | Deterministic pattern matching | Email, phone, postal code, order ID, obfuscated email detection |
| SageMaker-hosted NER model | Low-latency entity extraction | Names, addresses, multilingual entities |
| Catalog-backed character allowlist | Domain false-positive control | Prevent character names from being redacted as customer names |
| DynamoDB | Short-lived session store | Redacted conversation history with TTL |
| CloudWatch Logs | Operational logging | Store redacted application events only |
| Analytics stream + warehouse | Product analytics | Consume anonymized events, never raw PII |
| S3 + lifecycle policy | Archive and evidence store | Encrypted archives, audit evidence, deletion artifacts |
| KMS envelope encryption | Field-level encryption where PII must be stored | Protect approved stored sensitive fields |
| Step Functions or equivalent workflow engine | Multi-system deletion orchestration | GDPR/CCPA erasure workflow with evidence chain |
| Async review queue | Human review of ambiguous detections | Medium-confidence or policy-sensitive cases |

Scenario Deep Dives

Scenario 1: Manga Character Names Triggering PII Redaction

Context

After the NER-based detector went live, recommendation quality dropped because fictional character names were being classified as real-person PII.

User message:

I want manga with Gojo Satoru and Levi Ackerman in it

Observed failure:

  • NER labeled both names as PERSON
  • confidence landed in the 0.82 - 0.91 range
  • character names were redacted before recommendation retrieval
  • retrieval lost the most important entity

Failure Path

flowchart LR
    Query[Recommendation query with character names] --> NER[Generic PERSON detection]
    NER --> Score[High person-name confidence]
    Score --> Redact[Redact names]
    Redact --> Retrieve[Retriever sees degraded query]
    Retrieve --> BadRecs[Generic or irrelevant recommendations]

Root Cause

The model was right syntactically and wrong semantically. Manga character names look like real names, especially Japanese names. The detector lacked:

  • negative examples for fictional names
  • domain context such as recommend, series, volumes
  • a catalog-derived allowlist

Improved Design

flowchart LR
    Query[Recommendation query] --> NER[NER person-name detection]
    Query --> Context[Intent and context scorer]
    Query --> Allowlist[Character allowlist lookup]
    NER --> Merge[Merge]
    Context --> Merge
    Allowlist --> Merge
    Merge --> FinalScore[Final confidence]
    FinalScore -->|0.10 after override| Pass[Keep term in query]
    Pass --> Retrieve[Retriever sees original character name]

Implementation Changes

  1. Added a catalog-derived CHARACTER_NAMES allowlist refreshed daily.
  2. Added a negative penalty for recommendation-style context.
  3. Fine-tuned the NER model on manga-domain negative examples.
  4. Added a weekly audit for person_name findings in the 0.70 - 0.90 band.
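
A minimal sketch of the allowlist lookup used by the scorer's is_character_name override; the catalog feed shape is an assumption:

# Refreshed daily from the catalog; entries stored lowercase, single-spaced
CHARACTER_NAMES: set[str] = set()

def _canonical(name: str) -> str:
    return " ".join(name.lower().split())

def is_character_name(value: str) -> bool:
    # Normalized lookup so "gojo  satoru" still matches "Gojo Satoru".
    return _canonical(value) in CHARACTER_NAMES

def refresh_character_allowlist(catalog_rows: list[dict]) -> None:
    CHARACTER_NAMES.clear()
    CHARACTER_NAMES.update(
        _canonical(row["character_name"])
        for row in catalog_rows
        if row.get("character_name")
    )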

Why the Hybrid Approach Won

Pure model fine-tuning helps but reacts slowly. Pure allowlists help quickly but are brittle. The hybrid approach:

  • gives immediate mitigation
  • reduces hot-path false positives
  • keeps improving as the model is retrained
  • creates explicit observability for uncertain names

Metric Signal

  • false positive rate on character-name queries: 40% -> 3%
  • recommendation relevance on character-name queries: +22%

Scenario 2: User Pasting Full Address for Delivery Estimate

Context

Users often paste their full address directly into chat instead of going through account settings.

Example:

Can you deliver to 123 Oak Street, Apt 4B, Springfield IL 62704?

The address is needed to compute delivery estimates, but it is not needed in:

  • logs
  • analytics
  • conversation history
  • model training exports

Incorrect Dataflow

flowchart LR
    User[User address in chat] --> Store1[Store raw message]
    Store1 --> Log[Logs]
    Store1 --> DDB[Session history]
    Store1 --> Analytics[Analytics]
    Store1 --> PII[PII detection later]
    PII --> Redact[Redact after the fact]

This is architecturally wrong because the leak already happened before redaction.

Correct Dataflow

sequenceDiagram
    participant User
    participant Gateway as API Gateway
    participant Scan as PII scanner
    participant Policy as Policy engine
    participant Log as Logs
    participant DDB as DynamoDB
    participant Orch as Orchestrator
    participant Ship as Shipping API

    User->>Gateway: "Deliver to 123 Oak Street, Springfield IL 62704?"
    Gateway->>Scan: Scan before any persistence
    Scan->>Policy: Address finding, confidence 0.94
    Policy->>Log: Write redacted text only
    Policy->>DDB: Store redacted text only
    Policy->>Orch: Pass original address ephemerally
    Orch->>Ship: Request delivery estimate with real address
    Ship-->>Orch: Delivery estimate
    Orch-->>User: "Estimated delivery is March 28"

Implementation Details That Matter

  1. The original address only exists in orchestrator memory.
  2. It is never written to logs, history, or analytics.
  3. The session history stores the redacted version and the resulting answer.
  4. Any asynchronous debug trace stores only the redacted text plus metadata such as pii_types=["address"].

Why Not Store the Encrypted Address for Convenience?

Because convenience is not a valid reason to expand the data footprint. Encryption reduces exposure after storage; it does not make unnecessary storage acceptable. The privacy-first design is to avoid storing it at all unless there is a clear product need.

Metric Signal

  • address leakage to non-essential systems: 100% -> 0%
  • no meaningful latency increase because the scan already existed; only the ordering changed

Scenario 3: GDPR Right-to-Deletion Request

Context

A user requests erasure under GDPR Article 17. The hard part is not deleting the primary conversation row. The hard part is deleting or anonymizing every copy and derived form:

  • live session store
  • archives
  • logs
  • analytics
  • training exports
  • downstream evidence of deletion

HLD for Deletion Workflow

stateDiagram-v2
    [*] --> RequestReceived
    RequestReceived --> IdentityValidated
    IdentityValidated --> RegistryUpdated
    RegistryUpdated --> DeleteSessions
    RegistryUpdated --> DeleteArchives
    RegistryUpdated --> AnonymizeAnalytics
    RegistryUpdated --> PurgeExports
    RegistryUpdated --> ConfirmLogRetention
    DeleteSessions --> Evidence
    DeleteArchives --> Evidence
    AnonymizeAnalytics --> Evidence
    PurgeExports --> Evidence
    ConfirmLogRetention --> Evidence
    Evidence --> UserConfirmation
    UserConfirmation --> [*]

Detailed Dataflow

sequenceDiagram
    participant Support
    participant Workflow as Deletion orchestrator
    participant Registry as Deletion registry
    participant DDB as DynamoDB
    participant S3 as S3 archives
    participant WH as Analytics warehouse
    participant Export as Training export pipeline
    participant Audit as Audit bucket

    Support->>Workflow: Validated deletion request
    Workflow->>Registry: Write deletion request and request_id
    Workflow->>DDB: Delete conversation records by customer_id
    Workflow->>S3: Delete archived objects by tag or inventory lookup
    Workflow->>WH: Anonymize or delete attributable rows
    Workflow->>Export: Add customer_id to denylist for future exports
    Workflow->>Audit: Write evidence artifacts per step
    Workflow-->>Support: Completion summary with evidence IDs

Low-Level Implementation Notes

  1. Deletion must be idempotent. Re-running the workflow should not fail if records are already gone.
  2. Derived data should either be deleted or anonymized with a documented policy.
  3. Backups need an explicit position:
       • short-lived immutable backups may be exempt from immediate mutation
       • restore workflows must replay the deletion registry before data becomes active
  4. The deletion registry exists to prove compliance and prevent reintroduction into later training exports.
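
A sketch of the idempotent DynamoDB deletion step, assuming a hypothetical by_customer GSI on the conversation table:

from boto3.dynamodb.conditions import Key

def delete_conversation_records(table, customer_id: str) -> int:
    """Delete all conversation rows for a customer; safe to re-run."""
    deleted = 0
    kwargs = {
        "IndexName": "by_customer",  # hypothetical GSI name
        "KeyConditionExpression": Key("customer_id").eq(customer_id),
    }
    while True:
        page = table.query(**kwargs)
        for item in page["Items"]:
            # delete_item succeeds whether or not the row still exists,
            # so re-running the workflow never fails on missing records.
            table.delete_item(Key={
                "session_id": item["session_id"],
                "turn_id": item["turn_id"],
            })
            deleted += 1
        if "LastEvaluatedKey" not in page:
            return deleted
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]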

Why the Data Lineage Map Matters

If a system stores user-linked data and is missing from the lineage map, deletion is already broken. The lineage map is not documentation overhead; it is the control surface for legal erasure.

Metric Signal

  • deletion fulfillment time: ~3 days manual -> <4 hours automated
  • systems covered: 5/5
  • dry-run audit pass rate: 100%

Scenario 4: Guest User PII Boundary Enforcement

Context

Guest users should not need to share PII, but many still paste it into chat:

My email is alex@example.com, can you check my gift card status?

The system cannot safely treat guest PII like authenticated PII because:

  • there is no durable customer_id
  • there is no strong identity binding
  • later deletion is harder or impossible

Policy Comparison

flowchart TD
    Start[Incoming message] --> Mode{Authenticated?}

    Mode -->|Yes| Auth[Standard privacy policy]
    Mode -->|No| Guest[Guest privacy policy]

    Auth --> AuthScore[Use standard thresholds]
    Auth --> AuthTTL[24h session TTL]
    Auth --> AuthUse[Allow approved ephemeral PII use]

    Guest --> GuestScore[Lower redaction threshold]
    Guest --> GuestTTL[2h session TTL]
    Guest --> GuestUse[Never pass account-related PII through]
    Guest --> GuestPrompt[Prompt user to sign in]

Guest-Specific Rules

  1. Redact from >= 0.5, not only from >= 0.9.
  2. Never pass account-recovery-like fields to downstream account tools.
  3. Keep guest TTL at 2 hours, not 24 hours.
  4. Emit a user-facing message explaining that account assistance requires sign-in.

Why Not Block Any Guest Message That Contains PII?

Because some guest flows are still useful even when a small amount of PII appears accidentally. The better balance is:

  • redact aggressively
  • minimize retention
  • nudge toward authentication for account-specific help

That protects privacy without turning the guest experience into a wall of rejections.

Metric Signal

  • guest sessions containing stored PII: ~18% -> <1%
  • guest-to-auth conversion: +12%

Data Retention, Deletion, and Evidence Chain

Retention Architecture

flowchart TB
    subgraph SessionData["Session and interaction data"]
        Conv[Conversation history<br/>24h auth / 2h guest]
        Meta[Session metadata<br/>24h]
        Logs[Application logs<br/>30 days]
        Audit[Audit evidence<br/>1 year]
    end

    subgraph AnalyticsData["Derived data"]
        Events[Identifiable analytics<br/>90 days]
        Agg[Aggregated analytics<br/>Long-lived]
        Exports[Training exports<br/>Rolling export cycle]
    end

    subgraph Controls["Controls"]
        TTL[DynamoDB TTL]
        LC[S3 lifecycle rules]
        Anon[Warehouse anonymization job]
        Registry[Deletion registry]
    end

    Conv --> TTL
    Meta --> TTL
    Logs --> LC
    Audit --> LC
    Events --> Anon
    Exports --> Registry

Retention Table

| Store | Retention | Deletion Mechanism | Notes |
|---|---|---|---|
| DynamoDB conversation history | 24h authenticated, 2h guest | TTL + explicit delete | Stores redacted text only |
| Session metadata | 24h | TTL | No raw PII by default |
| CloudWatch logs | 30 days | Retention policy | Logs must already be redacted |
| S3 archives | 90 days then lifecycle delete | Lifecycle + explicit delete by tag | Encrypted |
| Analytics warehouse | 90 days identifiable then anonymized | ETL anonymization | Keep aggregate trends only |
| Training exports | Rolling cycle | Export filter + deletion registry | Deleted users excluded from future exports |
| Audit evidence | 1 year | Lifecycle retention | Keeps proof, not raw deleted data |

Evidence Chain for Deletion

The system should retain proof of action, not the deleted data itself. Evidence artifacts usually contain:

  • request ID
  • customer ID hash
  • systems targeted
  • timestamp per step
  • success or retry status
  • operator or workflow identity

That gives auditors a durable trail without preserving the original PII.
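
For concreteness, a representative artifact as it might be serialized into the audit bucket; every field name and value here is illustrative, not a fixed schema:

evidence_artifact = {
    "request_id": "req_abc",
    "customer_id_hash": "sha256:4f2a…",   # hashed, never the raw identifier
    "system": "dynamodb_session_history",
    "action": "delete",
    "records_affected": 42,
    "completed_at": "2025-03-28T14:02:11Z",
    "status": "success",
    "actor": "deletion-workflow",
}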


Monitoring, Alerting, and Testing

Key Metrics

| Metric | Why It Matters | Example Alert |
|---|---|---|
| pii_in_response_rate | Sensitive data still reaching users | >0.5% of responses |
| pii_near_miss_rate | Model or tools are generating data that guardrails must catch | sudden spike over baseline |
| character_name_false_positive_rate | Domain utility regression | >5% |
| guest_pii_persistence_rate | Guest privacy boundary drift | any sustained increase |
| deletion_sla_hours | Compliance risk | >72h or internal target breach |
| pii_stage_p95_ms | Hot-path latency regression | >15ms |
| mid_confidence_review_volume | Ambiguity trend | unusual jump suggests drift |
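
The rates in the table are typically derived from raw counts. A minimal emission sketch using CloudWatch; the namespace and metric names are illustrative:

import boto3

cloudwatch = boto3.client("cloudwatch")

def emit_pii_stage_metrics(findings: list[dict], stage_ms: float, blocked: int) -> None:
    cloudwatch.put_metric_data(
        Namespace="MangaAssist/PII",  # hypothetical namespace
        MetricData=[
            {"MetricName": "pii_findings_count", "Value": len(findings), "Unit": "Count"},
            {"MetricName": "pii_blocked_count", "Value": blocked, "Unit": "Count"},
            {"MetricName": "pii_stage_latency", "Value": stage_ms, "Unit": "Milliseconds"},
        ],
    )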

Alert Design

Use alerts that distinguish true privacy incidents from noisy detector activity:

  • high severity: PII in final response, cross-user leakage, failed deletion workflow
  • medium severity: detector recall drop, guest-policy leakage, audit evidence gaps
  • low severity: rising false positives, latency regression, review backlog growth

Test Strategy

| Test Layer | What It Covers | Example Cases |
|---|---|---|
| Unit tests | regex, scorer, routing | email, phone, character-name overrides |
| Integration tests | dataflow correctness | address should not appear in logs or history |
| Adversarial tests | obfuscation and prompt-induced PII | john [at] gmail [dot] com, zero-width chars |
| Multi-locale tests | locale-specific coverage | JP postal, DE address, US SSN |
| Regression tests | domain false positives | Gojo Satoru, Attack on Titan, ASINs |
| Canary monitoring | live safety before full rollout | compare near-miss rate and block rate |

Deep-Dive Test Cases Worth Having

  1. Obfuscated PII: alex [at] example [dot] com
  2. Mixed script text: full-width digits, Japanese address fragments
  3. Character name vs real customer name in similar syntax
  4. Same PII passed through authorized and unauthorized intents
  5. FM-generated fake contact details in response prose
  6. Deletion request for a user with data across every storage layer
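
Several of these translate directly into unit tests. Minimal pytest-style sketches against the scanners defined earlier; the allowlist assumption is noted inline:

def test_obfuscated_email_is_detected():
    findings = scan_regex("reach me at alex [at] example [dot] com")
    assert any(f["type"] == "obfuscated_email" for f in findings)

def test_zero_width_characters_cannot_hide_an_email():
    # The zero-width space is stripped during normalization, so the
    # plain email pattern fires on the normalized text.
    normalized, _ = normalize_for_detection("alex\u200b@example.com")
    findings = scan_regex(normalized)
    assert any(f["type"] == "email" for f in findings)

def test_character_allowlist_overrides_person_name_score():
    finding = {"type": "person_name", "value": "Gojo Satoru",
               "base_confidence": 0.86, "sources": {"ner"}}
    text = "recommend manga with Gojo Satoru"
    # Assumes the catalog allowlist contains the character name,
    # forcing the final score down to the 0.10 override.
    assert calculate_confidence(finding, text, "en-US") <= 0.10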

Failure Modes and Tradeoffs

| Decision | What We Chose | Alternative | Upside | Downside |
|---|---|---|---|---|
| Detection timing | Scan before persistence | Scrub after storage | Prevents initial leak | Synchronous hot-path work |
| Confidence routing | Multi-tier thresholds | Single redact threshold | Better precision/recall balance | More policy complexity |
| Name handling | Allowlist + context + NER tuning | NER only | Fast false-positive reduction | Allowlist maintenance |
| Guest policy | Lower threshold + short TTL | Same as authenticated | Better privacy for unattributable data | Some guest flows become more limited |
| Address handling | Ephemeral pass-through only | Store encrypted for reuse | Strong minimization | User may need to re-enter later |
| Deletion workflow | Central orchestrator + registry | Manual deletions | Audit-ready and repeatable | Engineering investment |

Residual Risks

Even after all of the above, some risks remain:

  • semantic PII that looks harmless to detectors
  • future locale formats not yet covered
  • prompt changes that induce the FM to generate contact-like strings
  • derived datasets accidentally created outside the lineage map

The mitigation pattern is not "build one better detector." It is:

  • layered controls
  • observability
  • explicit ownership of every persistence boundary
  • continuous negative testing

Follow-Up Questions and Deep-Dive Answers

These are the questions a strong interviewer, reviewer, or principal engineer will ask after the base privacy design looks reasonable.

Q1. Why is authorization kept separate from the confidence score instead of blending them into one number?

Answer: Because they represent different facts. Confidence is a classification estimate: "How likely is this span to be sensitive?" Authorization is a policy decision: "Is this component allowed to see that sensitive field for this use case?" If you blend them, you create ambiguous behavior. A shipping address used for a delivery estimate might be highly sensitive with confidence 0.94, but still authorized for one API call. Lowering the score just because it is authorized would hide the fact that it is sensitive, break audit quality, and make downstream analysis harder. The clean design is to keep the high score, route storage through redaction, and allow only one ephemeral approved use.

Q2. Why not replace regex, custom rules, and allowlists with one stronger LLM or one larger NER model?

Answer: Because the job is not only recognition accuracy. It is low-latency, deterministic enforcement. Regex and rules are cheap, explainable, and reliable for structured patterns. The NER model handles names and addresses, but it is probabilistic and domain-sensitive. Allowlists handle known fictional entities that would otherwise create repeated false positives. A single-model design looks elegant but performs poorly on three dimensions that matter in production: latency, debuggability, and exact control over business-specific identifiers. Layering is not technical debt here. It is the operationally correct architecture.

Q3. How do you prevent overlap bugs such as double redaction or conflicting actions for the same text span?

Answer: The pipeline must canonicalize findings before routing. That means sorting findings by span, merging overlapping ranges, carrying forward the strongest conservative type, and recording all supporting sources for agreement bonuses and later audit. Routing decisions happen only on canonical findings, never on raw detector outputs. Without this step, one scanner can say mask, another can say redact, and the rendering layer ends up corrupting the message or applying inconsistent policy. The overlap merger is a low-complexity component, but it is essential.

Q4. How do you scale NER without turning privacy into the latency bottleneck?

Answer: Keep as much as possible in-process and deterministic, and reserve the model for what only the model can do. Regex and custom rules should handle all structured cases. NER should run on a warm real-time endpoint with strict entity filtering and a narrow label set. In addition, monitor the distribution of text lengths and consider early exits for short strings that only contain structured patterns already handled by regex. The latency budget should be explicit and enforced, because privacy that only works at low traffic is not production privacy.

Q5. How do you know a privacy regression is real and not just a change in user behavior?

Answer: You need both rate metrics and trace-level evidence. A metric such as pii_near_miss_rate tells you that the model or downstream tools are generating more sensitive content. But that by itself does not prove leakage. Pair it with trace sampling: inspect what was detected, where it was blocked, and whether the final delivered response still contained sensitive strings. For false positives, inspect mid-confidence review volume and domain-specific error slices such as character-name queries. The combination of aggregate metrics and trace slices is what distinguishes a real regression from traffic mix noise.

Q6. How would you test obfuscated PII and multi-locale inputs without breaking legitimate Japanese text?

Answer: Normalize for detectors, not for the final user-visible text. That means the detector path can collapse zero-width characters, standardize separators, and map homoglyphs where appropriate, while the original text and an offset map are preserved for precise redaction. The test suite should include multilingual examples, full-width digits, Japanese addresses, German street formats, obfuscated emails, and benign manga terms that should survive intact. The goal is not to normalize everything globally. The goal is to normalize enough for classification while preserving user fidelity and exact span control.

Q7. How do you delete data from analytics or training systems that are not simple key-value stores?

Answer: You need a documented policy per system. For analytics, row-level deletion is ideal when practical; otherwise anonymization can be acceptable if it truly breaks attribution and is documented in policy. For training exports, the deletion registry is critical. Once a user is deleted, future export jobs must exclude that identifier. If the training artifact has already been produced, you need a purge or regeneration rule. The key is to decide this upfront and encode it into the lineage map. If a system cannot answer "How do we delete or de-identify this user's data?" it is not production-ready.

Q8. Why keep a deletion registry for one year if the user asked to be deleted?

Answer: Because the registry is not the user's data in the product sense; it is compliance control metadata. Its purpose is to prevent reintroduction and to prove that a request was processed. The registry should contain the minimum necessary form, typically a hashed customer identifier, request metadata, timestamps, and outcome status. It should not contain the deleted content itself. This is an example of the difference between retaining business data and retaining control-plane evidence.

Q9. Why not forbid all guest messages that contain PII instead of using a lower threshold and shorter TTL?

Answer: Because privacy and usability both matter. Guests still ask legitimate shopping questions, and many paste unnecessary personal details accidentally. A hard-block-only design causes friction without improving security proportionally. The better design is to redact aggressively, keep retention minimal, restrict downstream account actions, and steer the user toward sign-in when account-specific help is needed. That preserves privacy while still allowing useful guest interactions like catalog discovery or general shipping policy questions.

Q10. What prevents the character allowlist from becoming a bypass where attackers choose names that look like allowed content?

Answer: The allowlist should be scoped narrowly and used only as one signal, not as unconditional trust. It should be derived from the catalog, refreshed automatically, and applied primarily to person_name findings in recommendation-style contexts. If the same string appears in account or shipping context, the context boost should outweigh the content hint and the finding should stay sensitive. Also monitor collisions: if a real customer name overlaps a popular character name, the audit and review path should surface that pattern. The point is to reduce domain false positives, not to create a universal privacy exemption.

Q11. If MangaAssist adds voice input with streaming transcripts, what changes in the privacy architecture?

Answer: The main change is granularity. Detection can no longer wait for the full message; it has to operate on partial transcript windows while still supporting correction as ASR hypotheses stabilize. That means temporary findings may need to be revised, and the UI should avoid exposing raw partial transcripts to logs. The architecture remains the same conceptually: detection before persistence, ephemeral approved use, redacted storage, response-side filtering. But the implementation gets more complex because timing and transcript revision become part of the privacy surface.

Q12. What is the hardest residual failure mode even after implementing all of this?

Answer: The hardest residual risk is semantically sensitive data that does not look like classic PII and only becomes risky when combined across turns or systems. For example, a sequence of benign-seeming utterances can reveal enough to identify a person or infer account ownership. That is why the long-term control is not only span detection. It is also session-level policy, bounded context windows, strong access control to downstream tools, and careful decisions about what history is retained at all.


Key Lessons

  1. Privacy controls must run before persistence, not after.
  2. Confidence-based routing is useful only if it is separated from authorization policy.
  3. Domain false positives are a product risk, not just an ML annoyance.
  4. Output-side privacy filtering is mandatory because the FM can still generate sensitive content.
  5. Deletion is an architecture capability, not a support-team procedure.
  6. Guest sessions deserve stricter defaults because their data is harder to govern later.
  7. The strongest privacy systems are layered, observable, and explicit about every persistence boundary.

Cross-References