
AI Safety, Security, and Governance

AIP-C01 Content Domain 3 Coverage

This folder is a new deep-dive pack for Content Domain 3: AI Safety, Security, and Governance. It is grounded in the same MangaAssist system used throughout this repository: an Amazon-style chatbot for the JP Manga storefront that supports product discovery, FAQ answers, order help, returns guidance, and human escalation.

The goal is not to repeat generic AWS service descriptions. Each document treats the skill as something we would actually implement for this chatbot, with:

  • a user story tied to a real owner
  • concrete production scenarios
  • design decisions using AWS-native services
  • measurable acceptance criteria
  • risks, tradeoffs, and interview-ready intuition

Grounding Assumptions

Everything in this pack assumes the architecture and product scope from:

This pack also complements the earlier deep-dive security material in Security-Privacy-Guardrails/README.md. That folder is topic-oriented. This one is exam-skill-oriented and explicitly maps to the Domain 3 skill statements.

Task-to-Folder Mapping

| Task | Folder | What It Covers |
|------|--------|----------------|
| 3.1 | 01-input-output-safety-controls/ | Harmful input filtering, safe output generation, hallucination controls, layered safety, adversarial defense |
| 3.2 | 02-data-security-privacy-controls/ | Protected FM environments, privacy-preserving interactions, anonymization and masking |
| 3.3 | 03-ai-governance-compliance/ | Compliance evidence, traceability, governance operating model, continuous controls |
| 3.4 | 04-responsible-ai-principles/ | Transparency, fairness, and policy-aligned FM behavior |

Skill-to-File Mapping

| Skill | File | Core Scenario |
|-------|------|---------------|
| 3.1.1 | 01-harmful-input-safety-systems.md | Harmful or abusive user prompts entering the chat pipeline |
| 3.1.2 | 02-harmful-output-safety-frameworks.md | Preventing toxic, unsafe, or policy-violating model responses |
| 3.1.3 | 03-accuracy-verification-hallucination-control.md | Reducing hallucinations on catalog, policy, and shipping answers |
| 3.1.4 | 04-defense-in-depth-safety-architecture.md | Multi-layer safety stack across pre-, in-, and post-generation stages |
| 3.1.5 | 05-adversarial-threat-detection.md | Prompt injection, jailbreaks, obfuscation, and adversarial testing |
| 3.2.1 | 01-protected-ai-environments.md | Private FM deployment for order and profile data |
| 3.2.2 | 02-privacy-preserving-fm-interactions.md | PII detection, output filtering, and retention control during chat |
| 3.2.3 | 03-privacy-focused-ai-systems.md | Preserving utility while masking or anonymizing user data |
| 3.3.1 | 01-compliance-frameworks-for-fm-deployments.md | Building audit-ready evidence for FM deployments |
| 3.3.2 | 02-data-source-traceability.md | Tracking which source fed which FM answer |
| 3.3.3 | 03-organizational-governance-systems.md | Org-wide review, escalation, and release controls for GenAI |
| 3.3.4 | 04-continuous-monitoring-governance-controls.md | Ongoing detection of misuse, drift, and policy violations |
| 3.4.1 | 01-transparent-ai-systems.md | Showing evidence, confidence, and reasoning traces responsibly |
| 3.4.2 | 02-fairness-evaluations.md | Evaluating uneven model behavior across user groups and languages |
| 3.4.3 | 03-policy-compliant-ai-systems.md | Enforcing responsible AI policies in runtime and deployment |
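To make the 3.2.2/3.2.3 scenarios concrete, here is a minimal sketch of regex-based PII masking for chat text. The patterns, tag names, and the order-ID format are illustrative assumptions, not the repository's actual detection rules; a production build would more likely rely on a managed detector such as Amazon Comprehend PII detection or Bedrock Guardrails sensitive-information filters.

```python
import re

# Illustrative patterns only — stand-ins for a real PII detection service.
# ORDER_ID is listed first so an order number is not partially matched
# as a phone number before its own rule runs.
PII_PATTERNS = {
    "ORDER_ID": re.compile(r"\b\d{3}-\d{7}-\d{7}\b"),  # assumed Amazon-style format
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3,4}[-.\s]?\d{4}\b"),
}

def mask_pii(text: str) -> str:
    """Replace detected PII spans with typed placeholders before the text
    reaches the FM or a transcript store, preserving utility for support."""
    for tag, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{tag}]", text)
    return text

print(mask_pii("My order 123-4567890-1234567 shipped to jane@example.com"))
# → My order [ORDER_ID] shipped to [EMAIL]
```

Typed placeholders (rather than plain redaction) keep the masked transcript useful for analytics and agent handoff, which is the utility-versus-privacy tradeoff 3.2.3 explores.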
Suggested Reading Order

  1. Start with Task 3.1 because it defines the runtime safety envelope.
  2. Move to Task 3.2 to protect the data that powers those flows.
  3. Read Task 3.3 to understand how the organization proves control to auditors and leadership.
  4. Finish with Task 3.4 to connect safety and compliance back to responsible AI behavior seen by customers.

What Good Looks Like Across Domain 3

  • Unsafe input is filtered without blocking legitimate shopping help.
  • Unsafe output is prevented before the customer ever sees it.
  • High-risk answers are grounded in approved sources and can abstain when confidence is low.
  • Sensitive data stays in tightly controlled environments with short retention and clear access boundaries.
  • Every important answer can be traced back to a source, a policy, and a model version.
  • Governance is operational, not decorative: releases, drift, and incidents all have owners.
  • Customers see explanations, evidence, and consistent policy behavior without exposing internal prompts.
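The first three bullets describe a layered runtime pipeline: filter input before generation, answer only from approved sources with the option to abstain, then filter output before the customer sees it. A minimal sketch, using hypothetical keyword lists and a stub retriever in place of the real services (in the MangaAssist design each layer would call a managed guardrail, e.g. Amazon Bedrock Guardrails):

```python
# Hypothetical sketch of the layered safety envelope described above.
# Keyword lists and the confidence floor are stand-ins, not production rules.
BLOCKED_INPUT_TERMS = {"jailbreak", "ignore previous instructions"}
BLOCKED_OUTPUT_TERMS = {"internal prompt"}
CONFIDENCE_FLOOR = 0.7  # abstain below this grounding confidence

def safe_answer(user_text, retrieve):
    """retrieve(query) -> (answer, confidence); confidence reflects how
    well the answer is grounded in approved sources."""
    lowered = user_text.lower()
    # Layer 1: pre-generation input filtering
    if any(term in lowered for term in BLOCKED_INPUT_TERMS):
        return "Sorry, I can't help with that request."
    # Layer 2: grounded generation with abstention on low confidence
    answer, confidence = retrieve(user_text)
    if confidence < CONFIDENCE_FLOOR:
        return "I'm not sure — let me connect you with a human agent."
    # Layer 3: post-generation output filtering
    if any(term in answer.lower() for term in BLOCKED_OUTPUT_TERMS):
        return "Sorry, I can't share that."
    return answer

# Stub retriever that is only confident about shipping FAQs
def stub_retrieve(query):
    if "shipping" in query.lower():
        return ("Standard shipping takes 3-5 business days.", 0.9)
    return ("", 0.0)

print(safe_answer("How long does shipping take?", stub_retrieve))
# → Standard shipping takes 3-5 business days.
```

The point of the sketch is the ordering: each layer can short-circuit to a safe response, so an unsafe output is never the customer-visible default.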