AI Safety, Security, and Governance

OWASP GenAI Top-10 mapping, prompt-injection / system-prompt-leak / PII detection, action-confirmation gates, and the difference between guardrails (heuristic, fast) and red-team suites (DeepTeam / Promptfoo).
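
As a rough illustration of the heuristic side, the sketch below tags input text with OWASP GenAI 2025 category IDs using a few regexes and only falls through to a slower detector when the cheap pass finds nothing. The patterns, the `Finding` type, and the `llm_detector` callback are all assumptions made for this example, not the project's actual rules.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Pattern

# Illustrative patterns only (assumed for this sketch); a real rule set
# would be far broader and tested against a red-team suite.
HEURISTIC_RULES: Dict[str, List[Pattern[str]]] = {
    "LLM01": [  # Prompt Injection
        re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    ],
    "LLM02": [  # Sensitive Information Disclosure (PII-shaped strings)
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),          # US SSN-shaped number
        re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),   # email address
    ],
    "LLM07": [  # System Prompt Leakage
        re.compile(r"(reveal|show|print|repeat) (me )?(your|the) system prompt", re.I),
    ],
}


@dataclass
class Finding:
    category: str  # OWASP GenAI category ID, e.g. "LLM01"
    matched: str   # the substring that triggered the rule


def heuristic_scan(text: str) -> List[Finding]:
    """Cheap first pass: report the first match of every rule that fires."""
    findings = []
    for category, patterns in HEURISTIC_RULES.items():
        for pattern in patterns:
            match = pattern.search(text)
            if match:
                findings.append(Finding(category, match.group(0)))
    return findings


def check(
    text: str,
    llm_detector: Optional[Callable[[str], List[Finding]]] = None,
) -> List[Finding]:
    """Run the regex pass; call the (hypothetical) slower detector only on a miss."""
    findings = heuristic_scan(text)
    if findings or llm_detector is None:
        return findings
    return llm_detector(text)
```

Calling `check(user_text)` with no detector runs the regex pass alone; passing a callable as `llm_detector` adds the second tier for inputs the regexes miss.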

Interview talking points

What's the action-confirmation gate? Cart, order, and account mutations never auto-execute on voice or chat: the assistant restates the action and waits for explicit confirmation. Maps to LLM06 (Excessive Agency).
Heuristic vs learned guardrails. Regex catches the obvious cases; an LLM-based detector catches the long tail. You need both, with the heuristic as the cheap first pass.
Cross-family judging for safety eval. Same rule as the quality eval: the judge model never comes from the same model family as the model under test.
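OWASP GenAI 2025 categories you've covered. LLM01 (Prompt Injection), LLM02 (Sensitive Information Disclosure), LLM06 (Excessive Agency), and LLM07 (System Prompt Leakage) with regex; LLM03, LLM04, LLM05, LLM08, LLM09, and LLM10 are the Phase-7 list.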
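
The action-confirmation gate from the first talking point could look something like the sketch below: a mutating action is parked as pending, restated to the user, and executed only after an explicit "yes". The action names, the `ConfirmationGate` class, and the `execute` callback are hypothetical and stand in for whatever tool layer the assistant actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical action names; anything listed here requires confirmation.
MUTATING_ACTIONS = {"add_to_cart", "place_order", "update_account"}


@dataclass
class PendingAction:
    name: str
    args: dict


@dataclass
class ConfirmationGate:
    execute: Callable[[str, dict], str]      # assumed tool-execution callback
    pending: Optional[PendingAction] = None  # at most one action awaits confirmation

    def propose(self, name: str, args: dict) -> str:
        """Run read-only actions directly; park mutations and restate them."""
        if name not in MUTATING_ACTIONS:
            return self.execute(name, args)
        self.pending = PendingAction(name, args)
        return f"You asked to {name.replace('_', ' ')} with {args}. Say 'yes' to confirm."

    def confirm(self, user_reply: str) -> str:
        """Execute the pending action only on an explicit affirmative reply."""
        action, self.pending = self.pending, None  # a pending action is single-use
        if action is None:
            return "Nothing to confirm."
        if user_reply.strip().lower() not in {"yes", "confirm"}:
            return "Cancelled."
        return self.execute(action.name, action.args)
```

Clearing `pending` before checking the reply means a stale confirmation cannot replay an earlier action, which keeps the model's agency bounded in the LLM06 sense.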

Files in this folder

File	Title
README.md	AI Safety, Security, and Governance
