AI Safety, Security, and Governance
OWASP GenAI Top-10 mapping, prompt-injection / system-prompt-leak / PII detection, action-confirmation gates, and the difference between guardrails (heuristic, fast) and red-team suites (DeepTeam / Promptfoo).
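As a rough illustration of what a fast heuristic guardrail with an OWASP mapping might look like, here is a minimal sketch. The patterns, the `Finding` type, and the `scan` and `check` functions are all hypothetical names invented for this example, not the project's actual code, and the regexes are far too narrow for production use. The optional `llm_detector` argument stands in for any slower learned detector, so the regex pass acts as the cheap first stage.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical, deliberately tiny pattern set keyed by OWASP GenAI 2025 ID.
PATTERNS = {
    "LLM01": [  # Prompt Injection
        re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    ],
    "LLM02": [  # Sensitive Information Disclosure (PII-like strings)
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),          # US SSN-shaped number
        re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),   # email-shaped string
    ],
    "LLM07": [  # System Prompt Leakage
        re.compile(
            r"(reveal|show|print|repeat) (me )?(your |the )?(system|initial) prompt",
            re.I,
        ),
    ],
}


@dataclass(frozen=True)
class Finding:
    category: str  # OWASP ID such as "LLM01"
    matched: str   # the text span that triggered the rule


def scan(text: str) -> List[Finding]:
    """Heuristic pass: return one finding per pattern that matches."""
    findings = []
    for category, patterns in PATTERNS.items():
        for pattern in patterns:
            match = pattern.search(text)
            if match:
                findings.append(Finding(category, match.group(0)))
    return findings


def check(
    text: str,
    llm_detector: Optional[Callable[[str], List[Finding]]] = None,
) -> List[Finding]:
    """Run the cheap regex pass first; call the slower detector only if it is clean."""
    findings = scan(text)
    if findings or llm_detector is None:
        return findings
    return llm_detector(text)
```

Calling `check(user_text)` with no detector gives the regex-only behaviour; passing a callable adds the second stage without changing the first. A red-team suite would then exercise both stages from the outside rather than replace them.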
Interview talking points
- What's the action-confirmation gate? Cart, order, and account mutations never auto-execute on voice or chat: the assistant restates the action and waits for an explicit confirmation. Maps to LLM06 (Excessive Agency).
- Heuristic vs learned guardrails. Regex catches the obvious; LLM-based detector catches the long tail; you need both, with the heuristic as the cheap first pass.
- Cross-family judging for safety eval. Same rule as the quality eval: the judge model must come from a different model family than the model under test, never the same family.
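- OWASP GenAI 2025 categories you've covered. LLM01 (Prompt Injection), LLM02 (Sensitive Information Disclosure), LLM06 (Excessive Agency), and LLM07 (System Prompt Leakage) are covered with regex; LLM03, LLM04, LLM05, LLM08, LLM09, and LLM10 are the Phase-7 list.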
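The action-confirmation gate from the first talking point can be sketched as a small state holder. This is an assumption-laden illustration, not the project's implementation: `MUTATING_INTENTS`, `PendingAction`, `ConfirmationGate`, and the `execute` callback are hypothetical names, and the accepted confirmation phrases are a placeholder for whatever the real voice or chat layer recognises.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical intents that change cart, order, or account state.
MUTATING_INTENTS = {"add_to_cart", "place_order", "update_account"}


@dataclass
class PendingAction:
    intent: str
    summary: str  # human-readable restatement, e.g. "place an order for 2 items"


@dataclass
class ConfirmationGate:
    execute: Callable[[str], str]            # hypothetical executor for an intent
    pending: Optional[PendingAction] = None  # at most one action awaits confirmation

    def propose(self, intent: str, summary: str) -> str:
        """Run read-only intents directly; park mutating ones and restate them."""
        if intent not in MUTATING_INTENTS:
            return self.execute(intent)
        self.pending = PendingAction(intent, summary)
        return f"You want to {summary}. Should I go ahead?"

    def reply(self, user_text: str) -> str:
        """Execute the parked action only on an explicit confirmation."""
        if self.pending is None:
            return "There is nothing waiting for confirmation."
        action, self.pending = self.pending, None  # one-shot: clear either way
        if user_text.strip().lower() in {"yes", "confirm", "go ahead"}:
            return self.execute(action.intent)
        return f"Okay, I will not {action.summary}."
```

The key property is that a mutating intent can only reach `execute` through `reply`, so a model that proposes an order on its own cannot complete it without a user turn in between. Anything other than an explicit confirmation cancels the pending action rather than leaving it armed.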
Files in this folder
| File | Title |
|---|---|
| README.md | AI Safety, Security, and Governance |