Zones of Distrust — AI Agent Security Architecture
Explore the seven-layer defense architecture for autonomous AI agents. Open-source, vendor-neutral, built for production.
Landscape — Where Do the Zones of Distrust Live?
Zones of Distrust (ZoD) integrates with the AI security landscape across five categories:
IDENTIFY: Threat Taxonomies
OWASP LLM Top 10, OWASP ASI, MITRE ATLAS, MITRE ATT&CK v18, Cisco AI Defense, CSA MAESTRO. Threat taxonomies tell you what can go wrong. ZoD turns each cataloged threat into enforceable defenses: screening rules (L2), policy constraints (L4), and detection signatures (L6).
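As a rough illustration of turning one cataloged threat into controls at L2, L4, and L6, the mapping can be sketched as a small record type. This is a minimal sketch, not ZoD's actual schema: the `ThreatControls` class, its field names, and the rule identifiers are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreatControls:
    """Controls derived from one cataloged threat (hypothetical structure)."""

    threat: str
    screening_rules: tuple = ()       # enforced at L2, Input Control
    policy_constraints: tuple = ()    # enforced at L4, Request Validation
    detection_signatures: tuple = ()  # enforced at L6, Continuous Monitoring

    def uncovered_layers(self):
        """Return the layers that have no control for this threat yet."""
        layers = {
            "L2": self.screening_rules,
            "L4": self.policy_constraints,
            "L6": self.detection_signatures,
        }
        return [name for name, controls in layers.items() if not controls]


# Rule names below are invented placeholders, not real ZoD rules.
prompt_injection = ThreatControls(
    threat="OWASP LLM01 Prompt Injection",
    screening_rules=("flag-instruction-override-phrases",),
    policy_constraints=("deny-tool-calls-outside-task-scope",),
)

print(prompt_injection.uncovered_layers())  # ['L6']
```

A coverage check like `uncovered_layers` makes gaps visible: here the threat has screening and policy controls but no detection signature yet.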
GOVERN: Lifecycle & Compliance
Google SAIF, NIST AI RMF, EU AI Act, SOC 2, ISO 27001, ISO/IEC 42001, WHO Digital Health Guidelines, NIST 1800-39, Treasury FS AI RMF. Lifecycle frameworks tell you when to apply controls. ZoD provides the structure where requirements are implemented.
Treasury FS AI RMF: 230 control objectives for financial services. Treasury built the checklist. Vendors built the tools. Nobody built the skeleton. ZoD provides the enforcement architecture — how those controls are implemented with cryptographic validation.
DEPLOY: Point Solutions
Microsoft Entra Agent ID, CyberArk, Aembit, Zenity, Lakera, Cisco AI Defense, OpenAI Lockdown Mode, Palo Alto Unit 42. Point solutions are valuable, but without an architecture connecting them, bypassing one leaves a gap that the others do not cover.
CONNECT: Discovery & Interop
OWASP ANS, Google A2A, Anthropic MCP, IBM ACP. Discovery protocols handle how agents find each other. ZoD handles what happens after.
VERIFY: Trust Infrastructure
X.509/PKI, OAuth 2.0/OIDC, mTLS, DID/VCs. Trust infrastructure provides the cryptographic primitives ZoD consumes.
Seven-Layer Architecture
Core design principle: Assume the agent is already compromised. Every layer functions even when the agent works against its own controls.
- L1 — OS Foundation: Agent identity, credential brokering, PKI infrastructure.
- L2 — Input Control: Adversarial screening and safety signal detection. Assumed bypassable.
- L3 — Cognitive Isolation: Reasoning separated from execution. The agent can think but cannot act.
- L4 — Request Validation (CA): Parameter-bound tokens, semantic policy enforcement, chain-of-custody validation.
- L5 — Execution: Isolated execution with immutable logging. The integrity channel.
- L6 — Continuous Monitoring: Behavioral baselines, drift detection, cross-layer correlation. External to the agent.
- L7 — Human Governance: Risk-weighted escalation, policy authority, break-glass procedures.
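The split between L3, L4, and L5 can be sketched with a parameter-bound token: the agent only proposes a request, a validator approves it and binds a token to the exact parameters, and the executor refuses anything whose parameters no longer match. This is a minimal sketch under stated assumptions, not the ZoD specification: it uses a shared-secret HMAC for brevity, and the key, the function names, and the toy transfer policy are all hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical key held by the L4 validator and L5 executor, never by the agent.
VALIDATOR_KEY = b"demo-key-not-for-production"


def canonical(action, params):
    """Serialize the request deterministically so the MAC covers exact parameters."""
    payload = {"action": action, "params": params}
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def policy_allows(action, params):
    """Toy L4 policy (an assumption): transfers up to 1000 are approved."""
    amount = params.get("amount")
    return (
        action == "transfer"
        and isinstance(amount, (int, float))
        and 0 < amount <= 1000
    )


def issue_token(action, params):
    """L4: return a token bound to these parameters, or None if policy denies."""
    if not policy_allows(action, params):
        return None
    return hmac.new(VALIDATOR_KEY, canonical(action, params), hashlib.sha256).hexdigest()


def execute(action, params, token):
    """L5: accept the request only if the token matches its exact parameters."""
    if token is None:
        return False
    expected = hmac.new(VALIDATOR_KEY, canonical(action, params), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token)


# L3: the agent can only propose; it never holds VALIDATOR_KEY.
request = {"to": "acct-123", "amount": 500}
token = issue_token("transfer", request)

print(execute("transfer", request, token))   # True: parameters match the token
tampered = {"to": "acct-999", "amount": 500}
print(execute("transfer", tampered, token))  # False: destination was changed
print(issue_token("transfer", {"to": "acct-123", "amount": 50000}))  # None: policy denies
```

Because the token covers the serialized parameters, a compromised agent that obtains approval for one request cannot reuse it for a different destination or amount. A production design would likely add expiry, a nonce to prevent replay, and asymmetric signatures so the executor cannot mint tokens itself; those are omitted here.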
Attack Scenarios
Scenarios include: Wire Transfer Fraud, AI Voice Deepfake Ransom, AI Companion Safety Gap, Memory Poisoning, Multi-Agent Privilege Escalation, Malicious MCP Tool, Living Off the AI, Insider Policy Abuse, Shadow AI Agent, Reasoning Chain Attack, and Distillation Extraction Attack.
© 2026 BluVi. Dual-licensed: CC BY 4.0 (specs/docs) and Apache 2.0 (code).