What is Zones of Distrust (ZoD)?

Zones of Distrust is BluVi's open-source commitment to protecting humans. It's an AI security and safety architecture that provides seven-layer defense-in-depth for autonomous AI agents — because bad actors exist, unintended consequences happen, and guardrails aren't optional.

How is ZoD different from existing AI security frameworks?

Existing approaches (OWASP, MITRE, NIST, Cisco AI Defense, Microsoft Entra) catalog threats, deploy guardrails, and monitor agent behavior, but still trust the agent to cooperate with their security controls. ZoD architecturally separates reasoning from execution so a compromised agent cannot access or circumvent its own constraints.

Does ZoD replace Zero Trust or OWASP?

No. ZoD complements existing frameworks. It is the enforcement layer those frameworks assume exists but never define. Use OWASP for threat identification, NIST for governance, and ZoD for architectural enforcement.

Does ZoD comply with EU AI Act requirements?

Yes. ZoD maps to EU AI Act Article 14 (human oversight) and Article 26 (deployer obligations). L7 provides risk-weighted escalation and multi-reviewer verification for high-risk systems. L5 provides transparency through immutable logging with decision context for explainability. L3 provides robustness through cognitive isolation.

What is the current status of ZoD?

ZoD is in Open RFC status (v0.9) as of February 2026. The architecture specification and whitepaper are in draft, and the project is accepting contributions from security practitioners, AI researchers, and adversarial thinkers.

Who created Zones of Distrust?

ZoD was created by BluVi and is stewarded as an open-source project. It is dual-licensed under CC BY 4.0 for specifications and Apache 2.0 for code.

What are the seven layers of Zones of Distrust?

The seven layers are: L1 OS Foundation (agent identity and process isolation), L2 Input Control (adversarial screening), L3 Cognitive Isolation (reasoning separated from execution), L4 Request Validation CA (independent semantic policy and token binding), L5 Execution (isolated action execution with immutable logging), L6 Continuous Monitoring (behavioral baselining and drift detection), and L7 Human Governance (risk-weighted escalation and policy control).

Can ZoD be implemented with existing infrastructure?

Yes. The core enforcement boundary (Layers 3-5) can be implemented today using containers, policy engines, and token-gated execution proxies. A Kubernetes reference pattern is documented in the implementation guide. Full Layer 1 OS-native capabilities require platform vendor evolution.

Zones of Distrust (ZoD) — Open-Source AI Security & Safety Architecture

THE PROBLEM

AI agents can reason, plan, and act.
When one gets compromised, it doesn't
just leak data — it takes actions.

Sends money. Changes systems. Makes decisions. All with its own authorized permissions, at machine speed, across every connected system.

Traditional compromise = data loss. Agent compromise = autonomous action at scale using its own authorized permissions. Unauthorized transactions, privilege escalation through persuasion, cascading failures across connected systems.

Want to see this in action? Real attack scenarios showing where current approaches stop and what happens without enforcement.

Explore attack scenarios →

THE GAP

The architecture wasn't built for
an AI agent that is compromised.

OWASP, MITRE, NIST, Cisco, and the major vendors catalog threats, deploy guardrails, and monitor agent behavior.

They still trust the agent to cooperate with their own security controls.

Capability Traditional
Zero Trust Prompt
Filters Credential
Proxy (IAM) Agent Identity
Products ZoD

Pre-reasoning input screening — ✓ — — ✓

Reasoning / execution separation — — — ~ ✓

Short-lived credential management ~ — ✓ ✓ ✓

Agent identity and lifecycle — — ~ ✓ ✓

Continuous behavioral monitoring ~ — — ~ ✓

Risk-weighted human escalation ~ — — — ✓

Insider / policy-author defense ~ — — ~ ✓

Parameter-bound token enforcement — — ~ — ✓

Semantic intent policy enforcement — — — — ✓

Memory audit (external, systematic) — — — — ✓

Behavioral baseline integrity — — — — ✓

Immutable integrity signal channel — — — — ✓

Cross-layer correlation detection — — — — ✓

Multi-agent chain-of-custody — — — ~ ✓

Formal degradation model — — — — ✓

✓ = core capability ~ = partial/ad-hoc — = absent | Source: ZoD Cross-Reference Mapping

See where ZoD enforcement begins →

THE ARCHITECTURE

The Seven-Layer Architecture

Because the agent that reasons should never be the process that executes privileged actions. Each layer assumes the layer above it has already been compromised.

OS Foundation

Attacker targets agent credentials and process isolation

Agent identity, process isolation, credential brokering, model provenance attestation. The agent never sees raw secrets — credentials are brokered at the OS level.

Input Control

Attacker poisons input — emails, docs, API responses, prompt injection, context window manipulation

Adversarial screening for prompt injection, persuasion, context window manipulation, and embedded instructions. But screening isn't perfect — so every layer below assumes it already failed.

Cognitive Isolation

Compromised agent tries to execute privileged actions directly

Agent can reason but cannot act. Reasoning and execution are separate processes with a hard boundary. All outputs are proposals, not commands.

Request Validation (CA)

Agent crafts a valid-looking request to misuse its own authorized tools

Independent Certificate Authority evaluates against scope, baseline, risk, and semantic intent. Token-bound, parameter-locked, single-use. The agent can't learn its own constraints.

Execution

Validated action gets tampered with during execution

Isolated process performs validated actions. Agent has no access. Cryptographically signed results with immutable provenance. Every action logged immutably.

Continuous Monitoring

Subtle behavioral drift goes undetected across sessions

Behavioral baselining, drift detection, memory audit, baseline integrity verification. No layer reports its own status — the integrity channel is independent and immutable.

Human Governance

High-risk action should require human judgment

Risk-weighted escalation — not optional oversight. High-risk actions physically cannot execute without human authorization. Policy flows down through the entire stack.

ZoD is an architectural model, not a latency guarantee. Implementations may collapse or optimize layers, but the enforcement boundaries remain conceptually distinct. The minimum safe configuration requires L3–L5 separation, token-bound execution, and immutable logging.

Ready to go deeper? Full specification — threat model, layer interactions, degradation modes, trust assumptions, and framework cross-reference mapping.

Read the draft spec on GitHub →

WHERE ZoD FITS

It completes the landscape.
It doesn't compete with it.

ZoD is not a replacement for OWASP, MITRE, or NIST. It's the enforcement layer those frameworks assume exists but never define.

IDENTIFY

Threat Taxonomies

OWASP, MITRE ATLAS, ATT&CK, Cisco, CSA MAESTRO

GOVERN

Lifecycle & Compliance

NIST AI RMF, ISO 42001, EU AI Act, SOC 2, Google SAIF

DEPLOY

Point Solutions

Entra Agent ID, Cisco AI Defense, CyberArk, Aembit, Zenity, Palo Alto

CONNECT

Discovery & Interop

Google A2A, Anthropic MCP, OWASP ANS, IETF WIMSE

VERIFY

Trust Infrastructure

SPIFFE/SPIRE, Sigstore, Agent PKI, Zero Trust

ENFORCE