Reference Architecture · Agentic Workflows

Multi-agent systems are identity systems.

The moment an AI system delegates to another agent, or invokes a tool on behalf of a user, governance stops being about model guardrails and starts being about identity and authorization. This architecture treats every agent as a first-class identity with scoped authority, bounded delegation, and a complete execution trace.

Core Thesis

Without identity for agents, delegation for tasks, and authorization for every transition, agentic AI is an uncontrolled execution chain. With them, it is a governable system.

Five operating principles

These principles determine whether a multi-agent system is governable or merely functional.

Every agent is a first-class identity

An agent is not application logic. It has an identity, an owner, a sensitivity tier, declared capabilities, and an authorization scope. Treating agents as code makes governance impossible. Treating them as identities makes governance routine.

Delegation is scoped, time-bound, revocable

When a user delegates a task to an agent, the agent receives bounded authority for that task, not blanket access. Scope is constrained by user entitlements, agent capabilities, declared purpose, resource classification, and expiration. Delegation tokens are auditable and revocable.
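The scoping rule above can be sketched as a set intersection plus an expiry check. This is a minimal illustration, not a prescribed token format: the `DelegationToken` fields, the `issue_delegation` and `is_active` helpers, and the 15-minute default lifetime are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DelegationToken:
    """Hypothetical delegation token; field names are illustrative."""
    token_id: str
    user_id: str
    agent_id: str
    purpose: str
    scopes: frozenset
    expires_at: datetime


def issue_delegation(token_id, user_id, agent_id, purpose, requested,
                     user_entitlements, agent_capabilities,
                     ttl=timedelta(minutes=15), now=None):
    """Grant only scopes that are requested, held by the user, and declared by the agent."""
    now = now or datetime.now(timezone.utc)
    granted = (frozenset(requested)
               & frozenset(user_entitlements)
               & frozenset(agent_capabilities))
    return DelegationToken(token_id, user_id, agent_id, purpose, granted, now + ttl)


def is_active(token, revoked_ids, now=None):
    """A token is usable only if it is neither revoked nor expired."""
    now = now or datetime.now(timezone.utc)
    return token.token_id not in revoked_ids and now < token.expires_at
```

Because the granted scope is an intersection, it can never exceed what the user holds or what the agent has declared, which is the property the principle asks for.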

Agent-to-agent calls require explicit authorization

An agent calling another agent is not a function call. It is a privileged operation that requires authorization across the calling agent, target agent, delegated user authority, declared purpose, allowed capability, and transitive scope. Without this, multi-agent systems become uncontrolled execution chains.

Purpose travels with the call chain

The reason a task is being performed (incident response, customer support, reporting, analysis) is an authorization signal. The same agent may be allowed for one purpose and denied for another. Purpose is declared upstream and validated at every transition.

The full execution chain is traceable

From the initial user request through every agent, tool, retrieval, and model invocation, the system produces an immutable record. If an agentic system produces a harmful outcome, security teams must be able to reconstruct the path. Without traceability, agentic AI is operationally ungovernable.

Four authorization decisions

Every agent operation passes through four decisions. Each failure mode below corresponds to a missing decision.

Decision 1

Can THIS USER delegate to THIS AGENT?

User entitlements, declared purpose, agent capability scope, and risk signals are evaluated. A user may use an agent for one purpose but not another.

Decision 2

Can THIS AGENT use THIS CAPABILITY?

The agent's registered capability allow-list is checked against the requested tool, API, or sub-agent invocation. Capabilities are declared, not inferred.

Decision 3

Can THIS AGENT call THAT AGENT?

Agent-to-agent calls evaluate calling agent identity, target agent identity, delegated user authority, declared purpose, allowed transitive scope, and recursion depth.

Decision 4

Is the COMPLETE CHAIN authorized?

User, originating purpose, all agent transitions, all capabilities used, all data accessed, and runtime risk signals are evaluated as one decision. Hidden execution paths fail this check.

Decision Function

Allow = f(user, calling agent, target agent, capability, action, resource, purpose, transitive scope, context, risk)

Ten inputs. The decision must be reproducible, auditable, and evaluated at every transition, not just the entry point.
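One way to read the decision function is as a pure function over a frozen request object. The sketch below is an assumption-laden illustration: the `DecisionRequest` type, the layout of the `policy` dictionary, the `resource:action` grant format, and the numeric risk threshold are all hypothetical, and a real policy decision point would more likely express these rules as policy-as-code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DecisionRequest:
    """The ten inputs named in the decision function."""
    user: str
    calling_agent: str
    target_agent: Optional[str]  # None when the call targets a tool, not an agent
    capability: str
    action: str
    resource: str
    purpose: str
    transitive_scope: frozenset
    context: tuple               # carried into the audit record; not evaluated in this sketch
    risk: float


def decide(req, policy):
    """Pure function: the same request and policy always give the same result."""
    grant = f"{req.resource}:{req.action}"
    if grant not in policy["user_entitlements"].get(req.user, frozenset()):
        return False, "user lacks entitlement"
    if req.capability not in policy["agent_capabilities"].get(req.calling_agent, frozenset()):
        return False, "capability not declared for agent"
    if req.purpose not in policy["agent_purposes"].get(req.calling_agent, frozenset()):
        return False, "purpose not permitted for agent"
    if req.target_agent is not None and \
            (req.calling_agent, req.target_agent) not in policy["agent_trust"]:
        return False, "agent-to-agent path not trusted"
    if grant not in req.transitive_scope:
        return False, "outside transitive scope"
    if req.risk > policy["max_risk"]:
        return False, "risk above threshold"
    return True, "allow"
```

Returning a reason alongside the boolean keeps each decision auditable, and the absence of hidden state or clock reads is what makes it reproducible from the trace.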

Architectural components

Six components. Each can be implemented in multiple ways; the requirement is that each role is fulfilled by something identifiable.

Agent Registry

The authoritative store for agent identity, owner, sensitivity tier, declared capabilities, approved tools, permitted purposes, delegation rules, and agent-to-agent trust relationships.
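As a rough sketch of what a registry entry might hold, the example below models a subset of the attributes listed above. The `AgentRecord` and `AgentRegistry` names and their fields are hypothetical, approved tools and delegation rules are left out for brevity, and an in-memory dictionary stands in for whatever durable store a real deployment would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRecord:
    """Hypothetical registry entry covering part of the attribute list."""
    agent_id: str
    owner: str
    sensitivity_tier: int
    capabilities: frozenset = frozenset()
    permitted_purposes: frozenset = frozenset()
    trusted_targets: frozenset = frozenset()  # agents this agent may call


class AgentRegistry:
    def __init__(self):
        self._agents = {}

    def register(self, record):
        # Re-registration is rejected so an identity cannot be silently replaced.
        if record.agent_id in self._agents:
            raise ValueError(f"agent already registered: {record.agent_id}")
        self._agents[record.agent_id] = record

    def get(self, agent_id):
        # Unknown agents have no identity and therefore no authority.
        return self._agents.get(agent_id)

    def may_call(self, caller_id, target_id):
        """True only if both agents are registered and the caller trusts the target."""
        caller = self.get(caller_id)
        target = self.get(target_id)
        return (caller is not None and target is not None
                and target_id in caller.trusted_targets)
```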

Delegation Service

Issues scoped, time-bound delegation tokens at the user-to-agent boundary. Tracks active delegations, supports revocation, and feeds the policy decision point with token context.

Policy Decision Point

Evaluates the four decision points (user-to-agent, agent-to-capability, agent-to-agent, full-chain). Policy-as-code makes decisions auditable, testable, and reusable.

Orchestration Layer

Coordinates the agent workflow. Enriches each call with identity, delegation, capability, purpose, and risk context before submitting to the policy decision point. Enforces decisions before tool, data, or sub-agent access.

Capability Registry

Catalogs the tools, APIs, retrieval domains, and sub-agents an agent may invoke. Each entry includes scope constraints, rate limits, and required preconditions.

Execution Trace Store

Immutable log of every transition in the call chain: user, agent, delegation, capability, decision outcome, data accessed, model invoked, output filtered. The substrate for audit and incident response.
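One common way to approximate immutability is a hash chain, where each record commits to the one before it. The sketch below is an assumption for illustration only: the `TraceStore` class is hypothetical, it lives in memory, and it makes the log tamper-evident rather than physically unmodifiable. A production store would add write-once storage or external anchoring.

```python
import hashlib
import json


class TraceStore:
    """Append-only, hash-chained log (illustrative, in-memory)."""

    GENESIS = "0" * 64

    def __init__(self):
        self._records = []

    def append(self, event):
        """Store the event and return the hash that chains it to its predecessor."""
        prev = self._records[-1]["hash"] if self._records else self.GENESIS
        body = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode("utf-8")).hexdigest()
        self._records.append({"prev": prev, "event": body, "hash": digest})
        return digest

    def verify(self):
        """Recompute the chain; any edited or reordered record breaks it."""
        prev = self.GENESIS
        for rec in self._records:
            expected = hashlib.sha256((prev + rec["event"]).encode("utf-8")).hexdigest()
            if rec["prev"] != prev or rec["hash"] != expected:
                return False
            prev = rec["hash"]
        return True
```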

Six failure modes

The patterns that turn multi-agent systems into governance disasters. Each has a corresponding control.

Privilege escalation via delegation

A user delegates a task with broader-than-intended authority because the delegation model defaults to inheriting all user permissions. Defense: delegation tokens carry minimum-necessary scope, not user-level scope.

Hidden agent-to-agent execution chains

Agent A calls Agent B, which calls Agent C. Each individual call is permitted, but the end-to-end chain accesses resources the original user could not. Defense: evaluate transitive scope at every transition, not just at the entry point.
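A simple way to picture transitive scope is as a set that can only shrink as the chain grows. In the sketch below, the `effective_scope` and `authorize_hop` helpers and the `resource:action` scope strings are assumptions for illustration; the point is that the user's scope is the ceiling for every hop.

```python
def effective_scope(user_scope, agent_scopes):
    """Intersect the user's scope with every agent's scope along the chain."""
    scope = frozenset(user_scope)
    for agent_scope in agent_scopes:
        scope = scope & frozenset(agent_scope)
    return scope


def authorize_hop(user_scope, agent_scopes, requested):
    """Allow a hop only if the request survives the whole chain's intersection."""
    return requested in effective_scope(user_scope, agent_scopes)
```

With this rule, Agent C holding a permission is not enough: if the originating user never had it, the intersection removes it before the request reaches C.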

Purpose-laundering

An agent invoked for one purpose calls another agent that interprets the request under a different purpose, granting access that the original purpose would have denied. Defense: purpose is set at the user delegation point and travels with the call chain; downstream agents cannot redeclare it.
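The text does not say how purpose is protected in transit. One possible mechanism, shown here purely as an assumption, is for the delegation point to bind the purpose to the request with an HMAC that every transition verifies. The `bind_purpose` and `purpose_is_intact` helpers, the key handling, and the message layout are all hypothetical.

```python
import hashlib
import hmac


def bind_purpose(key, correlation_id, purpose):
    """Computed once at the user delegation point (key held by that service)."""
    message = f"{correlation_id}|{purpose}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def purpose_is_intact(key, correlation_id, purpose, tag):
    """Checked at every transition; a redeclared purpose no longer matches the tag."""
    expected = bind_purpose(key, correlation_id, purpose)
    return hmac.compare_digest(expected, tag)
```

A downstream agent that swaps "customer_support" for "incident_response" cannot produce a matching tag without the key, so the policy decision point can reject the call.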

Uncontrolled recursion

Agent A calls Agent B, which calls Agent A back, looping until rate limits trip or the budget is exhausted. Defense: maximum delegation depth, recursion detection, and cycle-breaking at the policy decision point.
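Depth limits and cycle detection can both be checked from the list of agents already on the call path. The sketch below assumes a hypothetical `check_call` helper, a default depth of 4, and a strict rule that no agent may appear twice in one chain; a real system might allow bounded revisits instead.

```python
def check_call(chain, target, max_depth=4):
    """chain: agent IDs already on the call path, in order. Returns (allowed, reason)."""
    if target in chain:
        return False, "cycle detected"
    if len(chain) + 1 > max_depth:
        return False, "max delegation depth exceeded"
    return True, "ok"
```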

Tool capability sprawl

Agents accumulate tool capabilities over time, and old capabilities are never reviewed or revoked. Defense: capabilities are time-bound, reviewed quarterly, and revoked when not actively used.
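A periodic review job could flag grants that have expired or gone unused. The sketch below is illustrative only: the `CapabilityGrant` fields, the `grants_to_revoke` helper, and the 90-day idle window (chosen to approximate the quarterly review mentioned above) are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class CapabilityGrant:
    """Hypothetical time-bound capability grant."""
    agent_id: str
    capability: str
    granted_at: datetime
    expires_at: datetime
    last_used_at: Optional[datetime] = None


def grants_to_revoke(grants, now, idle_limit=timedelta(days=90)):
    """Return grants that are expired or have been idle longer than idle_limit."""
    stale = []
    for grant in grants:
        # A never-used grant is measured from the moment it was issued.
        idle_since = grant.last_used_at or grant.granted_at
        if now >= grant.expires_at or now - idle_since > idle_limit:
            stale.append(grant)
    return stale
```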

Audit gaps at transitions

Logging covers individual agent calls but not the full chain. When an incident occurs, the path cannot be reconstructed. Defense: a single correlation ID per user request, propagated through every transition and persisted with policy decisions, data accesses, and outputs.
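Within a single process, Python's `contextvars` module is one way to carry a correlation ID through nested calls without threading it through every signature. The `start_request` and `trace_event` helpers below are hypothetical, and the sketch covers in-process propagation only; across service boundaries the ID would have to travel in the call itself, for example as a header or token claim.

```python
import contextvars
import uuid

# Holds the correlation ID for the request currently being processed.
correlation_id = contextvars.ContextVar("correlation_id")


def start_request():
    """Called once per user request, before any agent runs."""
    cid = str(uuid.uuid4())
    correlation_id.set(cid)
    return cid


def trace_event(kind, **fields):
    """Build a trace record that always carries the current correlation ID."""
    return {"correlation_id": correlation_id.get(), "kind": kind, **fields}
```

Every policy decision, data access, and output record built through `trace_event` then shares one ID, which is what lets the chain be reassembled after an incident.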

Mapping to governance

Agentic workflows inherit the full governance framework, with specific implications for several sections.

Risk Tiering (Section 2)

Multi-agent systems push the Automation Amplification dimension up by default. Tier 3 is the typical floor; fully automated agent chains affecting regulated outcomes are Tier 4.

Threat Modeling (Section 3)

Agent-to-agent control, unauthorized retraining, and prompt injection threats apply across the chain. The threat model must cover transitive scope, not just direct agent calls.

Monitoring (Section 5)

Watch for delegation token misuse, unexpected agent-to-agent calls, recursion depth anomalies, and purpose drift across the chain. The execution trace store is the substrate for all of these.

Incident Response (Section 6)

Containment requires the ability to revoke delegations and disable agent-to-agent paths quickly. Post-incident review examines whether the failure was at a missing decision point or at an undocumented capability.

Architecture