<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Imran Siddique on Medium]]></title>
        <description><![CDATA[Stories by Imran Siddique on Medium]]></description>
        <link>https://medium.com/@isiddique?source=rss-fc2c693dc406------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*yY5Pb7NVSJ7GX_4bCpg7Mg.jpeg</url>
            <title>Stories by Imran Siddique on Medium</title>
            <link>https://medium.com/@isiddique?source=rss-fc2c693dc406------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Wed, 15 Apr 2026 04:09:45 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@isiddique/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Running AI Agent Governance on AWS, No Azure Required]]></title>
            <link>https://medium.com/@isiddique/running-ai-agent-governance-on-aws-no-azure-required-59c92a3726f5?source=rss-fc2c693dc406------2</link>
            <guid isPermaLink="false">https://medium.com/p/59c92a3726f5</guid>
            <category><![CDATA[agt]]></category>
            <category><![CDATA[agent-governance-toolkit]]></category>
            <category><![CDATA[gcp]]></category>
            <category><![CDATA[azure]]></category>
            <category><![CDATA[aws]]></category>
            <dc:creator><![CDATA[Imran Siddique]]></dc:creator>
            <pubDate>Tue, 14 Apr 2026 04:41:20 GMT</pubDate>
            <atom:updated>2026-04-14T04:42:14.644Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*mkHB-CTEqNBsIAoVWeNnxA.png" /></figure><p><em>How to deploy Microsoft’s Agent Governance Toolkit on ECS/Fargate and govern your Bedrock agents</em></p><p>I’m going to say something that might surprise you: <strong>Microsoft’s best open-source security toolkit runs perfectly on AWS.</strong> No Azure subscription. No vendor lock-in. Just pip install and go.</p><p>The <a href="https://github.com/microsoft/agent-governance-toolkit">Agent Governance Toolkit</a> (AGT) is an MIT-licensed runtime governance layer for AI agents. It intercepts every tool call, API request, and inter-agent message <em>before</em> execution — enforcing policies at sub-millisecond latency. It covers all 10 OWASP Agentic AI risks, and it works with <strong>LangChain</strong>, <strong>CrewAI</strong>, <strong>AutoGen</strong>, <strong>Bedrock </strong>agents, and anything else you’re building on AWS.</p><p>Here’s how to get it running on your AWS infrastructure in under 30 minutes.</p><h3>Why You Need This</h3><p>If you’re running AI agents on Bedrock, Lambda, or ECS, you probably already know the problem:</p><ul><li><strong>Unpredictable Tool Calls:</strong> Your agents can invoke tools, or pass them parameters, that you didn’t anticipate.</li><li><strong>Invisible Audit Trails:</strong> There’s no deterministic record of what actions agents took and why.</li><li><strong>No Least-Privilege Proof:</strong> You can’t easily prove to your CISO that agents follow security best practices.</li><li><strong>Compliance Deadlines:</strong> The <strong>EU AI Act (August 2026)</strong> requires demonstrable human oversight and risk management for high-risk AI systems.</li></ul><p>AGT solves this by placing a deterministic safety kernel between agent “thought” and system “action.” Everything gets logged, dangerous actions get blocked, and your compliance team gets the evidence they 
need.</p><h3>Architecture</h3><pre>┌─────────────────────────────────────────────┐<br>│               Your AWS Account              │<br>│                                             │<br>│  ┌──────────┐    ┌──────────────────────┐   │<br>│  │ Bedrock  │───▶│  AGT Policy Engine   │   │<br>│  │ Agent    │    │  (ECS/Fargate)       │   │<br>│  └──────────┘    │                      │   │<br>│                  │  ✓ Policy check      │   │<br>│  ┌──────────┐    │  ✓ Identity verify   │   │<br>│  │ Lambda   │───▶│  ✓ Audit log         │   │<br>│  │ Agent    │    │  ✓ Rate limit        │   │<br>│  └──────────┘    └──────────┬───────────┘   │<br>│                             │               │<br>│                    ┌────────▼────────┐      │<br>│                    │  CloudWatch /   │      │<br>│                    │  S3 Audit Logs  │      │<br>│                    └─────────────────┘      │<br>└─────────────────────────────────────────────┘</pre><p><strong>Zero Azure dependencies. Pure Python containers.</strong></p><h3>Step 1: The 3-Line Quick Start</h3><p>Before we containerize, let’s prove it works locally:</p><pre>pip install agent-os-kernel</pre><pre>from agent_os.lite import govern<br><br># One line: define what&#39;s allowed and what&#39;s blocked<br>check = govern(<br>    allow=[&quot;web_search&quot;, &quot;read_file&quot;, &quot;query_database&quot;],<br>    deny=[&quot;execute_code&quot;, &quot;delete_file&quot;, &quot;ssh_connect&quot;],<br>)<br><br># One line: check any agent action<br>check(&quot;web_search&quot;)      # ✅ Allowed<br>check(&quot;execute_code&quot;)    # 💥 GovernanceViolation raised<br>check.is_allowed(&quot;delete_file&quot;)  # False (non-raising)</pre><p>That’s it. Three lines. Sub-millisecond. No complex YAML, no config files, no trust mesh. 
Just a fast allow/deny gate.</p><h3>Step 2: Create the Dockerfile</h3><pre>FROM python:3.12-slim<br><br>WORKDIR /app<br><br># Install AGT<br>RUN pip install --no-cache-dir agent-os-kernel[full]<br><br># Copy your policies and agent code<br>COPY policies/ ./policies/<br>COPY app.py .<br><br>CMD [&quot;python&quot;, &quot;app.py&quot;]</pre><h3>Step 3: Write Your Governed Agent</h3><p>Here’s a real agent wrapper that works with any Bedrock model:</p><pre># app.py — Governed agent on AWS<br>import json<br>import boto3<br>from agent_os.lite import govern<br><br># --- Governance setup ---<br>check = govern(<br>    allow=[&quot;invoke_model&quot;, &quot;read_s3&quot;, &quot;query_dynamodb&quot;, &quot;send_sns&quot;],<br>    deny=[&quot;delete_s3&quot;, &quot;modify_iam&quot;, &quot;execute_code&quot;, &quot;create_ec2&quot;],<br>    blocked_content=[<br>        r&#39;\b\d{3}-\d{2}-\d{4}\b&#39;,  # SSN<br>        r&#39;\b(?:\d[ -]*?){13,16}\b&#39;, # Credit cards<br>    ],<br>    max_calls=100,<br>    log=True,<br>)<br><br>bedrock = boto3.client(&quot;bedrock-runtime&quot;, region_name=&quot;us-east-1&quot;)<br><br>def governed_invoke(action: str, payload: dict) -&gt; dict:<br>    &quot;&quot;&quot;Every action goes through governance first.&quot;&quot;&quot;<br>    # Check the action<br>    check(action)<br><br>    # Check the content for PII<br>    content = json.dumps(payload)<br>    if not check.is_allowed(action, content=content):<br>        return {&quot;error&quot;: &quot;Blocked: content contains sensitive data&quot;}<br><br>    # Execute the actual action<br>    if action == &quot;invoke_model&quot;:<br>        response = bedrock.invoke_model(<br>            modelId=payload[&quot;model&quot;],<br>            body=json.dumps(payload[&quot;body&quot;]),<br>        )<br>        return json.loads(response[&quot;body&quot;].read())<br><br>    return {&quot;error&quot;: f&quot;Unknown action: {action}&quot;}<br><br># --- Your agent loop ---<br>if __name__ == 
&quot;__main__&quot;:<br>    # This will work<br>    result = governed_invoke(&quot;invoke_model&quot;, {<br>        &quot;model&quot;: &quot;anthropic.claude-sonnet-4-20250514&quot;,<br>        &quot;body&quot;: {&quot;prompt&quot;: &quot;Summarize Q4 earnings&quot;}<br>    })<br>    print(f&quot;✅ Model response received&quot;)<br><br>    # This will be blocked<br>    try:<br>        governed_invoke(&quot;delete_s3&quot;, {&quot;bucket&quot;: &quot;production-data&quot;})<br>    except Exception as e:<br>        print(f&quot;🚫 Blocked: {e}&quot;)<br><br>    # Print governance stats<br>    print(f&quot;\n📊 {json.dumps(check.stats, indent=2)}&quot;)</pre><h3>Step 4: Deploy to ECS/Fargate</h3><h4>Create the ECR repository and push:</h4><pre>aws ecr create-repository --repository-name agt-governed-agent<br>aws ecr get-login-password | docker login --username AWS --password-stdin $ACCOUNT.dkr.ecr.us-east-1.amazonaws.com<br><br>docker build -t agt-governed-agent .<br>docker tag agt-governed-agent:latest $ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/agt-governed-agent:latest<br>docker push $ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/agt-governed-agent:latest</pre><h4>ECS Task Definition:</h4><pre>{<br>  &quot;family&quot;: &quot;agt-governed-agent&quot;,<br>  &quot;networkMode&quot;: &quot;awsvpc&quot;,<br>  &quot;requiresCompatibilities&quot;: [&quot;FARGATE&quot;],<br>  &quot;cpu&quot;: &quot;512&quot;,<br>  &quot;memory&quot;: &quot;1024&quot;,<br>  &quot;containerDefinitions&quot;: [<br>    {<br>      &quot;name&quot;: &quot;governed-agent&quot;,<br>      &quot;image&quot;: &quot;${ACCOUNT}.dkr.ecr.us-east-1.amazonaws.com/agt-governed-agent:latest&quot;,<br>      &quot;essential&quot;: true,<br>      &quot;environment&quot;: [<br>        {&quot;name&quot;: &quot;AWS_DEFAULT_REGION&quot;, &quot;value&quot;: &quot;us-east-1&quot;}<br>      ],<br>      &quot;logConfiguration&quot;: {<br>        &quot;logDriver&quot;: &quot;awslogs&quot;,<br>        &quot;options&quot;: {<br>   
       &quot;awslogs-group&quot;: &quot;/ecs/agt-governed-agent&quot;,<br>          &quot;awslogs-region&quot;: &quot;us-east-1&quot;,<br>          &quot;awslogs-stream-prefix&quot;: &quot;agt&quot;<br>        }<br>      }<br>    }<br>  ],<br>  &quot;executionRoleArn&quot;: &quot;arn:aws:iam::${ACCOUNT}:role/ecsTaskExecutionRole&quot;<br>}</pre><h4>Create the service:</h4><pre>aws ecs create-service \<br>  --cluster default \<br>  --service-name agt-governed-agent \<br>  --task-definition agt-governed-agent \<br>  --desired-count 1 \<br>  --launch-type FARGATE \<br>  --network-configuration &quot;awsvpcConfiguration={subnets=[subnet-xxx],securityGroups=[sg-xxx],assignPublicIp=ENABLED}&quot;</pre><p>Your governed agent is now running on AWS. Every action is policy-checked and audit-logged.</p><h3>Step 5: Production Policy (Optional Upgrade)</h3><p>When you outgrow the 3-line govern() call, AGT has production-ready policy files:</p><pre># Copy the enterprise policy template<br>cp examples/policies/production/enterprise.yaml policies/<br><br># Or for financial services:<br>cp examples/policies/production/financial.yaml policies/</pre><p>These include action rules, content filters (PII/PCI), escalation triggers, and retention settings — all in YAML. No OPA or Rego required unless you want it.</p><h3>Step 6: Ship Audit Logs to S3</h3><p>AGT’s audit trail integrates with CloudWatch. 
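To actually land an audit snapshot in S3 as the step's title suggests, a minimal sketch might look like the following; the bucket name, key layout, and helper functions are illustrative assumptions, not AGT's documented API, and `check.stats` is the same stats dict used elsewhere in this article.

```python
# Sketch: archive an AGT stats snapshot to S3 for compliance retention.
# Bucket name and key layout are illustrative assumptions, not AGT API.
import json
from datetime import datetime, timezone

def audit_key(agent_id: str, now: datetime) -> str:
    """Build a date-partitioned S3 key for one governance snapshot."""
    return f"audit/{agent_id}/{now:%Y/%m/%d}/governance-{now:%H%M%S}.json"

def archive_stats(s3_client, bucket: str, agent_id: str, stats: dict) -> str:
    """Upload one JSON snapshot; returns the object key it wrote."""
    key = audit_key(agent_id, datetime.now(timezone.utc))
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(stats).encode("utf-8"),
    )
    return key

# Usage (assumes boto3 credentials and an existing bucket):
#   import boto3
#   archive_stats(boto3.client("s3"), "my-audit-bucket", "analyst-001", check.stats)
```

Date-partitioned keys keep S3 listings cheap and map cleanly onto lifecycle retention rules.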
For compliance archival:</p><pre># Send governance stats to CloudWatch<br>import boto3<br><br>cloudwatch = boto3.client(&quot;cloudwatch&quot;)<br>stats = check.stats<br><br>cloudwatch.put_metric_data(<br>    Namespace=&quot;AGT/Governance&quot;,<br>    MetricData=[<br>        {&quot;MetricName&quot;: &quot;TotalDecisions&quot;, &quot;Value&quot;: stats[&quot;total&quot;], &quot;Unit&quot;: &quot;Count&quot;},<br>        {&quot;MetricName&quot;: &quot;Denied&quot;, &quot;Value&quot;: stats[&quot;denied&quot;], &quot;Unit&quot;: &quot;Count&quot;},<br>    ]<br>)</pre><h3>What You Get</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*61dW9pT_lLYxvOHFfpUaXg.png" /></figure><h3>FAQ</h3><p><strong>Q: Does this need Azure?</strong></p><p>No. Zero Azure dependencies. It’s a Python package that runs anywhere you can run Python.</p><p><strong>Q: Does it slow down my agents?</strong></p><p>No. Policy checks take 0.003ms on average. Your Bedrock API call takes 500–2000ms. The governance overhead is invisible.</p><p><strong>Q: Can I use this with LangChain on AWS?</strong></p><p>Yes. AGT works with LangChain, CrewAI, AutoGen, and any other Python agent framework.</p><p><strong>Q: What about the full AGT stack (trust mesh, SRE, etc.)?</strong></p><p>Start with agent_os.lite for basic governance. Add the full stack when you need cryptographic identity, lifecycle management, or execution sandboxing.</p><h3>Links</h3><ul><li><a href="https://github.com/microsoft/agent-governance-toolkit">Agent Governance Toolkit on GitHub</a></li><li><a href="https://github.com/microsoft/agent-governance-toolkit/blob/main/docs/deployment/aws-ecs.md">AWS Deployment Guide</a></li><li><a href="https://github.com/microsoft/agent-governance-toolkit/tree/main/examples/policies/production">Production Policy Library</a></li></ul><p><em>The Agent Governance Toolkit is MIT-licensed. 
Star it on GitHub if it’s useful to your team.</em>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Observability for Non-Deterministic Systems: A Framework for AI Agent Reliability]]></title>
            <link>https://medium.com/@isiddique/observability-for-non-deterministic-systems-a-framework-for-ai-agent-reliability-5a694fb1de70?source=rss-fc2c693dc406------2</link>
            <guid isPermaLink="false">https://medium.com/p/5a694fb1de70</guid>
            <category><![CDATA[system-reliability]]></category>
            <category><![CDATA[llmops]]></category>
            <category><![CDATA[agentic-ai]]></category>
            <category><![CDATA[observability]]></category>
            <category><![CDATA[ai-architecture]]></category>
            <dc:creator><![CDATA[Imran Siddique]]></dc:creator>
            <pubDate>Mon, 06 Apr 2026 15:54:22 GMT</pubDate>
            <atom:updated>2026-04-06T15:54:22.685Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Bd086fRq4wFlQjxtk1U9CQ.png" /></figure><h3>The Observability Gap</h3><p>For fifty years, the observability stack has assumed determinism. Prometheus scrapes CPU utilization. Jaeger traces request latency. PagerDuty fires when error rates exceed thresholds. The mental model is mechanical: if the database is slow, queries are slow; if the server crashes, requests fail. The “Three Pillars” of metrics, logs, and traces capture the behavior of infrastructure.</p><p>This model works because deterministic systems have a knowable correct state. A 200 OK is correct. A 500 is not. The boundaries are crisp, and deviations are bugs.</p><h4>Why AI Agents Break This Model</h4><p>AI agents introduce properties that deterministic observability cannot capture:</p><ul><li><strong>Non-determinism:</strong> The same prompt produces different outputs on successive calls. Traditional monitoring treats variance as noise; in agent systems, variance <em>is</em> the signal.</li><li><strong>Semantic correctness:</strong> A 200 OK with a hallucinated answer is worse than a 500 error. HTTP status codes carry zero information about output quality. An agent that confidently produces wrong code or wrong medical advice is more dangerous than one that crashes.</li><li><strong>Progressive degradation:</strong> As context windows fill, LLM output quality degrades gradually: responses get shorter, less accurate, and more repetitive. This is <strong>Context Rot</strong>. There is no error. There is no crash. There is only a slow rot that traditional monitoring cannot see.</li></ul><h3>The “Laziness” Problem</h3><p>During my work observing coding agents at scale, I discovered a failure mode that no existing observability tool detected: <strong>agent laziness</strong>. 
The agent would produce syntactically valid but substantively empty responses: placeholder functions, TODO comments instead of implementations, or replies that technically answered the question while doing as little work as possible.</p><p>This is not a hallucination. It is not an error. It is a quality degradation that only becomes visible when you measure the gap between what was asked and what was delivered. This discovery led to the development of a <strong>Laziness Index</strong> that measures response length shrinkage, placeholder patterns, and delegation frequency. Over time it became clear that the metrics we were building for coding agents were capturing fundamental properties of human-agent collaboration.</p><h3>The Framework: Behavioral Observability</h3><p>Observability for non-deterministic systems requires a shift from <strong>infrastructure metrics</strong> to <strong>behavioral metrics</strong>. We must measure not what the system <em>is doing</em> (CPU, memory, latency) but what the system <em>is achieving</em> (correct outcomes, user satisfaction, progressive quality).</p><h4>Core Principle: The Human as Sensor</h4><p>In human-agent collaboration, the human’s behavior is the most reliable signal of agent quality. 
When a developer says “that’s wrong, fix it,” they are providing a ground-truth quality signal that no automated evaluation can match.</p><p>This is <strong>Correction-Based Observability</strong>: the systematic detection and scoring of human corrections to agent outputs as a proxy for output quality.</p><h4>Seven Behavioral Metrics</h4><ol><li><strong>Hallucination Index:</strong> Rate of human corrections to agent outputs.</li><li><strong>Laziness Index:</strong> Response quality degradation and effort avoidance.</li><li><strong>Context Rot Index:</strong> Quality degradation over session length.</li><li><strong>Flow Score:</strong> Consecutive productive interactions without correction.</li><li><strong>Loop Rate:</strong> Consecutive correction cycles without progress.</li><li><strong>Session Health:</strong> Three-tier classification (Clean, Bumpy, Troubled).</li><li><strong>Cost Per Outcome:</strong> Token spend divided by tangible deliverables.</li></ol><h3>Application Across High-Stakes Domains</h3><p>The correction-based observability pattern is universally applicable to any human-agent collaboration where the human can signal dissatisfaction.</p><h4>Healthcare: Clinical Decision Support</h4><p>A hallucinating coding agent produces a bug. A hallucinating clinical agent produces a misdiagnosis. In this domain, the framework uses tighter thresholds. A 15% hallucination rate in coding is a productivity issue; in healthcare, the threshold for a “Troubled” session is often a single override.</p><h4>Energy and Grid Management</h4><p>In energy systems, the consequences of errors manifest in physical system behavior. 
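As a concrete illustration, such a pre-check against equipment limits could be sketched as below; the thresholds and field names are hypothetical, invented for this example rather than taken from the framework.

```python
# Sketch: flag agent recommendations that break physical grid limits.
# Thresholds and field names are hypothetical, for illustration only.
THERMAL_LIMIT_C = 90.0          # conductor temperature ceiling
VOLTAGE_BOUNDS = (0.95, 1.05)   # per-unit voltage envelope

def violates_physics(recommendation: dict) -> list[str]:
    """Return the physical constraints a recommendation breaks."""
    violations = []
    if recommendation.get("expected_temp_c", 0.0) > THERMAL_LIMIT_C:
        violations.append("thermal_limit")
    v = recommendation.get("expected_voltage_pu")
    if v is not None and not (VOLTAGE_BOUNDS[0] <= v <= VOLTAGE_BOUNDS[1]):
        violations.append("voltage_bounds")
    return violations

def violation_rate(recommendations: list[dict]) -> float:
    """Fraction of a batch that breaks at least one physical constraint."""
    if not recommendations:
        return 0.0
    bad = sum(1 for r in recommendations if violates_physics(r))
    return bad / len(recommendations)
```

Unlike a stylistic quality score, a check like this is binary and objective: a recommendation either respects the physics or it does not.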
The observability layer tracks <strong>Physical Constraint Violation Rates</strong> — recommendations that violate thermal limits or voltage bounds — which are physically impossible “hallucinations.”</p><h4>Financial Services and Legal</h4><ul><li><strong>Finance:</strong> Measuring the <strong>Fair Lending Deviation Index</strong> to track whether underwriter overrides vary by borrower demographics.</li><li><strong>Legal:</strong> Monitoring the <strong>Citation Hallucination Index</strong> to detect non-existent case law before it reaches a court filing.</li></ul><h3>Architecture: Privacy-First and Local-First</h3><p>Behavioral observability data is sensitive. A correction log reveals what an expert (a doctor, an attorney, an engineer) had to fix.</p><p>The proposed architecture for this framework is <strong>local-first</strong>: all computation happens on the practitioner’s machine. No raw sessions or corrections leave the device. Team-level aggregation uses anonymized identities and transmits only aggregate metrics. This removes the primary barrier to AI observability: the fear that the tool will expose individual performance rather than system reliability.</p><h3>Conclusion</h3><p>Non-deterministic observability is not a product; it is a discipline. As we scale agentic architectures, we must stop measuring what agents consume (tokens, latency) and start measuring what they achieve.</p><p>While the foundational primitives for agent tracking exist in the open-source <a href="https://github.com/microsoft/agent-governance-toolkit"><strong>Agent Governance Toolkit (AGT)</strong></a>, this behavioral framework represents a necessary evolution in how we ensure AI reliability at scale.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Securing AI agents with agent governance]]></title>
            <link>https://medium.com/data-science-at-microsoft/securing-ai-agents-with-agent-governance-767aacd2a927?source=rss-fc2c693dc406------2</link>
            <guid isPermaLink="false">https://medium.com/p/767aacd2a927</guid>
            <category><![CDATA[open-source]]></category>
            <category><![CDATA[artificial-intelligence]]></category>
            <category><![CDATA[agent-mesh]]></category>
            <category><![CDATA[ai-agent-governance]]></category>
            <category><![CDATA[data-science]]></category>
            <dc:creator><![CDATA[Imran Siddique]]></dc:creator>
            <pubDate>Thu, 26 Mar 2026 07:16:00 GMT</pubDate>
            <atom:updated>2026-03-26T07:16:00.500Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/900/1*QduOOlJd1CBOo9WekNMRQg.png" /><figcaption>Photo by <a href="https://unsplash.com/@tekton_tools?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Tekton</a> on <a href="https://unsplash.com/photos/stainless-steel-tool-on-gray-sand-SVpCSOCcCwA?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a>.</figcaption></figure><p>Imagine this scenario: An AI agent is asked to “clean up old records,” and it interprets “old” as “everything older than today.” There is no policy engine to intercept the action, no approval workflow to pause and ask a human, and no kill switch to stop it mid-execution. The agent has been given unrestricted tool access — the equivalent of handing a new employee the root password on their first day and saying, “figure it out.”</p><p>This hypothetical illustrates a real and growing concern. As AI agents have evolved from simple chatbots into autonomous systems that book flights, execute trades, write code, and manage infrastructure, a gap has emerged: <strong>Many of the popular frameworks that power these agents focus on orchestration and have not yet built in runtime security governance.</strong> Frameworks like LangChain, AutoGen, and CrewAI do an excellent job of orchestrating agent behavior, but the industry as a whole is still developing answers to a fundamental question: <em>What happens when an agent does something it shouldn’t?</em></p><blockquote>Note: The Agent Governance Toolkit is currently available as a community preview release. Packages published to PyPI and npm are not yet official, Microsoft-signed releases. Official signed packages via ESRP Release will be available in a future release. 
All security policy rules and detection patterns ship as configurable sample configurations that users must review and customize before production use.</blockquote><p>That question sent me down a path that eventually became the <a href="https://github.com/microsoft/agent-governance-toolkit">Agent Governance Toolkit</a> — an open-source framework, now released by Microsoft, that brings operating system-level security concepts to the world of AI agents. In this article, I walk through the problem we set out to solve, the architectural decisions that shaped our approach, and the technical details of how we built a system that enforces policy, verifies identity, isolates execution, and engineers reliability for autonomous AI agents.</p><h3>The problem: AI agents operate in a security vacuum</h3><p>To understand why agent governance matters, consider how a typical AI agent works today. A developer writes a prompt, connects a set of tools (database access, web browsing, file system operations), and hands control to an LLM. The agent reasons about what to do, selects tools, and executes actions — often in a loop, sometimes spawning sub-agents to handle subtasks.</p><p>The challenge is that <strong>in many current implementations, agent actions are unmediated.</strong> When an agent calls a tool, there is typically no security layer checking whether that call is within policy. There is often no identity verification when one agent communicates with another. There may be no resource limit preventing an agent from making 10,000 API calls in a minute. And there is frequently no circuit breaker to stop a failing agent from cascading failures across a system.</p><p>In February 2026, OWASP published the <strong>Agentic AI Top 10</strong> (see <a href="https://owasp.org/www-project-agentic-ai-threats">owasp.org/www-project-agentic-ai-threats</a>), the first formal taxonomy of risks specific to autonomous AI agents. 
The list highlights serious concerns for anyone running agents in production: goal hijacking through prompt injection, tool misuse, identity abuse, memory poisoning, cascading failures, and rogue agents. My team realized that addressing these risks required more than a guardrail library. It required a fundamentally new abstraction layer.</p><h3>The insight: What if we treated AI agents like processes?</h3><p>The key insight came from an analogy that now seems obvious in hindsight, because <strong>operating systems solved a similar problem decades ago.</strong></p><p>In the 1970s, when multi-user computing emerged, engineers faced a similar challenge: multiple untrusted programs sharing resources on a single machine. The solution they developed was the OS kernel — a privileged layer that mediates every interaction between a process and the outside world. Processes can’t directly access hardware; they make syscalls. They can’t read each other’s memory; they have isolated address spaces. They can’t consume unlimited resources; the scheduler enforces quotas.</p><p>So, we asked ourselves: What would an “operating system for AI agents” look like?</p><p>The answer became the four-layer architecture of the Agent Governance Toolkit:</p><ul><li><strong>Agent OS:</strong> The kernel. Every agent action passes through a policy engine before execution, just as every process action passes through the OS kernel via syscalls.</li><li><strong>AgentMesh:</strong> The identity layer. Agents have cryptographic identities (DIDs with Ed25519 key pairs) and must verify each other before communicating, similar to how mTLS works in service meshes.</li><li><strong>Agent Runtime:</strong> The isolation layer. Agents are assigned to execution rings based on their trust scores, with resource limits enforced per ring — inspired by CPU privilege rings.</li><li><strong>Agent SRE:</strong> The reliability layer. 
SLOs, error budgets, circuit breakers, and chaos testing — all the practices that keep distributed services reliable, applied to agent systems.</li></ul><h3>Under the hood: How policy enforcement actually works</h3><p>Let me show you what runtime policy enforcement looks like in practice, because it’s the piece that distinguishes this toolkit from existing approaches.</p><p>Most “guardrail” systems work by filtering inputs or outputs — they check the prompt before the LLM sees it, or they scan the response after the LLM generates it. The problem is that agent actions happen <em>between</em> those two points. An agent might receive a perfectly safe prompt, reason correctly about it, and then call a tool in a way that violates policy. Input/output filtering misses this entirely.</p><p>Agent OS intercepts at the action layer. When an agent calls a tool, the call passes through the policy engine before reaching the tool:</p><pre>from agent_os import StatelessKernel, ExecutionContext, Policy<br><br>kernel = StatelessKernel()<br><br># Define what this agent is allowed to do<br>ctx = ExecutionContext(<br>    agent_id=&quot;analyst-001&quot;,<br>    policies=[<br>        Policy.read_only(),                    # Default: no writes<br>        Policy.rate_limit(100, &quot;1m&quot;),          # Max 100 calls/minute<br>        Policy.require_approval(<br>            actions=[&quot;delete_*&quot;, &quot;write_production_*&quot;],<br>            min_approvals=2,<br>            approval_timeout_minutes=30,<br>        ),<br>    ],<br>)<br><br># This call gets intercepted by the policy engine<br>result = await kernel.execute(<br>    action=&quot;delete_user_record&quot;,<br>    params={&quot;user_id&quot;: 12345},<br>    context=ctx,<br>)<br># result.signal == &quot;ESCALATE&quot; → approval workflow initiated</pre><p>The key design decision here was to make the kernel <strong>stateless</strong>. 
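In schematic form, a stateless check is simply a pure function of the action and whatever context the request carries; the sketch below is illustrative, not the Agent OS API.

```python
# Sketch: a stateless policy decision, i.e. a pure function of
# (action, context). Illustrative only, not the Agent OS API.
# With no kernel-side session state, any replica can serve any request.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class RequestContext:
    """Everything the decision needs travels with the request."""
    agent_id: str
    allowed: frozenset = field(default_factory=frozenset)
    denied: frozenset = field(default_factory=frozenset)

def decide(action: str, ctx: RequestContext) -> str:
    """Return ALLOW, DENY, or ESCALATE using only the carried context."""
    if action in ctx.denied:
        return "DENY"
    if action in ctx.allowed:
        return "ALLOW"
    return "ESCALATE"  # unknown actions fall through to human review
```

Because `decide` reads nothing but its arguments, the same binary can run behind a load balancer, in a serverless function, or as a sidecar with no coordination.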
Each request carries its own context — policies, history, identity — rather than storing state in the kernel. We chose this pattern because it enables horizontal scaling: You can run the kernel behind a load balancer, in a serverless function, or as a sidecar container, with no shared state to manage.</p><p>The policy engine itself has two layers. The first is configurable pattern matching with sample rule sets for detecting dangerous strings like “ignore previous instructions” or SQL injection patterns. The second is a semantic intent classifier that detects dangerous <em>goals</em> even when the exact phrasing does not match a pattern. When an agent’s action is classified as DESTRUCTIVE_DATA, DATA_EXFILTRATION, or PRIVILEGE_ESCALATION, the policy engine flags it for intervention regardless of how the request was worded.</p><h3>Zero-trust identity: TLS for AI agents</h3><p>When we started looking at multi-agent systems — scenarios where multiple agents collaborate on a task — the identity challenge became clear. In many frameworks, agents communicate as simple function calls. Agent A calls Agent B, and Agent B processes whatever it receives because identity verification has not yet been a standard feature of agent communication protocols.</p><p>AgentMesh introduces a protocol we call IATP — the Inter-Agent Trust Protocol. 
Think of it as TLS for AI agents: encryption, authentication, and authorization in one handshake.</p><p>Every agent gets a cryptographic DID (Decentralized Identifier) backed by an Ed25519 key pair:</p><pre>from agentmesh import AgentIdentity, TrustBridge<br><br># Create identity with a human sponsor for accountability<br>identity = AgentIdentity.create(<br>    name=&quot;data-analyst&quot;,<br>    sponsor=&quot;alice@company.com&quot;,<br>    capabilities=[&quot;read:data&quot;, &quot;write:reports&quot;],<br>)<br># identity.did → &quot;did:mesh:data-analyst:a7f3b2...&quot;<br><br># Before communicating, verify the peer<br>bridge = TrustBridge()<br>verification = await bridge.verify_peer(<br>    peer_id=&quot;did:mesh:other-agent&quot;,<br>    required_trust_score=700,  # Must score ≥700/1000<br>)<br><br>if verification.verified:<br>    await bridge.send_message(peer_id, encrypted_message)</pre><p>One design choice that proved critical was <strong>trust decay</strong>. An agent’s trust score isn’t static — it decays over time without positive signals. An agent that was trusted yesterday but has been silent for a week gradually becomes untrusted. This models reality: In the physical world, trust requires ongoing demonstration of good behavior, and our system reflects that.</p><p>Delegation chains solve another real-world problem: When an orchestrator agent delegates a task to a worker agent, the worker should have only the permissions needed for that specific task. AgentMesh enforces scope narrowing — a parent with read and write capabilities can delegate only read access to a child, and that child cannot re-delegate broader permissions than it received.</p><h3>Execution rings: Hardware security concepts for software agents</h3><p>The Agent Runtime borrows from CPU architecture. Intel processors have privilege rings (Ring 0 for the kernel, Ring 3 for user processes) that prevent unprivileged code from accessing protected resources. 
We applied the same concept to agents, but with a twist: <strong>Ring assignment is dynamic, based on behavioral trust scores.</strong></p><ul><li><strong>Ring 0 (Privileged):</strong> Trust score ≥ 0.95. Can modify system policies. Reserved for human-verified orchestrators.</li><li><strong>Ring 1 (Trusted):</strong> Trust score ≥ 0.80. Standard operations with full tool access.</li><li><strong>Ring 2 (Standard):</strong> Trust score ≥ 0.60. Limited resource access, rate-limited.</li><li><strong>Ring 3 (Sandbox):</strong> Trust score &lt; 0.60. Heavily restricted. New or untrusted agents start here.</li></ul><p>Each ring enforces resource limits: maximum execution time per step, memory caps, CPU throttling, and request rate limits. An agent in Ring 3 might be limited to 10 API calls per minute with a five-second execution timeout, while a Ring 0 agent has no such restrictions.</p><p>The runtime also provides saga orchestration for multi-step operations. When an agent executes a sequence of actions — draft an email, send it, update the CRM — and the final step fails, the saga engine automatically calls compensating actions in reverse order. The email gets recalled, the draft gets deleted. This pattern, borrowed from distributed transaction processing, prevents the partial-completion failures that plague agentic workflows.</p><h3>Reliability engineering for agents</h3><p>When we built the Agent SRE package, we started with a question: How do you define “reliable” for an AI agent? Traditional SRE metrics like uptime and latency matter, but agents introduce new dimensions. An agent might be fast and available but produce incorrect results. It might be accurate but cost $500 per hour in API calls. It might work perfectly in isolation but cause cascading failures when it interacts with other agents.</p><p>We defined seven Service Level Indicators (SLIs) specific to AI agents: correctness, safety, latency, cost, availability, throughput, and delegation success rate. 
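</p><p>In code, each indicator reduces to a deterministic threshold check. Here is a minimal, illustrative sketch (the class and field names are hypothetical, not the toolkit’s actual API):</p>

```python
from dataclasses import dataclass

# Hypothetical sketch of per-agent SLI checks; names and thresholds
# are illustrative, not the Agent SRE package's actual API.
@dataclass
class SLI:
    name: str
    value: float              # measured over the evaluation window
    target: float             # the threshold it must satisfy
    higher_is_better: bool = True

    def breached(self) -> bool:
        """True when the indicator is out of budget."""
        if self.higher_is_better:
            return self.value < self.target
        return self.value > self.target

def breached_slis(slis: list[SLI]) -> list[str]:
    """Names of all indicators currently violating their threshold."""
    return [s.name for s in slis if s.breached()]

slis = [
    SLI("correctness", 0.993, 0.99),
    SLI("safety", 0.985, 0.99),                                    # breached: < 99%
    SLI("latency_p99_ms", 420.0, 500.0, higher_is_better=False),
    SLI("cost_usd_per_hour", 12.0, 10.0, higher_is_better=False),  # breached
]
print(breached_slis(slis))  # → ['safety', 'cost_usd_per_hour']
```

<p>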
Each SLI gets a threshold, and together they form an error budget — a quantified tolerance for failure.</p><p>Here’s where it gets interesting: The error budget drives automated remediation. When an agent’s safety SLI drops below 99 percent (meaning more than 1 percent of its actions violate policy), the system can automatically trigger a kill switch, downgrade the agent’s execution ring, or activate a circuit breaker that rejects new requests until the agent recovers.</p><p>We also built nine chaos engineering fault injection templates — network delays, LLM provider failures, tool timeouts, trust score manipulation, memory corruption, concurrent access races — because the only way to know if your agent system is resilient is to break it on purpose in controlled conditions.</p><h3>Covering the OWASP Agentic AI Top 10</h3><p>When OWASP published their Agentic AI Top 10, we mapped each risk to our toolkit’s capabilities and found that the architecture provides mitigations for all ten categories:</p><ul><li><strong>Goal hijacking</strong> is addressed by the policy engine’s semantic intent classifier.</li><li><strong>Tool misuse</strong> is mitigated by capability sandboxing and the MCP proxy.</li><li><strong>Identity abuse</strong> is addressed by DID-based identity and trust scoring.</li><li><strong>Supply chain risks</strong> are tracked by AI-BOM v2.0, which records model provenance, dataset lineage, and weight versioning.</li><li><strong>Code execution</strong> is constrained by execution rings and resource limits.</li><li><strong>Memory poisoning</strong> is detected by the Cross-Model Verification Kernel, which runs claims through multiple LLMs and uses majority voting to identify manipulation.</li><li><strong>Insecure communications</strong> are mitigated by the IATP protocol’s encryption layer.</li><li><strong>Cascading failures</strong> are addressed by circuit breakers and SLO enforcement.</li><li><strong>Human-agent trust exploitation</strong> is mitigated 
by approval workflows with quorum logic.</li><li><strong>Rogue agents</strong> are addressed by ring isolation, behavioral trust decay, and the kill switch.</li></ul><p>This alignment was by design, not by accident. The OS-inspired architecture creates defense in depth — multiple independent layers that each address different threat categories. No security system can guarantee absolute protection, but by layering complementary defenses, the toolkit significantly reduces the attack surface for autonomous AI agents.</p><h3>The interoperability challenge</h3><p>A governance toolkit is only useful if it works with the frameworks people actually use. We designed the toolkit to be framework-agnostic, with adapters that interoperate with LangChain, CrewAI, Google ADK, AutoGen, LlamaIndex, and others. Each adapter hooks into the framework’s native extension points — LangChain’s callback handlers, CrewAI’s task decorators, Google ADK’s plugin system — so that adding governance does not require rewriting existing agent code.</p><p>Several of these adapters are already working with production frameworks: Dify (65K+ GitHub stars) has the governance plugin in its marketplace, LlamaIndex (47K+ stars) has a TrustedAgentWorker, and proposals are active for AutoGen, CrewAI, Google ADK, and Haystack.</p><h3>What we learned</h3><p>Building this toolkit reinforced several lessons that apply beyond agent governance:</p><p><strong>Borrow from solved problems.</strong> The OS kernel, service mesh, and SRE playbook all addressed security and reliability challenges in other domains. Translating those patterns to AI agents was more effective than inventing from scratch.</p><p><strong>Make security the default, not an add-on.</strong> The reason we built governance into the execution path (intercepting actions) rather than as an optional wrapper is that optional security tends to go unadopted. If adding governance requires changing agent code, many teams will defer it. 
That said, no security layer is a silver bullet — defense in depth and ongoing monitoring remain essential.</p><p><strong>Trust is dynamic, not static.</strong> A binary trusted/untrusted model doesn’t capture reality. Trust scoring with behavioral decay and ring-based privilege assignment turned out to be a much better model for systems where agents are constantly changing.</p><p><strong>Statelessness enables everything.</strong> By making the kernel stateless, we got horizontal scaling, containerized deployment, and perfect auditability for free. Every decision we agonized over early in the architecture became easier once we committed to statelessness.</p><h3>Getting started</h3><p>The Agent Governance Toolkit is now open source under the MIT license at <a href="https://github.com/microsoft/agent-governance-toolkit">github.com/microsoft/agent-governance-toolkit</a>. You can install it with a single command:</p><pre>pip install ai-agent-compliance[full]</pre><p>This installs all four packages — Agent OS, AgentMesh, Agent Runtime, and Agent SRE — with version compatibility guaranteed. Individual packages are also available for teams that want to adopt governance incrementally.</p><p>The toolkit runs at sub-millisecond governance latency (&lt; 0.1ms p99), so it adds negligible overhead to agent execution. It exports metrics to OpenTelemetry-compatible platforms (Datadog, Prometheus, Grafana, Arize, Langfuse), and it works with Python 3.10+.</p><p>AI agents are becoming autonomous decision-makers in high-stakes domains — finance, healthcare, infrastructure, security. The question is not whether we need governance for these systems, but whether we will build it proactively, before incidents occur, or reactively, after them. We’ve chosen to be proactive. We hope you join us.</p><p><em>Imran Siddique is on </em><a href="https://www.linkedin.com/in/imransiddique1986/"><em>LinkedIn</em></a>.</p><p><em>The Agent Governance Toolkit is open source under the MIT license. 
Contributions welcome at github.com/microsoft/agent-governance-toolkit.</em></p><p><em>The author used AI-assisted tools during the drafting of this article. All technical content, code examples, and architectural descriptions reflect the actual capabilities of the Agent Governance Toolkit and have been reviewed for accuracy.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=767aacd2a927" width="1" height="1" alt=""><hr><p><a href="https://medium.com/data-science-at-microsoft/securing-ai-agents-with-agent-governance-767aacd2a927">Securing AI agents with agent governance</a> was originally published in <a href="https://medium.com/data-science-at-microsoft">Data Science + AI at Microsoft</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[OpenShell + Governance Toolkit: Engineering the Complete Agent Security Stack]]></title>
            <link>https://medium.com/@isiddique/openshell-governance-toolkit-engineering-the-complete-agent-security-stack-aab8d10ab8d3?source=rss-fc2c693dc406------2</link>
            <guid isPermaLink="false">https://medium.com/p/aab8d10ab8d3</guid>
            <category><![CDATA[openclaw]]></category>
            <category><![CDATA[openshell]]></category>
            <category><![CDATA[agent-reliability]]></category>
            <category><![CDATA[ai-agent-governance]]></category>
            <category><![CDATA[agent-monitoring]]></category>
            <dc:creator><![CDATA[Imran Siddique]]></dc:creator>
            <pubDate>Wed, 18 Mar 2026 06:55:31 GMT</pubDate>
            <atom:updated>2026-03-18T06:55:31.408Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*muSfXXEieJDt3TLKvcZTcA.png" /></figure><p>Recent vulnerability disclosures by Cisco highlighted data exfiltration risks in third-party OpenClaw skills, reminding us that prompt injection remains a critical threat to multi-agent systems.</p><p>Solving this requires defense in depth. NVIDIA recently announced <strong>OpenShell</strong> at GTC — an open-source sandboxed runtime for AI agents. OpenShell provides fine-grained filesystem, network, process, and inference controls.</p><p>However, runtime isolation alone cannot discern <em>intent</em> or <em>trust</em>. That is where the <a href="https://github.com/microsoft/agent-governance-toolkit"><strong>Agent Governance Toolkit</strong></a> comes in. By running the Governance Toolkit inside (or alongside) an OpenShell sandbox, we combine governance intelligence with strict execution boundaries.</p><p>We call this the <strong>“Walls + Brain” Architecture</strong>.</p><h3>The Capability Matrix: Walls vs. Brain</h3><p>OpenShell and the Agent Governance Toolkit solve fundamentally different halves of the agent security problem. 
They do not compete; they stack.</p><p><strong>OpenShell (The Walls) provides:</strong></p><ul><li>Container isolation (Docker/K3s)</li><li>Filesystem read/write policies</li><li>Network egress control (L7 proxy)</li><li>Process and syscall restrictions</li></ul><p><strong>The Governance Toolkit (The Brain) provides:</strong></p><ul><li>Agent Identity (Ed25519 cryptographic DIDs)</li><li>Behavioral trust scoring (5-dimension, 0–1000 scale)</li><li>Deterministic policy engines (YAML, OPA/Rego, Cedar)</li><li>Authority resolution and reputation-gated delegation</li><li>Tamper-evident Merkle audit chains</li></ul><p>OpenShell evaluates the environment: <em>“Is this network call allowed by the sandbox policy?”</em> The Governance Toolkit evaluates the actor: <em>“Should this specific agent be trusted to make this call at all?”</em></p><h3>The Request Flow: Defense in Depth</h3><p>In this integrated architecture, a single agent action (e.g., executing curl or writing to a file) must pass through two independent policy layers.</p><p><strong>Layer 1: The Governance Gate</strong></p><p>Before compute is initiated, the Governance sidecar evaluates the request. It verifies the agent’s cryptographic identity, checks if the dynamic trust score is above the required threshold, resolves delegated authority, and evaluates the intent against the declarative YAML/OPA policy. If approved, it commits the decision to the Merkle audit chain.</p><p><strong>Layer 2: The OpenShell Sandbox</strong></p><p>Once governance approves the action, OpenShell enforces the physical runtime constraints. 
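</p><p>Put together, a single action must clear both layers in order: the governance gate first, then the sandbox. A minimal sketch of this flow (the function names and policies are illustrative, not the actual OpenShell or toolkit API):</p>

```python
# Illustrative "Walls + Brain" check; both layers must ALLOW.
def governance_gate(agent_did: str, trust_score: float, url: str) -> bool:
    """Layer 1 (Brain): should this actor be trusted to attempt this?"""
    blocked_hosts = ["169.254.169.254"]       # cloud metadata endpoint
    if any(host in url for host in blocked_hosts):
        return False                          # deterministic policy DENY
    return trust_score >= 0.5                 # policy-defined trust threshold

def sandbox_gate(url: str) -> bool:
    """Layer 2 (Walls): does the egress policy permit this host?"""
    egress_allowlist = ["api.github.com"]
    return any(url.startswith(f"https://{host}") for host in egress_allowlist)

def execute(agent_did: str, trust_score: float, url: str) -> str:
    if not governance_gate(agent_did, trust_score, url):
        return "DENY (governance)"            # payload never reaches the sandbox
    if not sandbox_gate(url):
        return "DENY (sandbox)"
    return "ALLOW"

print(execute("did:mesh:a1b2c3", 0.82, "https://api.github.com/repos/org/repo/issues"))  # → ALLOW
print(execute("did:mesh:a1b2c3", 0.82, "http://169.254.169.254/metadata"))               # → DENY (governance)
```

<p>OpenShell remains the final arbiter at the runtime layer. 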
It ensures the process does not violate syscall restrictions and that the network egress proxy allows the specific host connection.</p><p>If <em>either</em> layer denies the action, execution is hard-blocked.</p><h3>Policy Layering in Action</h3><p>Here is what this looks like when an agent attempts to POST to a cloud metadata endpoint (169.254.169.254/metadata) due to a hallucination or prompt injection:</p><ol><li><strong>Layer 1 (Governance):</strong> The policy engine evaluates the request context. The policy explicitly blocks http:*:169.254.169.254/*.</li><li><strong>Result:</strong> The action is deterministically DENIED and logged with the violation reason. The agent&#39;s trust score is slashed.</li><li><strong>Execution:</strong> The payload never reaches the OpenShell runtime.</li></ol><p>Conversely, if the agent makes a legitimate request to api.github.com/repos/org/repo/issues:</p><ol><li><strong>Layer 1 (Governance):</strong> Identity verified (did:mesh:a1b2c3). Trust score is 0.82 (above the 0.5 threshold). Authority is confirmed as delegated by the parent agent. → ALLOW.</li><li><strong>Layer 2 (OpenShell):</strong> The network policy explicitly permits outbound POST traffic to api.github.com. The process policy permits the curl binary. → ALLOW.</li><li><strong>Execution:</strong> The action safely executes.</li></ol><h3>Deployment Topologies</h3><p>The integration supports flexible deployment models depending on your infrastructure:</p><ul><li><strong>Option A (Governance Skill):</strong> The toolkit is installed as an OpenClaw skill inside the sandbox. The agent natively invokes the validation scripts (check-policy.sh, verify-identity.sh) before taking action. (Note: We have just updated all 6 OpenClaw scripts to v1.1.0 to support the latest AgentMesh API).</li><li><strong>Option B (Governance Sidecar):</strong> For production, the toolkit runs as a sidecar proxy intercepting all tool calls on port 8081. 
OpenShell’s network policies are configured to block all outbound traffic <em>except</em> to the governance sidecar and approved LLM endpoints.</li></ul><h3>Unified Observability</h3><p>Running both layers generates two complementary telemetry streams. OpenShell emits physical logs (network egress, filesystem access, process execution), while the Governance sidecar emits behavioral metrics (policy_decisions_total, trust_score_current, authority_resolutions_total).</p><p>Because the toolkit natively exports via Prometheus/OpenTelemetry, both streams can be fed into a single Grafana dashboard, allowing Site Reliability Engineers to monitor both the physical sandbox and the agent’s trust economy simultaneously.</p><p><strong>Getting Started</strong></p><p>We have published the full architecture, sidecar setup options, and policy layering examples in our new integration guide (docs/integrations/openshell.md).</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=aab8d10ab8d3" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Running 11 AI Agents in Production: How the Agent Governance Toolkit Secures Our Workflows]]></title>
            <link>https://medium.com/@isiddique/running-11-ai-agents-in-production-how-the-agent-governance-toolkit-secures-our-workflows-10a6399638fc?source=rss-fc2c693dc406------2</link>
            <guid isPermaLink="false">https://medium.com/p/10a6399638fc</guid>
            <category><![CDATA[ai-governance]]></category>
            <category><![CDATA[agentos]]></category>
            <category><![CDATA[agent-mesh]]></category>
            <category><![CDATA[devsecops]]></category>
            <category><![CDATA[agentic-ai]]></category>
            <dc:creator><![CDATA[Imran Siddique]]></dc:creator>
            <pubDate>Thu, 12 Mar 2026 20:34:16 GMT</pubDate>
            <atom:updated>2026-03-12T20:34:16.736Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*EgWHAU4pfaBCxenKUzg1ow.png" /></figure><p>Everyone is building AI agents. Frameworks like LangChain, AutoGen, CrewAI, and the OpenAI Agents SDK are everywhere. But after deploying my first multi-agent system, I noticed a fundamental architectural gap: none of these frameworks answer the hard question.</p><p><strong><em>What strictly stops an agent from doing something it shouldn’t?</em></strong></p><p>I run the AI Native Team inside Microsoft. We build and ship AI-first tooling across our pipelines: code review, security scanning, spec drafting, test generation, and infrastructure validation. At any given moment, 11 specialized agents are running concurrently against our production repositories, making real decisions about real code.</p><p>That is 11 autonomous agents with access to tools, files, and APIs. Without governance, that is 11 distinct attack surfaces.</p><h3>Enter the Agent Governance Toolkit</h3><p>The <strong>Agent Governance Toolkit</strong> (<a href="https://github.com/microsoft/agent-governance-toolkit">microsoft/agent-governance-toolkit</a>) is an open-source middleware layer that sits between your agents and their execution environments. It is not another agent framework—it is a security kernel.</p><p>Every tool call, output, and agent-to-agent interaction passes through a deterministic policy engine before it executes.</p><p>Here is what the execution pipeline looks like:</p><blockquote>Agent Request → Trust Check → Governance Gate → Reliability Gate → Execute → Output Check → Audit Log</blockquote><p>The key engineering insight here is that <strong>safety decisions must be deterministic, not prompt-based.</strong> The policy engine uses strict pattern matching, capability models, and budget tracking. 
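</p><p>A minimal sketch of what such a deterministic check looks like (illustrative only; the kernel’s real API differs):</p>

```python
import re

# Illustrative governance check: regex patterns plus budget arithmetic.
# Pattern list and limits mirror a typical config; not the kernel's API.
BLOCKED_PATTERNS = [r"rm\s+-rf\s+/", r"\bDROP\s+TABLE\b", r"\bDELETE\s+FROM\b"]
MAX_TOKENS_PER_TASK = 8000

def check(action: str, tokens_used: int) -> tuple[str, str]:
    """Return (verdict, reason); the same input always yields the same verdict."""
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, action, re.IGNORECASE):
            return "DENY", f"Blocked pattern: {pattern}"
    if tokens_used > MAX_TOKENS_PER_TASK:
        return "DENY", f"Token budget exceeded: {tokens_used}/{MAX_TOKENS_PER_TASK}"
    return "ALLOW", ""

print(check("git status", 1200))          # → ('ALLOW', '')
print(check("rm -rf /tmp/scratch", 50))   # denied by the shell pattern
```

<p>Because every decision is a regex match or an integer comparison, identical inputs always produce identical verdicts. 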
There is no LLM involved in the safety layer, meaning zero hallucination risk and sub-millisecond enforcement.</p><h3>Real Numbers from a Production Instance</h3><p>Here is the telemetry from our production daemon, recorded over an 11-day continuous uptime window:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/708/1*7BniFjJNcRIRtshbHacuYQ.png" /></figure><p>Those 473 denials represent 473 times an agent tried to execute an unauthorized action and was hard-blocked. This includes token budget overflows, destructive shell patterns (rm -rf), SQL injection patterns (DROP TABLE), and tool call limit violations. Every single incident was caught deterministically and logged in under 8 milliseconds.</p><h3>The Architectural Flaw in Prompt-Based Governance</h3><p>When we evaluated our governance options, we looked heavily at prompt-based approaches like OpenClaw.</p><p>The fundamental problem with prompt-based governance is the recursive trust issue: <strong>You are using an LLM to decide whether an LLM should be allowed to do something.</strong> Here is how deterministic kernel-level governance compares to prompt-based safety:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/724/1*7ifW3No2kxEFRTAXKIq61A.png" /></figure><p>The latency difference alone dictates the architecture. Evaluating 7,000+ decisions across 11 agents with a 500ms LLM penalty would add nearly an hour of pure overhead. Our deterministic approach added exactly 0.43 seconds of total overhead across the entire 11 days.</p><h3>Snapshots: Governance in Action</h3><p>Because the governance is deterministic, the telemetry is incredibly clear. 
Here is what a live, healthy session looks like in our logs:</p><pre>2026-03-11 21:43:01 [GOVERNANCE] security-scanner → execute_task → ALLOW (0.377ms)<br>2026-03-11 21:43:34 [GOVERNANCE] code-reviewer → output_check → ALLOW (0.442ms)<br>2026-03-11 22:19:43 [GOVERNANCE] spec-drafter → execute_task → ALLOW (3.970ms)</pre><p>And here is what happens when a boundary is hit:</p><pre>2026-03-08 14:22:11 [GOVERNANCE] agent-42 → execute_task → DENY: Blocked pattern: rm -rf (0.12ms)<br>2026-03-09 09:15:33 [GOVERNANCE] researcher → execute_task → DENY: Token budget exceeded: 200/100 (0.08ms)<br>2026-03-10 16:44:02 [GOVERNANCE] agent-17 → execute_task → DENY: Tool call limit exceeded: 10/5 (0.05ms)</pre><h3>Configuration as Code</h3><p>We do not run separate infrastructure for this. The entire governance policy fits in a YAML block inside our daemon config:</p><pre>governance:<br>  enabled: true<br>  max_tokens_per_task: 8000<br>  max_tool_calls_per_task: 20<br>  max_files_changed: 15<br>  blocked_patterns:<br>    - &quot;rm -rf /&quot;<br>    - &quot;DROP TABLE&quot;<br>    - &quot;DELETE FROM&quot;<br>  policy_mode: strict   # strict | permissive | audit</pre><p>We utilize strict mode in production to hard-block violations, and audit mode in development to tune policies by logging intent without halting execution.</p><h3>The Three-Gate Architecture</h3><p>Robust infrastructure requires defense in depth. Governance here is not a single if/else statement; it is three independent execution gates:</p><ul><li><strong>GovernanceGate (Policy):</strong> Enforces blocked patterns, token budgets, and scope guards using the Agent-OS kernel.</li><li><strong>TrustGate (Identity):</strong> Each agent earns or loses trust based on compliance history. Built on AgentMesh’s 0–1000 trust scale, misbehaving agents are mathematically demoted.</li><li><strong>ReliabilityGate (SRE):</strong> Circuit breakers and SLO enforcement. 
If an agent’s error rate spikes, the circuit breaker trips and blocks further execution, powered by Agent SRE.</li></ul><p>All three gates must pass. A highly trusted agent can still be denied by a policy limit. A policy-compliant agent can still be blocked by a tripped circuit breaker.</p><h3>The Engineering Impact</h3><p>The feeling of running with a deterministic safety net is profound. It changes how you build.</p><ul><li><strong>We ship faster.</strong> With strict guardrails, we trust agents to operate with far more autonomy.</li><li><strong>We sleep better.</strong> Our daemon runs 24/7. The audit log tells us exactly what happened, when, and why. There are no black boxes.</li><li><strong>Compliance by default.</strong> We have deterministic coverage for the OWASP Agentic Top 10. When security review asks how we govern our agents, we simply hand them the YAML config and the audit logs.</li></ul><p>It is the exact difference between driving a mountain road without guardrails, and driving it with them. You can still drive fast; you just can’t drive off the cliff.</p><h3>Try It Yourself</h3><p>If you are running agents in production, wrap them in a safety kernel.</p><pre>pip install ai-agent-compliance[full]</pre><p>It takes one install to get the full governance stack. Wrap your existing agents — whether built on LangChain, AutoGen, CrewAI, or Swarm — and every action will route through the policy engine.</p><p>The Agent Governance Toolkit is open-source (MIT licensed) and available here: <a href="https://github.com/microsoft/agent-governance-toolkit">github.com/microsoft/agent-governance-toolkit</a>.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=10a6399638fc" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Engineering the Agent Hypervisor: OS Primitives for Multi-Agent Systems]]></title>
            <link>https://medium.com/@isiddique/engineering-the-agent-hypervisor-os-primitives-for-multi-agent-systems-9c9de6bdadf6?source=rss-fc2c693dc406------2</link>
            <guid isPermaLink="false">https://medium.com/p/9c9de6bdadf6</guid>
            <category><![CDATA[agent-hypervisor]]></category>
            <category><![CDATA[ai-architecture]]></category>
            <category><![CDATA[agentos]]></category>
            <category><![CDATA[agent-mesh]]></category>
            <category><![CDATA[agentic-ai]]></category>
            <dc:creator><![CDATA[Imran Siddique]]></dc:creator>
            <pubDate>Tue, 03 Mar 2026 00:53:02 GMT</pubDate>
            <atom:updated>2026-03-03T00:53:02.635Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*XuBu33nLGFHXDbXD" /></figure><p>Most of the discussion around “AI Safety” focuses on the model: red-teaming, alignment, and prompt injection. But as we build systems where dozens of autonomous agents interact, the problem shifts from model safety to system architecture.</p><p>In a multi-agent architecture, agents are effectively distributed microservices. However, unlike traditional microservices, which are governed by service meshes, mTLS, and strict IAM policies, agents currently operate in a state of implicit trust. If the “Summarizer Agent” receives a payload from the “Database Agent,” it blindly executes it.</p><p>To solve this, we cannot just add more system prompts. We need an operating system layer. Today, we are releasing the Agent Hypervisor within Agent-OS: a runtime supervisor that enforces strict execution boundaries for interacting agents.</p><p>Here is a technical breakdown of the core modules we implemented.</p><h3>1. Execution Rings (hypervisor.rings)</h3><p>Drawing inspiration from x86 protection rings, the hypervisor implements strict privilege separation for agents.</p><ul><li><strong>Ring 0 (Kernel):</strong> Reserved for highly trusted agents interacting with critical infrastructure (e.g., modifying IAM policies, executing raw SQL).</li><li><strong>Ring 3 (User Space): </strong>Reserved for public-facing or third-party agents.</li></ul><h3>2. Joint Liability and Vouching (hypervisor.liability)</h3><p>In a chain of agents, blame assignment is notoriously difficult. If Agent C executes a destructive action based on data from Agent A, who is penalized?</p><p>We introduced a cryptographic “Vouching” mechanism. When agents hand off tasks, they must sign the payload, accepting a degree of joint liability. If an anomaly is detected downstream, the slashing module automatically degrades the trust score of every agent in the vouching chain. 
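</p><p>The slashing logic itself is straightforward to sketch (illustrative code; the real hypervisor.liability module differs):</p>

```python
# Illustrative joint-liability slashing; scores and penalty are made up.
trust_scores = {"agent-a": 0.90, "agent-b": 0.85, "agent-c": 0.80}

def slash_chain(vouching_chain: list[str], penalty: float = 0.2) -> None:
    """On a downstream anomaly, degrade every signer in the chain."""
    for agent in vouching_chain:
        trust_scores[agent] = round(max(0.0, trust_scores[agent] - penalty), 2)

# Agent C executed a destructive action on data vouched for by A and B,
# so every agent that signed the handoff shares the penalty:
slash_chain(["agent-a", "agent-b", "agent-c"])
print(trust_scores)  # → {'agent-a': 0.7, 'agent-b': 0.65, 'agent-c': 0.6}
```

<p>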
This forces multi-agent systems into a state of defensive verification: agents will refuse payloads from peers with low trust scores.</p><h3>3. Distributed Rollbacks via Sagas (hypervisor.saga)</h3><p>When a multi-step agent workflow fails, you cannot simply drop the connection. State has likely been mutated.</p><p>Rather than relying on the LLM to figure out how to undo its mistakes, the Hypervisor implements the Saga pattern. It maintains an append-only state_machine of all side-effects. If an execution graph fails, the orchestrator steps in and sequentially triggers predefined compensating transactions (via the reversibility.registry) to restore the system to a clean state.</p><h3>4. Shared Session Context (hypervisor.session)</h3><p>Passing context windows between multiple agents is expensive and insecure. We implemented a Multi-Agent SSO (Single Sign-On). Agents join a verified “Session.” The hypervisor manages the shared memory and state commitments centrally, drastically reducing token overhead while maintaining a forensic, append-only audit trail (audit.commitment and audit.delta).</p><h3>Performance Constraints</h3><p>Adding a governance layer is useless if it creates an unacceptable bottleneck. The Hypervisor was written to be as close to the metal as possible in Python. According to our latest benchmark suite (bench_hypervisor.py), core ring computations execute at a mean latency of 0.3μs. It secures the execution graph without impacting the critical path of the application.</p><h3>The Path Forward</h3><p>We are moving past the era of “prompt engineering for safety” and into the era of <strong>Agentic Systems Engineering</strong>. 
By treating agents as untrusted compute nodes that require an OS-level hypervisor, we can build enterprise systems that fail gracefully and deterministically.</p><p>The complete implementation, along with the integrations for our CMVK (Cross-Model Verification Kernel) and IATP adapters, is available in the Agent-OS repository.</p><p><em>Originally published at </em><a href="https://www.linkedin.com/pulse/engineering-agent-hypervisor-os-primitives-systems-imran-siddique-z12tc/?trackingId=Xfk99uAKuwld60ROzKrpGw%3D%3D"><em>https://www.linkedin.com</em></a><em>.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=9c9de6bdadf6" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[The Architect’s Dilemma: Skills, Agents, or an Operating System?]]></title>
            <link>https://medium.com/@isiddique/the-architects-dilemma-skills-agents-or-an-operating-system-50aadbba811f?source=rss-fc2c693dc406------2</link>
            <guid isPermaLink="false">https://medium.com/p/50aadbba811f</guid>
            <category><![CDATA[sre-agent]]></category>
            <category><![CDATA[agent-hypervisor]]></category>
            <category><![CDATA[agentos]]></category>
            <category><![CDATA[agent-mesh]]></category>
            <category><![CDATA[ai-architecture]]></category>
            <dc:creator><![CDATA[Imran Siddique]]></dc:creator>
            <pubDate>Tue, 03 Mar 2026 00:48:25 GMT</pubDate>
            <atom:updated>2026-03-03T00:48:25.882Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*74tLaN0K2N9BOgBy" /></figure><p>In the rush to “<strong>agentize</strong>” everything, we’ve hit a structural wall. Most enterprise AI today is just a collection of “skills”, fancy prompt-wrappers that execute narrow tasks. On the other end, we have the “autonomous agents” promised by research papers that often hallucinate their way into a loop.</p><p>As someone who has spent years architecting systems, from heavy-duty backends to the front lines of the AI agent revolution, I’ve seen this pattern before. It’s the classic evolution from <strong>Scripts </strong>to <strong>Microservices </strong>to <strong>Cloud OS</strong>.</p><p>Below is a deep architectural comparison of the three paradigms: <strong>Skills-Based Execution, A2A (Agent-to-Agent) Protocols</strong>, and the emerging <strong>Agent OS/Mesh</strong> layer.</p><h3>1. Skills: The “Monolithic Wrapper”</h3><p>Most “agents” built today are actually <strong>Skill-based</strong>. You have one brain (the LLM) and a tool-belt.</p><ul><li><strong>Where it makes sense: </strong>When the task is linear. If you need to “Summarize this PDF and email it,” you don’t need a multi-agent swarm. You need a skill.</li><li><strong>Where it fails: </strong>Context bloat. As you add more skills, the prompt grows, the token cost spikes, and the model starts getting “confused” by the sheer number of tool definitions.</li></ul><h3>2. A2A (Agent-to-Agent): The “Microservices” of AI</h3><p>This is where decentralized communication protocols sit. It treats agents as independent actors that pass messages.</p><ul><li><strong>Where it makes sense:</strong> Specialized domains. You have a “SQL Agent,” a “Legal Agent,” and a “Creative Agent.” They talk to each other to solve a complex problem.</li><li><strong>The Problem: </strong>It’s messy. 
Without a centralized “OS,” these agents often get into infinite loops or lose track of the “source of truth.” It’s like running 50 microservices without Kubernetes.</li><li><strong>The Path Forward:</strong> We need standardized communication to make this work at scale. I’ve proposed a <a href="https://github.com/a2aproject/A2A/pull/1455">unified agent communication layer </a>to bridge this gap, ensuring that as agents talk, they do so within a governed framework.</li></ul><h3>3. Agent OS &amp; Agent Mesh: The “Scale by Subtraction” Path</h3><p>This is the perspective I’ve been building toward with the <strong>Agent OS</strong> and <strong>Agent Mesh </strong>projects. Instead of just adding more agents or better prompts, we introduce a <strong>Kernel</strong>.</p><ul><li><strong>The Agent OS (Kernel):</strong> It acts as the “Control Plane.” It enforces deterministic governance. If an agent tries to execute a “reversible” vs. “irreversible” action (like deleting a database), the Kernel intercepts it based on a <strong>Trust Protocol</strong>.</li><li><strong>The Agent Mesh: </strong>Think of this as a <strong>Service Mesh for AI</strong>. It uses a “Sidecar” pattern. The agent doesn’t need to know how to handle security or long-term memory; the Mesh handles that via the sidecar, leaving the agent to focus purely on the logic.</li><li><strong>Integration Strategy: </strong>To make this a reality, we must integrate these orchestration capabilities directly into the data and framework layers. For example, my work on <a href="https://github.com/run-llama/llama_index/pull/20644">introducing orchestration primitives in LlamaIndex</a> is designed to give the “Data Layer” the “OS” capabilities it needs to manage complex agent states.</li></ul><h3>Scenarios: When to Use What?</h3><h4>Scenario A: The Executive Assistant (Skills)</h4><p>You need to book meetings and read emails.</p><ul><li><strong>Verdict</strong>: <strong>Skills</strong>. Keep it simple. 
A single agent with access to Outlook and Calendar APIs is faster and cheaper.</li></ul><h4>Scenario B: Cross-Department Procurement (A2A)</h4><p>Legal needs to review a contract, Finance needs to check the budget, and Procurement needs to cut the PO.</p><ul><li><strong>Verdict</strong>: <strong>A2A</strong>. These are distinct roles with different data permissions. A2A protocols allow them to hand off the “baton” without sharing the entire context window.</li></ul><h4>Scenario C: Autonomous Cloud Operations (Agent OS/Mesh)</h4><p>A system that monitors telemetry and automatically scales or patches services in production.</p><ul><li><strong>Verdict</strong>: <strong>Agent OS</strong>. You cannot leave this to probabilistic LLM logic. You need a <strong>Self-Correcting Kernel</strong> that verifies every action against a verification primitive before it hits the infrastructure.</li></ul><h3>The Convergence: Will Agent OS eat A2A and LlamaIndex?</h3><p>In the long run, I believe <strong>Agent OS and Agent Mesh will eventually converge with frameworks like LlamaIndex and A2A protocols</strong>.</p><p>Right now, LlamaIndex is the “Data Layer” and A2A is the “Transport Layer.” But they are missing the <strong>Operating System Layer</strong>. Eventually, LlamaIndex will become a specialized “Storage Driver” within an Agent OS, and A2A will be the “Network Protocol” that the Agent Mesh uses to route traffic.</p><p>We are moving away from “Agent as a Program” toward “<strong>Agent as a Process.</strong>” And just like any process, it needs an OS to manage its memory, its permissions, and its life cycle.</p><blockquote><em>Key Takeaway: Stop building bigger agents. Start building a better kernel. 
Scale by subtraction: remove the coordination logic from the agent and move it into the Mesh.</em></blockquote><p><em>Originally published at </em><a href="https://www.linkedin.com/pulse/architects-dilemma-skills-agents-operating-system-imran-siddique-otadc/?trackingId=%2B6uYaIouHYzP1UkL5rtwBA%3D%3D"><em>https://www.linkedin.com</em></a><em>.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=50aadbba811f" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Engineering Safety: A Layered Governance Architecture for GitHub]]></title>
            <link>https://medium.com/@isiddique/engineering-safety-a-layered-governance-architecture-for-github-bb56d985c798?source=rss-fc2c693dc406------2</link>
            <guid isPermaLink="false">https://medium.com/p/bb56d985c798</guid>
            <category><![CDATA[ai-safety]]></category>
            <category><![CDATA[github-copilot]]></category>
            <category><![CDATA[aiguardrails]]></category>
            <category><![CDATA[agentic-ai]]></category>
            <category><![CDATA[agentos]]></category>
            <dc:creator><![CDATA[Imran Siddique]]></dc:creator>
            <pubDate>Thu, 19 Feb 2026 05:52:39 GMT</pubDate>
            <atom:updated>2026-02-19T05:52:39.927Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*_EIxda3QxBR0PgHtzpKymA.png" /></figure><p>Building safe AI agents requires more than just a good system prompt. It requires infrastructure that enforces constraints at every stage of the development lifecycle.</p><p>This week, we merged three contributions into the <a href="https://github.com/github/awesome-copilot">github/awesome-copilot</a> repository (#<a href="https://github.com/github/awesome-copilot/pull/755">755</a>, #<a href="https://github.com/github/awesome-copilot/pull/756">756</a>, #<a href="https://github.com/github/awesome-copilot/pull/757">757</a>). Together, they implement a <strong>layered governance architecture</strong> designed to help developers build secure agentic workflows by default.</p><p>Here is the technical breakdown of the implementation.</p><h3>Layer 1: Pre-Computation Safety (The Hook)</h3><h4><strong>Component:</strong> governance-audit</h4><p>We implemented a client-side hook that intercepts userPromptSubmitted events. This is a shell-based scanner that analyzes prompts against a regex library of known threat signatures before the request leaves the developer&#39;s machine.</p><ul><li><strong>Threat Categorization:</strong> We classify signals into 5 buckets: data_exfiltration (&quot;curl -d&quot;), privilege_escalation (&quot;chmod 777&quot;), system_destruction (&quot;rm -rf&quot;), prompt_injection, and credential_exposure.</li><li><strong>Local Execution:</strong> Privacy was a strict constraint. The scanning logic (audit-prompt.sh) runs entirely locally, ensuring no prompt data is sent to a third-party logger.</li><li><strong>Configurable Severity:</strong> The hook supports four governance levels (open, standard, strict, locked), allowing teams to balance friction vs. 
safety.</li></ul><p>This layer prevents “accidental” unsafe code generation by catching intent before it reaches the model.</p><h3>Layer 2: In-Context Pattern Matching (The Skill)</h3><h4><strong>Component:</strong> agent-governance</h4><p>To generate secure code, the model needs to understand valid security patterns. We added a skill definition that injects specific governance context into Copilot&#39;s retrieval path.</p><p>Key patterns covered:</p><ul><li><strong>Policy-as-Code:</strong> Standardizing on declarative YAML for allowlists/blocklists rather than hardcoding logic.</li><li><strong>Trust Scoring:</strong> Implementing decay-based trust models for multi-agent delegation (e.g., if Agent A fails a task, its score degrades; if it succeeds, it increments).</li><li><strong>Auditability:</strong> Enforcing append-only logging for all tool invocations.</li></ul><p>By formalizing these as a “Skill,” we ensure Copilot retrieves high-quality examples for PydanticAI and CrewAI rather than hallucinating insecure implementations.</p><h3>Layer 3: Post-Generation Verification (The Agent)</h3><h4><strong>Component:</strong> agent-governance-reviewer</h4><p>The final layer is verification. We introduced a specialized Copilot agent (agents/agent-governance-reviewer.agent.md) configured to act as a security linter.</p><p>Unlike a standard linter, this agent reviews for semantic safety:</p><ul><li><strong>Decorator Audits:</strong> Checks if sensitive tools are wrapped with the @govern decorator.</li><li><strong>Secret Detection:</strong> Scans for hardcoded secrets in agent configuration blocks.</li><li><strong>Trust Boundary Analysis:</strong> Verifies that multi-agent handoffs include explicit identity verification steps.</li></ul><h3>Conclusion</h3><p>This work represents a shift from “ad-hoc” safety to <strong>structural</strong> safety. 
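To make the Layer 1 idea concrete, here is a minimal Python sketch of a local prompt scanner. The merged hook is actually a shell script (audit-prompt.sh); the regex signatures and the level-to-category mapping below are simplified, illustrative assumptions rather than the repository's actual rule set.

```python
import re

# Illustrative signatures only; the real hook ships a larger regex
# library inside audit-prompt.sh (a shell script, not Python).
THREAT_SIGNATURES = {
    "data_exfiltration":    [r"curl\s+-d", r"wget\s+--post-data"],
    "privilege_escalation": [r"chmod\s+777", r"\bsudo\s+su\b"],
    "system_destruction":   [r"rm\s+-rf\s+/", r"\bmkfs\b"],
    "prompt_injection":     [r"ignore\s+(all\s+)?previous\s+instructions"],
    "credential_exposure":  [r"AKIA[0-9A-Z]{16}", r"BEGIN [A-Z ]*PRIVATE KEY"],
}

# Assumed mapping from governance level to hard-blocking categories;
# the actual per-level policy in the hook may differ.
BLOCKING = {
    "open":     set(),
    "standard": {"system_destruction", "credential_exposure"},
    "strict":   {"system_destruction", "credential_exposure",
                 "privilege_escalation", "data_exfiltration"},
    "locked":   set(THREAT_SIGNATURES),
}


def audit_prompt(prompt: str, level: str = "standard") -> dict:
    """Scan a prompt entirely locally; nothing leaves the machine."""
    matches = [
        category
        for category, patterns in THREAT_SIGNATURES.items()
        if any(re.search(p, prompt, re.IGNORECASE) for p in patterns)
    ]
    blocked = [c for c in matches if c in BLOCKING[level]]
    return {"matches": matches, "blocked": bool(blocked)}
```

Under "open" a match is still reported but never blocks, which is the friction-vs.-safety trade the four levels are meant to expose.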
By embedding these patterns directly into the developer’s IDE via the <a href="https://github.com/github/awesome-copilot/">awesome-copilot</a> standard, we reduce the friction of implementing robust governance.</p><p>This aligns with our broader work on <a href="https://github.com/imran-siddique/agent-os"><strong>Agent-OS</strong></a>, creating a standardized control plane for autonomous systems.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=bb56d985c798" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Why Your AI Agents Need Passports: Building Cryptographic Trust into Dify’s Visual Workflows]]></title>
            <link>https://medium.com/@isiddique/why-your-ai-agents-need-passports-building-cryptographic-trust-into-difys-visual-workflows-1081015063ea?source=rss-fc2c693dc406------2</link>
            <guid isPermaLink="false">https://medium.com/p/1081015063ea</guid>
            <category><![CDATA[ai-governance]]></category>
            <category><![CDATA[dify]]></category>
            <category><![CDATA[cybersecurity]]></category>
            <category><![CDATA[agentic-ai]]></category>
            <category><![CDATA[trust-layer]]></category>
            <dc:creator><![CDATA[Imran Siddique]]></dc:creator>
            <pubDate>Tue, 17 Feb 2026 19:49:26 GMT</pubDate>
            <atom:updated>2026-02-17T19:49:43.365Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*i7zTUbxvubHpOk5P0uhOXA.png" /></figure><p><em>Our AgentMesh Trust Layer was just merged into the Dify Marketplace. Here is what we built, why dynamic trust scoring changes everything, and what it looks like when governance becomes visible.</em></p><h3>The Problem Nobody Talks About</h3><p>Here is a question most multi-agent teams skip: <strong>When Agent A passes data to Agent B, how do you know Agent B is who it claims to be?</strong></p><p>In traditional microservices, we solved this decades ago using mTLS, service mesh certificates, and RBAC. Yet, in the AI agent world, we have regressed to simply trusting the system. If Agent B claims to be the summarizer, it is blindly handed customer data.</p><p>This is the exact gap we closed in Dify with the AgentMesh Trust Layer plugin (<a href="https://github.com/langgenius/dify-plugins/pull/2060">merged via PR #2060</a>).</p><h3>The Four Pillars of the Trust Layer</h3><p>The plugin introduces four specific tools that operate directly as nodes on Dify’s visual workflow canvas:</p><p><strong>1. </strong><strong>get_identity — Issue an Agent Passport:</strong> Every agent receives an Ed25519 cryptographic identity—a Decentralized Identifier (DID) backed by a public/private keypair. This is a cryptographically verifiable credential, not just a string label.</p><p><strong>2. </strong><strong>verify_peer — Check Who You Are Talking To:</strong> Before trusting data, this node verifies the peer&#39;s Ed25519 signature, validates the DID, and confirms the required capabilities. If verification fails, the workflow deterministically stops.</p><p><strong>3. </strong><strong>verify_step — Gate Nodes by Capability:</strong> Drop this node before any sensitive operation to check if an agent is authorized. You can literally see the governance gate on the Dify canvas explicitly blocking unauthorized paths.</p><p><strong>4. 
</strong><strong>record_interaction — The Trust Economy:</strong> Every agent starts with a neutral trust score of 0.5. Successes increase the score by +0.01, while failures drop it by a configured severity. If a hallucinating agent&#39;s score drops below 0.5, it is automatically quarantined by mathematics.</p><h3>The Trust Stack: Two Levels of Scoring</h3><p>The Dify plugin implements a simplified trust model designed for single-instance workflows, serving as an on-ramp to the full AgentMesh engine:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/562/1*-1Il7iMKOhrgxnFPc2nyTA.png" /></figure><h3>Why Visual Governance Matters</h3><p>Dify’s visual canvas makes governance tangible. In code-only frameworks, governance is middleware that logs in the background. In Dify, a verify_step node sits visibly between an LLM call and tool execution. Security teams can open the workflow and instantly understand the safety architecture without reading a single line of code.</p><h3>The Bigger Picture</h3><p>This merge is part of a broader push to make governance a default layer across the entire AI ecosystem:</p><ul><li><strong>✅ Merged:</strong> Dify, LlamaIndex, Microsoft Agent-Lightning.</li><li><strong>🔄 Open Proposals:</strong> CrewAI, AutoGen, LangGraph, Google ADK, Semantic Kernel, and more.</li></ul><p>Governance should not be a separate product bolted onto a system; it should be a first-class middleware node in every framework.</p><h3>Try It</h3><ul><li><strong>Install:</strong> Search “AgentMesh Trust Layer” in the Dify plugin marketplace.</li><li><strong>Source Code:</strong> Available on GitHub at <a href="https://github.com/imran-siddique/agent-mesh">imran-siddique/agent-mesh</a>.</li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=1081015063ea" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[The End of Implicit Trust: Bringing Cryptographic Identity to LlamaIndex Agents]]></title>
            <link>https://medium.com/@isiddique/the-end-of-implicit-trust-bringing-cryptographic-identity-to-llamaindex-agents-9d98361f073e?source=rss-fc2c693dc406------2</link>
            <guid isPermaLink="false">https://medium.com/p/9d98361f073e</guid>
            <category><![CDATA[agent-mesh]]></category>
            <category><![CDATA[llamaindex]]></category>
            <category><![CDATA[enterprise-ai]]></category>
            <category><![CDATA[ai-crypto]]></category>
            <category><![CDATA[ai-security]]></category>
            <dc:creator><![CDATA[Imran Siddique]]></dc:creator>
            <pubDate>Thu, 12 Feb 2026 21:10:46 GMT</pubDate>
            <atom:updated>2026-02-12T21:10:46.376Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*E3aYJv_YVbvBig3okXxcPQ.png" /></figure><p>In a production environment — especially in finance, healthcare, or enterprise data — allowing an LLM to blindly accept context from another agent is a security vulnerability.</p><p>“Implicit trust” (where Agent A assumes Agent B is friendly because they share a runtime) is no longer sufficient.</p><p>Today, we are announcing the <strong>Agent Mesh integration (llama-index-agent-agentmesh)</strong>. This is a fundamental hardening of the agentic stack, moving from “experimental swarms” to governed, identity-backed meshes.</p><h3><strong>The Core Shift: Identity vs. Credentials</strong></h3><p>Most agent frameworks treat identity as a static string. We are taking a different approach by separating <strong>Who you are</strong> from <strong>Your right to act</strong>.</p><p>With this integration, we are introducing a dual-layer security model:</p><ol><li><strong>Persistent Identity:</strong> The CMVKIdentity acts as the agent&#39;s permanent, cryptographic &quot;soul.&quot; It does not change.</li><li><strong>Ephemeral Credentials:</strong> The underlying Agent Mesh core manages the lifecycle. While the identity is static, the credentials used to sign requests have a strict <strong>15-minute TTL</strong> by default.</li></ol><p>This means that even if an agent’s keys were theoretically compromised, they would be useless within minutes. The system handles zero-downtime rotation automatically — a standard previously reserved for high-end microservices, now available for AI agents.</p><h3><strong>The Protocol: Verify, Then Trust</strong></h3><p>The integration forces a “Verify, Then Trust” workflow using TrustedAgentWorker and TrustGatedQueryEngine.</p><ul><li><strong>The Handshake:</strong> Before any data is exchanged, agents perform a cryptographic handshake. 
The TrustHandshake protocol verifies the peer&#39;s signature against the AgentRegistry—our &quot;Yellow Pages&quot; for trusted DIDs.</li><li><strong>Sponsor Accountability:</strong> Every action is traced back to a sponsor_email via the Delegation Chain. You might not know <em>which</em> user triggered the agent yet, but you will always know <em>who</em> deployed it and who is accountable for its actions.</li></ul><h3><strong>How It Works</strong></h3><p>The code remains clean, but the security posture tightens considerably. Here is how you wrap a standard query engine with the trust layer:</p><h4>Python</h4><pre>from llama_index.agent.agentmesh import (<br>    CMVKIdentity,<br>    TrustedAgentWorker,<br>    TrustGatedQueryEngine,<br>)<br><br># 1. Generate a verifiable identity <br># The integration handles the persistent identity; <br># the mesh core manages the 15-min credential rotation.<br>identity = CMVKIdentity.generate(&#39;research-agent&#39;, capabilities=[&#39;search&#39;])<br><br># 2. Create an agent that requires this identity<br>worker = TrustedAgentWorker.from_tools(<br>    tools=[search_tool],<br>    llm=llm,<br>    identity=identity,<br>)<br><br># 3. Gate your data access<br># The engine will now REJECT queries from agents without <br># valid, unexpired credentials.<br>trusted_engine = TrustGatedQueryEngine(<br>    query_engine=base_engine,<br>    identity=identity,<br>)</pre><h3><strong>What’s Next: The Road to OBO</strong></h3><p>While this release solves <strong>Agent-to-Agent</strong> trust and <strong>Sponsor</strong> accountability, we are already looking ahead. The current architecture secures the <em>pipeline</em>, but the next frontier is <strong>On-Behalf-Of (OBO)</strong> flows — passing the end-user’s context through the mesh to enforce granular, per-user access control.</p><p>For now, this integration ensures that your agents are no longer anonymous scripts running in the dark. 
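The persistent-identity vs. ephemeral-credential split can be reduced to a small sketch. Everything below (AgentIdentity, EphemeralCredential, the lazy rotation logic) is a hypothetical stand-in for illustration, not the actual llama-index-agent-agentmesh API; only the 15-minute TTL default comes from the integration.

```python
from __future__ import annotations

import secrets
import time
from dataclasses import dataclass

# From the integration: credentials rotate on a strict 15-minute TTL.
CREDENTIAL_TTL_SECONDS = 15 * 60


@dataclass
class EphemeralCredential:
    """The short-lived 'right to act': useless minutes after issuance."""
    token: str
    issued_at: float

    def expired(self) -> bool:
        return time.time() - self.issued_at > CREDENTIAL_TTL_SECONDS


@dataclass
class AgentIdentity:
    """The permanent 'who you are': the DID never changes."""
    did: str
    _cred: EphemeralCredential | None = None

    def credential(self) -> EphemeralCredential:
        # Zero-downtime rotation: lazily mint a fresh token once the
        # old one expires, without ever touching the persistent DID.
        if self._cred is None or self._cred.expired():
            self._cred = EphemeralCredential(secrets.token_hex(16), time.time())
        return self._cred
```

The point of the design: a stolen token ages out within minutes, while the DID, and therefore the audit trail, stays stable across rotations.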
They are verifiable, accountable services ready for production.</p><p>Check out the code in <a href="https://github.com/run-llama/llama_index/pull/20644">Pull Request #20644</a>.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=9d98361f073e" width="1" height="1" alt="">]]></content:encoded>
        </item>
    </channel>
</rss>