Cut your agent observability costs and make every trace auditor-proof
AI agent observability is getting expensive fast. Vendors charge per GB ingested, per host monitored, per feature enabled. Datadog bills can explode 100x over initial budgets. The Grafana Observability Survey found that 49% of OpenTelemetry users in production cite cost as a top concern, with scalability close behind at 44%.
Meanwhile, your agent is running thousands of actions a day, and your OTel pipeline captures all of them with the same trace depth. Formatting text gets the same span as transferring funds. Reading a config file gets the same treatment as sending customer data to an external API. You are paying to store telemetry for actions that carry zero compliance risk and will never be audited.
And when the audit does come, those traces won't help you anyway. OTel traces are logs in a database you control. Anyone with write access can edit them. A regulator can't tell a real trace from one you modified last Tuesday.
This guide shows how to solve both problems with ICME PreFlight: classify which agent actions matter (for free), enforce policy on the ones that do ($0.01 each), and attach a cryptographic proof to every decision. You stop paying to trace actions nobody cares about, and the traces you do keep become verifiable evidence.
Why OTel alone isn't enough for agent compliance
Mutable logs are not audit trails
Every compliance framework governing AI agents requires tamper-evident records. HIPAA §164.312(b) requires mechanisms to record and examine activity on systems containing PHI. The SEC requires attributable records of advisory activities. The EU AI Act requires technical documentation and traceability.
OTel traces stored in Jaeger, Grafana Tempo, or a vendor backend don't meet that bar. Organizations are building Merkle tree hash chains, digital signatures, and WORM storage to make their logs tamper-evident. That's a lot of infrastructure to solve a problem cryptographic proofs solve natively.
Regulators aren't waiting
33% of organizations lack audit trails for their AI agent activity. Only 14.4% reported full security approval for their entire agent fleet. The SEC and OCC are actively examining AI governance. In financial services, missing traces are treated as a books-and-records violation.
The question has shifted from "who did what?" to "can you prove it?" Proving it means the auditor doesn't have to trust your infrastructure to verify your claims.
Traces show what happened, not whether it was allowed
OTel tells you a function was called with these arguments at this timestamp. It doesn't tell you whether the agent was permitted to do it, which policy was evaluated, or what the decision was. When an incident turns into a blame game, you need the enforcement decision, not just the execution trace.
How PreFlight plugs into your OpenTelemetry pipeline
PreFlight adds three things no processor, exporter, or backend can provide: free action classification, deterministic policy enforcement, and a cryptographic proof on every decision.
checkRelevance (free) classifies whether an action touches any of your policy variables: data access scope, transmission endpoints, retention duration, transaction amounts. Actions that match nothing get a lightweight span. Actions that match get full enforcement and tracing. You stop paying to store telemetry for actions that will never be questioned.
checkIt ($0.01) checks the action against a formally verified policy using an SMT solver, not an LLM judge. The result is deterministic: same input, same output, every time. SAT means proceed. UNSAT means blocked before execution. The solver can't be prompt-injected or socially engineered.
The ZK proof is generated on every checkIt call. Anyone can verify it in under one second, without re-running the computation, without trusting your infrastructure, and without seeing your policy. The proof is the audit trail.
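The routing described above can be sketched in a few lines. This is a minimal illustration, not the PreFlight client: `route_action`, the `Decision` type, and the shape of the dict returned by the paid check are all assumptions, and the two checks are passed in as plain callables so you can plug in whatever client code you actually use.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    trace_depth: str           # "light" or "full"
    allowed: bool              # False only when the policy check returns UNSAT
    result: Optional[str]      # "SAT", "UNSAT", or None when no paid check ran
    proof_id: Optional[str]    # proof reference from the paid check, if any


def route_action(
    action: str,
    check_relevance: Callable[[str], bool],
    check_it: Callable[[str], dict],
) -> Decision:
    """Screen an action for free; pay for a policy check only if it is relevant."""
    if not check_relevance(action):
        # No policy variable touched: lightweight span, no paid check.
        return Decision("light", True, None, None)

    # Assumed response shape: {"result": "SAT" | "UNSAT", "zk_proof_id": "..."}
    outcome = check_it(action)
    return Decision(
        trace_depth="full",
        allowed=outcome["result"] == "SAT",
        result=outcome["result"],
        proof_id=outcome.get("zk_proof_id"),
    )
```

Keeping the two checks as injected callables also makes the routing easy to unit-test without network access.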
What your OpenTelemetry backend looks like after
Benign action (formatting, summarizing, file reading):
Minimal. Cheap. Nobody will ever query this span for compliance.
Policy-checked action that passed:
The zk_proof_id is what changes everything. A regulator calls POST /v1/verifyProof with that ID and gets independent cryptographic confirmation the policy check happened correctly. No access to your systems required.
Policy-checked action that was blocked:
Proof that the violation was caught. Proof that it was blocked. Proof that the policy was evaluated correctly. All verifiable without trusting you.
Where to intercept in your framework
Every major agent framework exposes a hook where tool calls can be inspected and blocked before execution. That hook is where you serialize the tool call into a plain English action string and send it to PreFlight.
LangChain / LangGraph
LangChain's wrap_tool_call hook intercepts each tool execution individually. You get a ToolCallRequest containing the tool call dict (tool name + arguments) and the BaseTool instance. Serialize the call, check it, and either call the handler to proceed or raise to block.
See LangChain middleware docs and the wrap_tool_call reference.
OpenAI Agents SDK
The OpenAI Agents SDK uses Guardrail objects that validate inputs and outputs. For tool-level interception, wrap each tool function to check before execution.
Strands Agents (AWS)
Strands provides BeforeToolCallEvent, a hook that fires before every tool execution. Set event.cancel_tool to block.
See the Strands guardrails guide for details on the cancel_tool mechanism.
CrewAI
CrewAI doesn't have a middleware layer in the same sense. The standard pattern is to wrap your tool functions directly.
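A framework-agnostic way to wrap a tool function is a decorator that runs the check before the tool body. The sketch below is an assumption-heavy illustration: `guarded`, `PolicyViolation`, `describe`, and `fake_check` are hypothetical names, and `fake_check` is a local stand-in for the real policy call. Apply the decorator to the plain function before you register it with your framework.

```python
import functools
from typing import Any, Callable


class PolicyViolation(Exception):
    """Raised when the policy check does not return SAT."""


def guarded(check: Callable[[str], str], describe: Callable[..., str]):
    """Wrap a tool so `check` sees its serialized call before the tool runs."""
    def decorator(tool: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(tool)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            action = describe(tool.__name__, *args, **kwargs)
            verdict = check(action)
            if verdict != "SAT":
                # Block before execution: the tool body never runs.
                raise PolicyViolation(f"{verdict}: {action}")
            return tool(*args, **kwargs)
        return wrapper
    return decorator


def describe(name: str, *args: Any, **kwargs: Any) -> str:
    # Hypothetical serializer; replace with one that emits the facts your policy needs.
    return f"Call tool {name} with args={args} kwargs={kwargs}"


def fake_check(action: str) -> str:
    # Stand-in for the real policy check, for local testing only.
    return "UNSAT" if "https://" in action else "SAT"


@guarded(fake_check, describe)
def search_email(query: str, limit: int = 10) -> str:
    return f"{limit} results for {query!r}"


@guarded(fake_check, describe)
def post_data(url: str) -> str:
    return f"posted to {url}"
```

Raising an exception keeps the block decision visible to the agent loop, which can surface it to the model or the user instead of failing silently.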
Serializing tool calls into action strings
The solver evaluates concrete facts in the action text. Your serialization function should include the tool name, all arguments, and any facts the policy references (record counts, destination URLs, storage behavior).
The solver reads "returning up to 500 results" and extracts numberOfEmailsAccessed: 500. It reads "HTTP POST to https://vendor.com/api" and extracts isExternalTransmission: true and destinationUrl. The more concrete your serialization, the more reliably the policy evaluates.
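A serializer along these lines might look like the following. It is a sketch under assumptions: `serialize_tool_call` and its output wording are hypothetical, and the exact phrasing that works best for your policy is something to confirm by testing against it.

```python
from typing import Any, Mapping, Optional, Sequence


def serialize_tool_call(
    tool_name: str,
    args: Mapping[str, Any],
    facts: Optional[Sequence[str]] = None,
) -> str:
    """Render a tool call as plain English with every argument spelled out."""
    # Sort keys so the same call always produces the same action string.
    rendered = ", ".join(f"{key}={args[key]!r}" for key in sorted(args)) or "none"
    action = f"Call tool {tool_name} with arguments {rendered}."
    if facts:
        # Facts are short statements the policy can evaluate directly,
        # e.g. record counts, destination URLs, storage behavior.
        action += " Facts: " + "; ".join(facts)
    return action


print(serialize_tool_call(
    "search_email",
    {"query": "Q3 invoices", "max_results": 500},
    ["returning up to 500 results", "no external transmission"],
))
# Call tool search_email with arguments max_results=500, query='Q3 invoices'. Facts: returning up to 500 results; no external transmission
```

Sorting the argument keys makes the string deterministic, which matters when the check itself is deterministic: identical calls should serialize to identical input.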
Full OpenTelemetry integration
The middleware examples above handle enforcement. To add the OTel tracing layer with checkRelevance routing and zk_proof_id on every span, wrap the check logic with span creation:
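One way to sketch the span side of that wrapping is shown here. The attribute keys other than `zk_proof_id` and `check_id` are hypothetical, as is the shape of the `check` dict. The function only relies on a `set_attribute(key, value)` method, which OpenTelemetry spans provide, so the sketch runs without an SDK by using a small recording stand-in.

```python
from typing import Any, Dict, Optional, Protocol


class SpanLike(Protocol):
    def set_attribute(self, key: str, value: Any) -> None: ...


class RecordingSpan:
    """Tiny stand-in for a real span so the sketch runs without an OTel SDK."""

    def __init__(self) -> None:
        self.attributes: Dict[str, Any] = {}

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


def annotate_span(span: SpanLike, relevant: bool, check: Optional[dict] = None) -> None:
    """Attach the relevance result and, for checked actions, the decision and proof IDs."""
    span.set_attribute("preflight.relevant", relevant)
    if not relevant or check is None:
        return  # benign action: keep the span minimal

    # Assumed check shape: {"result": ..., "check_id": ..., "zk_proof_id": ...}
    span.set_attribute("preflight.result", check["result"])
    span.set_attribute("preflight.blocked", check["result"] != "SAT")
    span.set_attribute("preflight.check_id", check["check_id"])
    span.set_attribute("preflight.zk_proof_id", check["zk_proof_id"])


span = RecordingSpan()
annotate_span(span, True, {"result": "UNSAT", "check_id": "chk_1", "zk_proof_id": "prf_1"})
print(span.attributes["preflight.blocked"])  # True
```

In a real pipeline you would pass the span yielded by `tracer.start_as_current_span(...)` instead of `RecordingSpan`.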
Setting up a policy
1. Create an account
2. Compile a policy
A good Automated Reasoning (AR) policy uses concrete, extractable variables: numbers, URLs, and booleans that the solver can read directly from the action text. Vague conditions like "scope matches user intent" or "approved endpoints" cause translation failures because the solver has no way to evaluate them. See AWS AR best practices for why this matters.
Every variable in the policy below maps to a fact your middleware can serialize from a tool call: a record count, a destination URL, a storage flag.
This example covers a personal data access agent with access to email, calendar, contacts, and documents:
The solver extracts variables like numberOfEmailsAccessed, isExternalTransmission, destinationUrl, and retainsDataAfterResponse from these rules. When your middleware serializes a tool call, it includes these as concrete facts: "POST 500 email bodies to https://vendor.com/api" gives the solver a URL and a record count to evaluate.
Save the policy_id. Compilation costs 300 credits, one-time. See the full use case walkthrough for details on variable extraction and battle testing to verify the solver interprets your policy correctly.
3. Set environment variables
What gets caught
The action strings below are what your middleware would serialize from real tool calls. Each one includes the concrete facts the solver needs: record counts, destination URLs, storage behavior.
SAT: normal email search, small scope, no external transmission
Result: SAT. 10 emails is under the 25 limit. No external transmission. No persistent storage. The proof records the pass.
UNSAT: external transmission with a destination URL
Result: UNSAT. The solver extracts isExternalTransmission: true and destinationUrl: https://summarizer.external-vendor.com/v1/batch. The policy blocks all outbound HTTP requests. The data never leaves. The proof records the block.
UNSAT: record count exceeds the limit
Result: UNSAT. The solver extracts numberOfEmailsAccessed: 500, which exceeds the 25-message limit. The agent wanted to read 500 emails to answer a question about one thread. The proof captures the exact count that violated the policy.
UNSAT: persistent storage of extracted data
Result: UNSAT. The solver extracts retainsDataAfterResponse: true. Writing to a database is persistent storage. The proof records the violation.
UNSAT: instruction injected via calendar event
Result: UNSAT. 847 documents exceeds the 10-document limit. The proof captures it regardless of where the instruction came from.
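To make the five outcomes above concrete, here is a plain-Python restatement of the limits quoted in this section. It is not the SMT solver and not the compiled policy: `Facts`, `violations`, and `verdict` are hypothetical helpers, and the only values taken from the text are the 25-email limit, the 10-document limit, and the bans on external transmission and persistent storage. A table like this is mainly useful as a local fixture for testing that your serializer emits the facts you expect.

```python
from dataclasses import dataclass
from typing import List

MAX_EMAILS = 25      # "25-message limit" quoted above
MAX_DOCUMENTS = 10   # "10-document limit" quoted above


@dataclass
class Facts:
    emails_accessed: int = 0
    documents_accessed: int = 0
    external_transmission: bool = False
    retains_data: bool = False


def violations(facts: Facts) -> List[str]:
    """List which of the quoted limits a set of extracted facts would break."""
    found = []
    if facts.emails_accessed > MAX_EMAILS:
        found.append("email count over limit")
    if facts.documents_accessed > MAX_DOCUMENTS:
        found.append("document count over limit")
    if facts.external_transmission:
        found.append("external transmission")
    if facts.retains_data:
        found.append("persistent storage")
    return found


def verdict(facts: Facts) -> str:
    # Mirrors the SAT/UNSAT vocabulary only; the real decision comes from the solver.
    return "UNSAT" if violations(facts) else "SAT"


print(verdict(Facts(emails_accessed=10)))       # SAT
print(verdict(Facts(documents_accessed=847)))   # UNSAT
```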
Verifying proofs
Verify cryptographically:
Check proof metadata:
Download the raw proof binary:
The proof confirms: this specific policy was checked against this specific action, by this specific solver, and returned this specific result. No re-execution required. No trust in the provider required. No policy exposure required.
What people are building today vs. what this gives you
| Goal | What people are building today | What this gives you |
| --- | --- | --- |
| Tamper-evident audit trail | Merkle tree hash chains, blockchain anchoring, WORM storage | ZK proof on every check, verifiable by anyone, no infrastructure needed |
| Prove a policy was enforced | Log entries in a mutable database | Cryptographic proof the solver evaluated the policy correctly |
| Reduce observability cost | Sample traces, drop low-value spans in Collector | checkRelevance (free) classifies at source so you only trace what matters |
| Answer "was the agent allowed to do this?" | Reconstruct from logs after the fact | check_id + proof_id on the OTel span, answer is immediate and verifiable |
| Third-party verification | Grant auditor access, or download proofs for later checks. See API docs | Auditor calls /v1/verifyProof with zero access to your systems |
| Policy privacy during audit | Expose your rules to the verifier | ZK proof verifies without revealing the policy |
Cost
| Item | Price |
| --- | --- |
| checkRelevance | Free |
| checkIt | 1 credit ($0.01) |
| Policy compilation | 300 credits ($3.00), one-time |
| Account creation | $5.00 (gives 325 credits) |
| ZK proof verification | Free (included with check) |
| Credit top-up | $5 to $100 (volume bonuses up to 20%) |
For an agent averaging 1,000 actions/day where 15% are policy-relevant: 150 paid checks/day, or $1.50/day. All 1,000 relevance screenings are free, and the 850 actions that match nothing never reach a paid check. Compare that to what you're paying to store and query 1,000 full-depth traces per day in your observability backend.
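The arithmetic is easy to rerun for your own traffic. The helper below is a small sketch: `daily_check_cost` is a hypothetical name, and the only prices used are the ones listed above (1 credit per checkIt call, 1 credit = $0.01). It ignores the one-time compilation cost and any top-up bonuses.

```python
from typing import Tuple

CREDITS_PER_CHECK = 1     # checkIt price listed above
CREDITS_PER_DOLLAR = 100  # 1 credit = $0.01


def daily_check_cost(actions_per_day: int, relevant_fraction: float) -> Tuple[int, float]:
    """Return (paid checks per day, cost in USD per day)."""
    paid_checks = round(actions_per_day * relevant_fraction)
    credits = paid_checks * CREDITS_PER_CHECK
    return paid_checks, credits / CREDITS_PER_DOLLAR


paid, cost = daily_check_cost(1000, 0.15)
print(paid, cost)  # 150 1.5
```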
Production checklist

