Cryptographic Guardrails For AI Agents

Prompt-based guardrails, LLM judges, and reasoning checks all share the same weakness: the enforcement mechanism is itself a model, models can be influenced, and a human is relied on for final checks. Our API replaces model judgment with formal verification and cryptography. Math, not trust.

Why traditional guardrails fail

Prompt-based systems have a second problem beyond security: opacity. When an LLM judge blocks or allows an action, you cannot prove why. There is no receipt. There is no audit trail that holds up under scrutiny. For regulated industries or any agent handling consequential decisions, 'the model said so' is not an acceptable answer.

The ICME API produces a cryptographic proof for every decision, a tamper-proof, independently verifiable record of exactly what your agent did and why it was allowed or blocked. Full observability is a property of the math, not an add-on.

If your agent handles money, sensitive data, or consequential decisions, this is how you make it provably safe.

What cryptography provides that traditional guardrails cannot

  1. Near-instant verification for guardrails.

  2. A cryptographic receipt you can share with any third party or agent.

  3. Jailbreak-proof enforcement.


Get started in 4 steps

1. Write your guardrail policy in plain English. Example: "Never send funds to an unverified wallet. Never approve transactions over $10,000 without a second confirmation."

2. ICME compiles it to formal logic. Your policy is translated to SMT-LIB and stored. No prompt engineering required.
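To make the compilation step concrete, here is a hypothetical sketch of how the example policy above might look in SMT-LIB. The variable names and encoding are illustrative assumptions, not the actual output of the ICME compiler.

```smt
; Illustrative sketch only -- variable names and encoding are assumed,
; not the actual compiled output of the ICME API.
(declare-const recipient_verified Bool)
(declare-const amount Real)
(declare-const second_confirmation Bool)

; "Never send funds to an unverified wallet."
(assert recipient_verified)

; "Never approve transactions over $10,000 without a second confirmation."
(assert (=> (> amount 10000.0) second_confirmation))

; SAT: the action is consistent with the policy (allowed).
; UNSAT: no assignment satisfies the constraints (blocked).
(check-sat)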

3. Check any agent action against it. Send the action to our API. The solver runs in under one second.

4. Get cryptographic proof your guardrails ran correctly. SAT = allowed. UNSAT = blocked. Every decision comes with a cryptographic proof you can verify independently.


In your terminal:
curl -s -X POST https://api.icme.io/v1/verifyPaid \
  -H 'Content-Type: application/json' \
  -d '{
    "policy_id": "d73134e6-05d6-46ba-9852-376c53fd7651",
    "action": "Send 1000 USDC to an unknown wallet."
  }' | jq .

// Example JSON response
{
  "result": "UNSAT",
  "blocked": true,
  "reason": "Action violates policy: unknown wallet not in verified set",
  "proof": "zk-proof-receipt-abc123..."
}
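On the client side, your agent should gate execution on the response. The sketch below parses the documented response fields ("result", "blocked", "reason", "proof") and only proceeds on SAT; the helper name and any schema details beyond the example above are assumptions.

```python
import json

def should_execute(response_body: str) -> bool:
    """Return True only if the solver proved the action complies (SAT).

    Hypothetical helper; field names follow the example response in this
    doc, not a full production schema.
    """
    decision = json.loads(response_body)
    return decision["result"] == "SAT" and not decision["blocked"]

# The example response from the doc above.
response_body = """
{
  "result": "UNSAT",
  "blocked": true,
  "reason": "Action violates policy: unknown wallet not in verified set",
  "proof": "zk-proof-receipt-abc123..."
}
"""

if should_execute(response_body):
    print("executing action")
else:
    # UNSAT means the solver found no compliant interpretation: block.
    print("blocked:", json.loads(response_body)["reason"])
```

Note the fail-closed design: anything other than an explicit SAT with `blocked: false` stops the action.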

No need for a human-in-the-loop.

Most AI safety approaches assume a human needs to review flagged decisions. That works at low volume. It breaks at scale.

The next wave of agentic commerce makes this unavoidable. AI agents are already purchasing inventory, executing trades, booking services, issuing refunds, and negotiating contracts: autonomously, at machine speed, on behalf of businesses and consumers. By the time a human reviewer sees a flagged action, the transaction has already happened or the opportunity has already passed. When your agent is executing hundreds or thousands of actions per minute, a human review queue is not a safety net; it is a bottleneck. And when the reviewer is approving decisions faster than they can read them, it is not safety at all.

ICME removes the human from the enforcement loop without removing accountability.


Want to go deeper?

Our blog covers the formal verification approach described in the ARc paper and how it applies to real production agent deployments, the role of zero knowledge machine learning (zkML) in producing cryptographic proofs of guardrail decisions, and why math-based enforcement is categorically different from prompt-based guardrails, LLM judges, and reasoning-based safety checks.

If you are building AI agents that handle money, sensitive data, or consequential decisions — and you want to understand why existing approaches to AI safety and agent guardrails fall short — the blog is where we go deep on the underlying computer science.

ICME Labs works at the intersection of cryptography, formal verification, and artificial intelligence. We believe the next generation of safe AI agents will be secured by proofs, not prompts.


Join a community of builders creating agent guardrails that do not fail.

GitHub

A fast zkVM born at a16z Crypto and substantially adapted by ICME Labs (NovaNet) for verifiable machine learning. ⚡ We use zero-knowledge machine learning for agentic guardrails.
