Venice AI + Preflight
Private AI guardrails for private AI inference. Every prompt encrypted, every policy hidden, verifiable decisions that reveal nothing.
Inference privacy starts with Venice. Venice runs your agent's inference inside a hardware-isolated TEE (Intel TDX paired with NVIDIA GPU attestation) that no party, not even Venice itself, can see into. Your prompt is encrypted on-device using a key bound to the verified enclave, decrypted only after the enclave proves its identity through remote attestation, and the response is signed by the same hardware on the way back.
But inference privacy is only one of three pillars.
Policies that provide guardrails for agent actions are generally not private. Typically, when your agent decides to act (transfer funds, call an API, sign a contract, move data), that action gets evaluated against a policy using guardrail systems that leak the policy in the act of enforcing it. LLM judges return verbose explanations that reveal the rules. Allow/block lists tell the action-taker exactly which rule they failed. Audit trails store the policy in plaintext for anyone with read access. The policy is supposed to be your security boundary, but conventional enforcement turns it into a published spec the moment it fires.
By contrast, Preflight keeps the policy private at enforcement, at verification, and in audit.
Your policy rules are compiled to SMT-LIB formal logic and live only on Preflight's side. The party being checked never sees the policy. The auditor verifying a decision never sees the policy. The counterparty confirming compliance never sees the policy.
Every party downstream of enforcement receives only a SAT/UNSAT result, a generic reason, and a proof identifier. Each action runs through a three-stage pipeline: a local extraction model parses the action text into structured variables, an SMT-LIB-compatible solver evaluates those variables against the compiled policy, and an Automated Reasoning pass independently translates the raw action text into formal logic and checks it against the same policy. A consensus rule reconciles the outputs. The system fails closed: any UNSAT from any path blocks the action. When the Automated Reasoning pass returns TRANSLATION_AMBIGUOUS, the action can still clear if the extractor and the solver both confirm SAT. None of the pipeline stages leak the policy in their outputs. Each decision is sealed into a SNARK that anyone can verify cryptographically without seeing the action, the policy, or any business data.
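The consensus rule described above can be sketched in a few lines. This is a minimal illustration and not Preflight's implementation: the `Verdict` enum, the `consensus` function, and its parameter names are hypothetical, and it assumes each path reports one of the three outcomes named on this page.

```python
from enum import Enum


class Verdict(Enum):
    SAT = "SAT"
    UNSAT = "UNSAT"
    TRANSLATION_AMBIGUOUS = "TRANSLATION_AMBIGUOUS"


def consensus(extractor: Verdict, solver: Verdict, reasoning: Verdict) -> bool:
    """Return True only when the action may proceed (fail closed otherwise)."""
    paths = (extractor, solver, reasoning)
    # Any UNSAT from any path blocks the action.
    if Verdict.UNSAT in paths:
        return False
    # The extractor and the solver must both positively confirm SAT.
    if extractor is not Verdict.SAT or solver is not Verdict.SAT:
        return False
    # The Automated Reasoning pass may agree (SAT) or be ambiguous; both clear.
    return reasoning in (Verdict.SAT, Verdict.TRANSLATION_AMBIGUOUS)
```

Under these assumptions, an ambiguous translation never clears an action on its own. It only avoids blocking an action that the other two paths have already confirmed.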
Three privacy pillars combine end to end today, with a fourth on the near-term roadmap:
Inference privacy (Venice E2EE). Venice cannot read your prompts.
Policy privacy (Preflight). The agent and the action-taker cannot read your rules.
Verification privacy (Preflight). Auditors, counterparties, and regulators verify each decision cryptographically without ever seeing your rules, your action, or any business data.
Extraction-layer privacy (JOLT Atlas, roadmap). Action text and extracted variables stay private from ICME's pipeline at check time, once JOLT Atlas zkML proofs are integrated into Preflight's checkIt pipeline.
Status. This page describes the Venice + Preflight integration as designed. The deployed systems are the source of truth: Venice's TEE/E2EE attestation surface at docs.venice.ai, Venice's model registry at GET /v1/models, and Preflight's public verification endpoint at api.icme.io/v1/verifyProof. Implementation details may evolve. If anything here conflicts with either system's actual behavior, the deployed system wins.
The full privacy stack
Most Venice integrations stop at inference privacy. That leaves the policy exposed at enforcement time and again at verification time. Both are often more sensitive than the prompt itself. The full stack covers three deployed pillars today and one extraction-layer pillar on the near-term roadmap.
| Pillar | What stays private | Private from | How |
| --- | --- | --- | --- |
| Inference privacy | Prompts, outputs, chain-of-thought | Venice, GPU operators, network observers | Venice E2EE. Encrypted on-device, decrypted only inside an attested TEE |
| Policy privacy | Rules, logic, structure | The agent, the action-taker, counterparties | Preflight. Policy compiled to SMT-LIB and never returned to any caller |
| Verification privacy | Action, policy, rules, solver internal state | Auditors, counterparties, regulators | SNARK-backed receipts. Verifiers see only valid, policy_hash, claimed_result, used, plus proof metadata |
| Extraction-layer privacy (roadmap) | Action text and extracted variables | ICME's pipeline at check time | JOLT Atlas zkML proofs over the extraction step. JOLT Atlas is functional open-source software today (arXiv:2602.17452, benchmarked GPT-2 / nanoGPT proofs); production integration into Preflight's checkIt pipeline is the remaining engineering work |
A note on Venice privacy modes. Venice offers four privacy tiers: Anonymous (frontier models proxied with identity stripped), Private (zero data retention enforced contractually), TEE (Pro: hardware-isolated enclaves with remote attestation operated by external partners NEAR AI Cloud and Phala Network), and E2EE (Pro: end-to-end encrypted, decrypted only inside an attested enclave). The code samples below use E2EE because it is the only mode that gives cryptographic, third-party-verifiable inference privacy. If you do not need attestation, you can substitute a Private-tier model and skip the attestation steps; privacy guarantees in that case are contract-based rather than hardware-verified.
A note on E2EE and tool calling. E2EE mode disables function calling, web search, file uploads, and the Venice system prompt (per Venice's documented limitations). E2EE is the right choice for any flow where the model emits a structured action as text and your application interprets it, including every code sample below. If you want the model itself to dispatch tool calls inside a single turn (Venice agentic chat at chat/v2), use a TEE-mode model instead. You retain hardware-attested inference privacy; you give up client-side encryption, which is incompatible with the enclave dispatching tools on your behalf. The agentic chat pattern is covered in its own section near the end of this page.
A note on the proving system. Preflight's SNARKs are generated today by JOLT, ICME's adaptation of the JOLT zkVM. The zkVM came out of a16z crypto and collaborators and was originally developed by Arasu Arun at NYU, Srinath Setty at Microsoft Research, and Justin Thaler at a16z crypto and Georgetown University (Cryptology ePrint 2023/1217). It produces a succinct cryptographic proof that the verification result was correctly computed against the compiled policy. Any third party can verify the proof against ICME's public verifyProof endpoint in sub-second time, with no API key, learning nothing beyond the receipt fields. The JOLT proving pipeline is functional and generates real cryptographic proofs today. It has not yet completed a formal security audit. Proofs are single-use against verifyProof, so retrieve them promptly via GET /v1/proof/{id} if you need to retain or re-share them.
A note on JOLT Atlas (zkML). Two stages of Preflight's pipeline matter for proof coverage today: a local extraction model that maps action text to structured variables, and an SMT-LIB-compatible solver that evaluates those variables against the compiled policy. The JOLT zkVM cryptographically proves the second stage, so any verifier can confirm the solver correctly decided SAT or UNSAT given the extracted variables. The first stage, the extractor itself, runs today without a cryptographic proof, so verifiers currently trust that the extractor faithfully translated the action text into the variables the solver evaluated.
JOLT Atlas closes this gap. JOLT Atlas is ICME's zkML system built on the JOLT zkVM that produces cryptographic proofs of AI inference steps. Co-authored by ICME CTO Wyatt Benno and detailed in arXiv:2602.17452, it extends end-to-end cryptographic coverage to the extraction layer. When JOLT Atlas is integrated into the Preflight checkIt pipeline, a single SNARK will attest to the full chain: the extractor mapped these variables from the action text, and the SMT solver correctly evaluated those variables against the policy with this policy_hash.
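To make the boundary between the two stages concrete, here is a toy sketch. Everything in it is a stand-in chosen for illustration: a regular expression plays the role of the extraction model, a plain Python predicate plays the role of the compiled SMT-LIB policy, and the 500 USDC limit is a hypothetical rule. The comments mark which step is covered by a proof today and which is not.

```python
import re
from typing import Callable, Dict

Variables = Dict[str, int]


def extract(action_text: str) -> Variables:
    """Stage 1 stand-in: map action text to structured variables."""
    match = re.search(r"(\d+)\s*USDC", action_text)
    if match is None:
        raise ValueError("no amount found in action text")
    return {"amount_usdc": int(match.group(1))}


def solve(variables: Variables, policy: Callable[[Variables], bool]) -> str:
    """Stage 2 stand-in: evaluate the variables against the policy."""
    return "SAT" if policy(variables) else "UNSAT"


def example_policy(variables: Variables) -> bool:
    # Hypothetical rule: a payment must not exceed 500 USDC.
    return variables["amount_usdc"] <= 500


def check(action_text: str) -> str:
    variables = extract(action_text)         # not covered by a proof today
    return solve(variables, example_policy)  # the step the JOLT zkVM proves
```

If `extract` misread the amount, `solve` would still return a correctly computed answer for the wrong variables. That is the trust gap described above.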
JOLT Atlas also has implications for privacy, not just verification. zkML is a research direction that lets a model prove it produced a specific output on specific inputs without revealing the model internals or the inputs to the verifier. Applied to a policy-check pipeline, this composes naturally with Venice's privacy posture: Venice's TEE keeps inference inputs hidden from the infrastructure, and zkML over the extraction layer means a verifier can confirm the extractor ran correctly on inputs the firm committed to, without needing to see those inputs themselves. The boundary of what's "private from whom" depends on the integration; the JOLT Atlas research opens this direction up.
JOLT Atlas is open-source zkML software with published research (arXiv:2602.17452), benchmarked performance on real models (nanoGPT proofs in roughly 2.3 seconds end-to-end on a MacBook M3, GPT-2 at 125M parameters in roughly 17 seconds, approximately 100 times faster than ezkl on the same nanoGPT workload), and active development at github.com/ICME-Lab/jolt-atlas. Production integration into Preflight's checkIt pipeline is the remaining engineering work, currently on the near-term roadmap. Use all zkML with appropriate caution as the broader category matures.
A note on trust scoping. "No trust required" applies to third-party verifiers. Anyone holding a proof_id can verify cryptographically that a specific action was checked against a specific compiled policy, without seeing the policy or the action. Verification happens against ICME's public verifyProof endpoint, which any party can call without an API key. ICME's server sees the plain-English policy at the time you call /v1/makeRules, and sees the plain-English action at the time of /v1/checkIt. The cryptographic guarantees protect downstream verifiers, not the compile-time and check-time submission paths.
Setup
Install
Create your account and get an API key
Write your policy in plain English. Preflight compiles it to SMT-LIB. The compiled representation is never returned to any caller, agent, auditor, or counterparty.
The plain-English text you submit is the only place your rules ever appear in cleartext, and it is visible to ICME at compile time. After compilation, only the SMT-LIB representation is retained on Preflight's side, and it is never returned to any caller. The compiled policy is committed to a policy_hash that appears in every proof, so verifiers can confirm exactly which version of the rules decided each action without seeing the rules themselves.
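The idea of committing a compiled policy to a hash can be illustrated with an ordinary cryptographic hash. This page does not specify how Preflight derives policy_hash, so SHA-256 is used here purely as a stand-in, and the two SMT-LIB fragments are hypothetical examples rather than real compiler output.

```python
import hashlib


def policy_commitment(compiled_policy: str) -> str:
    """Stand-in commitment: hex SHA-256 digest of the compiled policy text."""
    return hashlib.sha256(compiled_policy.encode("utf-8")).hexdigest()


# Hypothetical compiled policies that differ in a single threshold.
policy_v1 = "(assert (<= amount 500))"
policy_v2 = "(assert (<= amount 1000))"

hash_v1 = policy_commitment(policy_v1)
hash_v2 = policy_commitment(policy_v2)
```

The two digests differ, so a verifier who has been told `hash_v1` can tell which version decided an action by comparing digests, without ever seeing either policy.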
Pick a Venice model. Confirm the canonical model ID for your privacy mode by calling GET /v1/models and checking model_spec.capabilities.supportsE2EE or supportsTeeAttestation. Common E2EE-mode IDs include the Qwen and GLM families with e2ee- prefixes (for example e2ee-qwen3-5-122b-a10b, e2ee-glm-5.1). Use whatever the registry returns at the time you build, not the example IDs in this page.
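A small helper can filter the registry response by capability. The sketch below works on an already-parsed JSON body and makes two assumptions you should confirm against the live response: that models are listed under a top-level `data` array, and that each entry carries an `id` field. The capability path (`model_spec.capabilities`) comes from the text above. The model IDs in the sample are placeholders, not real registry entries.

```python
from typing import Any, Dict, List


def models_with_capability(registry: Dict[str, Any], capability: str) -> List[str]:
    """Return model IDs whose model_spec.capabilities[capability] is true."""
    ids = []
    for model in registry.get("data", []):
        caps = model.get("model_spec", {}).get("capabilities", {})
        if caps.get(capability) is True:
            ids.append(model["id"])
    return ids


# Hand-written sample shaped like the assumed registry response.
sample_registry = {
    "data": [
        {"id": "e2ee-example-model",
         "model_spec": {"capabilities": {"supportsE2EE": True,
                                         "supportsTeeAttestation": True}}},
        {"id": "tee-example-model",
         "model_spec": {"capabilities": {"supportsE2EE": False,
                                         "supportsTeeAttestation": True}}},
        {"id": "plain-example-model", "model_spec": {"capabilities": {}}},
    ]
}

e2ee_ids = models_with_capability(sample_registry, "supportsE2EE")
```

Selecting by capability flag rather than by ID prefix keeps your code working if the registry renames or retires a model.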
Set your environment variables
Quickstart
Core pattern. Venice E2EE inference (prompt private), attestation verification (enclave verified), Preflight check (policy private, decision cryptographically sealed), then anyone can independently verify the receipt without an API key (verification private).
Decision response from /v1/checkIt (agent-side, requires API key)
Receipt response from /v1/verifyProof (public, no API key required)
The verifyProof response is the privacy-preserving subset. It contains no action text, no policy text, no rule identifiers, and no business data. It tells the verifier exactly two things of substance: that the SNARK is cryptographically valid, and which compiled policy version decided the action (via policy_hash). The verifier holds no API key and runs the same check anyone else with the proof_id would run.
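On the verifier's side, checking a receipt reduces to a few field comparisons. The sketch below takes a receipt that has already been fetched and parsed. The field names are the ones listed on this page, while the function name and the idea of comparing against a policy hash the verifier was told in advance are illustrative assumptions.

```python
from typing import Any, Dict

# Receipt fields named on this page; proof metadata may add further keys.
REQUIRED_FIELDS = ("valid", "policy_hash", "claimed_result", "used")


def receipt_matches(receipt: Dict[str, Any], expected_policy_hash: str) -> bool:
    """True when the receipt is complete, valid, and bound to the expected policy."""
    if any(field not in receipt for field in REQUIRED_FIELDS):
        return False
    return (
        receipt["valid"] is True
        and receipt["policy_hash"] == expected_policy_hash
    )
```

The function needs no action text and no policy text, which is the point: everything the verifier checks is already in the receipt.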
Financial agent
Autonomous agent managing on-chain payments. Venice E2EE keeps the reasoning cryptographically private. Preflight enforces the spending policy without exposing it. Counterparties and auditors verify each payment was authorized using only the proof_id. They learn the decision was correct. They learn nothing about why.
When you settle a transaction with a vendor, you share the proof_id along with the payment. The vendor calls verifyProof themselves and sees valid: true plus the policy_hash. They never see your spending limits, your approval thresholds, or any other clause of the procurement policy.
Hermes Agent on Venice
Venice officially integrates with Hermes Agent by Nous Research. Point Hermes at Venice for inference and wrap its tool calls with Preflight before execution. The agent itself is treated as untrusted from the policy's perspective. It sees only approved or blocked, never the rules.
Because Hermes performs tool calling, use a TEE-mode model (not E2EE) for this flow. Confirm the canonical TEE model ID via GET /v1/models.
Treating the agent itself as untrusted matters because compromised or jailbroken agents otherwise become a side channel for the policy. With Preflight, even a fully adversarial agent cannot read the rules. It can only learn whether each specific action it submitted was permitted; no response discloses the rule text, the rule structure, or which clause fired.
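One way to picture the wrapping step is a decorator-style guard around each tool. This is a sketch under stated assumptions, not the Hermes or Preflight API: the policy check is injected as a plain callable, the decision keys `result` and `proof_id` are hypothetical, and `fake_check` stands in for a real Preflight request.

```python
from typing import Any, Callable, Dict

# A policy check takes the action text and returns a decision dict.
PolicyCheck = Callable[[str], Dict[str, str]]


def guarded(tool: Callable[..., Any],
            describe: Callable[..., str],
            check: PolicyCheck) -> Callable[..., Dict[str, Any]]:
    """Wrap a tool so it runs only when the policy check returns SAT."""

    def wrapper(*args: Any, **kwargs: Any) -> Dict[str, Any]:
        decision = check(describe(*args, **kwargs))
        proof_id = decision.get("proof_id")
        if decision.get("result") != "SAT":
            # Fail closed: the agent learns only that the action was blocked.
            return {"status": "blocked", "proof_id": proof_id}
        return {"status": "approved", "proof_id": proof_id,
                "output": tool(*args, **kwargs)}

    return wrapper


def transfer(amount: int, to: str) -> str:
    return f"sent {amount} to {to}"


def fake_check(action: str) -> Dict[str, str]:
    # Stand-in decision logic; a real check would be a Preflight request.
    allowed = "to alice" in action
    return {"result": "SAT" if allowed else "UNSAT", "proof_id": "proof-demo"}


safe_transfer = guarded(
    transfer, lambda amount, to: f"transfer {amount} to {to}", fake_check
)
```

The wrapped tool returns the same two statuses regardless of why a check failed, so the agent's view matches the "approved or blocked" boundary described above.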
x402 agent: fully autonomous, fully private
Venice supports x402, the internet-native micropayment standard for agentic commerce. Wallets pay per request with USDC on Base. If the wallet has staked DIEM (Venice's API credit, minted by locking staked VVV), the daily DIEM allotment is consumed first and the wallet falls through to USDC when exhausted. No API key, no billing account, no human in the loop at inference. Combined with Venice E2EE and Preflight, every step preserves privacy: the prompt is encrypted, the policy is private, and the proof reveals nothing about either.
For wallet-native access without an API key, use the official Venice x402 client at github.com/veniceai/x402-client. It signs each request with a wallet-derived authorization and handles USDC top-ups on Base automatically.
Agentic chat: Venice chat/v2 + Preflight
Venice's agentic chat surface (live at venice.ai/chat/v2 as of May 2026) lets a model invoke a curated tool surface (web search, image generation, image edit, file parse, character chat, music) inside a single conversational turn. Because tool calls are now structured actions rather than text outputs, the policy boundary moves from "after generation, before execution" to "inside the agent loop, between each tool call."
This pattern uses a TEE-mode model. E2EE disables function calling, so the inference here is hardware-attested but not client-side encrypted. You retain inference privacy from Venice and from the GPU operators; you give up only the additional layer of encryption that would be incompatible with the enclave dispatching tools on your behalf.
Each in-loop check produces its own SNARK. The conversation accumulates a chain of proof_id values, all bound to the same policy_hash. An auditor reviewing the trail later can call verifyProof against each proof_id without an API key and confirm that every tool invocation passed the policy that was active at decision time.
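An audit over such a chain can be sketched as a loop over receipts that were already retrieved, keyed by proof_id. The receipt fields are the ones named on this page. The function name is hypothetical, and treating `claimed_result == "SAT"` as the encoding of "passed" is an assumption about the exact value format.

```python
from typing import Any, Dict, List


def audit_chain(receipts: Dict[str, Dict[str, Any]],
                expected_policy_hash: str) -> List[str]:
    """Return the proof_ids whose receipts fail the audit (empty list = clean)."""
    failures = []
    for proof_id, receipt in receipts.items():
        ok = (
            receipt.get("valid") is True
            and receipt.get("policy_hash") == expected_policy_hash
            and receipt.get("claimed_result") == "SAT"
        )
        if not ok:
            failures.append(proof_id)
    return failures
```

Returning the failing identifiers, rather than a single boolean, lets the auditor point at the exact tool invocation that needs follow-up.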
The same pattern composes with the Venice MCP Server (at github.com/veniceai/venice-mcp-server) for MCP-based agent runtimes (Claude Code, Cline, and other MCP clients). Wrap the MCP tools/call boundary with Preflight; every MCP-mediated tool invocation is policy-checked, with a portable proof_id per call.
How privacy works at each layer
Pillar 1: Inference privacy (Venice E2EE). Your prompt is encrypted on-device with a key bound to a verified enclave. Venice routes the ciphertext to a hardware-isolated TEE (Intel TDX with NVIDIA GPU attestation, operated by NEAR AI Cloud or Phala Network). The enclave decrypts, runs inference, encrypts the response, and signs it. Remote attestation lets you verify the enclave is genuine before sending anything. No party outside the enclave, including Venice, sees the prompt or the response.
Pillar 2: Policy privacy (Preflight). Your policy is compiled to SMT-LIB and stored only on Preflight's side. Every party downstream of enforcement (the agent, the counterparty, your audit log readers, third-party verifiers) sees only a SAT/UNSAT result, a generic reason, and a proof identifier. The rule that fired stays private. The rules that did not fire stay private. Probing the system with many actions cannot enumerate the rules, because every response is structurally identical from the disclosure side. A three-stage pipeline checks each action against the same compiled logic: a local extraction model maps action text to structured variables that the SMT-LIB-compatible solver evaluates against the compiled policy, and an Automated Reasoning pass independently translates the raw action text to formal logic and checks it against the same policy. A consensus rule reconciles the outputs: fail-closed on any UNSAT, allow TRANSLATION_AMBIGUOUS to clear if the extractor and the solver both confirm SAT. None of the pipeline stages leak the policy in their outputs.
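The "structurally identical" property can be illustrated by projecting an internal decision onto a fixed public shape. The sketch is hypothetical throughout: the key names, the reason strings, and the `fired_rule` field are invented for illustration and are not Preflight's response format.

```python
from typing import Any, Dict

PUBLIC_KEYS = ("result", "reason", "proof_id")


def public_response(internal: Dict[str, Any]) -> Dict[str, str]:
    """Project an internal decision onto the fixed public shape."""
    allowed = internal["result"] == "SAT"
    return {
        "result": "SAT" if allowed else "UNSAT",
        # The reason is generic: it never names a rule or a threshold.
        "reason": "Action permitted by policy" if allowed else "Action blocked by policy",
        "proof_id": internal["proof_id"],
    }


sat = public_response({"result": "SAT", "proof_id": "p1", "fired_rule": None})
unsat = public_response({"result": "UNSAT", "proof_id": "p2", "fired_rule": "limit-7"})
```

Both responses carry the same keys, and the internal `fired_rule` value never crosses the boundary, whichever way the decision went.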
Pillar 3: Verification privacy (Preflight). Every decision ships with a SNARK that any third party can verify in sub-second time against ICME's public verifyProof endpoint. No API key required. The verifier passes only a proof_id and receives back the receipt fields (valid, policy_hash, claimed_result, used, plus proof metadata). They learn that the decision was correctly computed against a specific compiled policy version, and they learn nothing else. Not the action. Not the policy. Not which rule fired. The proofs are generated by JOLT, ICME's adapted zkVM, wrapping the SMT verification pipeline.
ICME's server sees the plain-English policy at /v1/makeRules submission time and the plain-English action at /v1/checkIt time. The cryptographic guarantees protect downstream verifiers, not the compile-time and check-time submission paths. JOLT Atlas, ICME's open-source zkML system, will extend cryptographic coverage to the extraction step itself, closing the trust gap where verifiers today have to trust that the extractor faithfully translated the action text into the variables the solver evaluated.
| Property | Status |
| --- | --- |
| Verifier sees the policy | No |
| Verifier sees the specific failed rule | No |
| Verifier sees the action being checked | No |
| Verifier sees solver internal state | No |
| Third-party verifiers need an API key | No |
| Public verification endpoint | Yes (verifyProof) |
| Verification time | Sub-second on a single ICME endpoint |
| Proof composable into smart contracts | Yes |
| AI extraction step itself cryptographically proven | On roadmap (JOLT Atlas) |
SAT means allowed. UNSAT means blocked. The policy that decided it stays private. The proof survives the audit, and any party can verify it without ever seeing what the audit was about.
Compliance alignment
Preflight produces a content-addressed cryptographic record of every agent action decision. This aligns with specific record-keeping obligations in current AI regulation:
EU AI Act Article 12 (logging): Each Preflight proof is an immutable record binding an action description to a policy_hash. The compiled policy contents remain private to the firm; only the hash is published. Alignment with the logging obligation is one component of compliance, not a substitute for the broader conformity-assessment, oversight, and quality-management obligations the Act imposes.
EU AI Act Article 50 (transparency): The policy_hash makes the agent's governing ruleset cryptographically identifiable without disclosure of contents.
General audit trails: Any regulated industry that requires "who authorized this action and against what rules" gets a structured cryptographic answer instead of log files.

