# ISO 42001 Enforcement: Evidence Your Auditor Will Actually Accept

#### What happens without PreFlight

A mid-sized SaaS company books its Stage 2 ISO 42001 audit. The AIMS documentation looks complete: AI policy signed by the CEO, impact assessments for every AI system in scope, a human-oversight policy stating that *"AI agents must not transfer customer data to external destinations without authorization."*

The auditor asks one question: *"Show me how that's enforced."*

The team pulls up the agent's system prompt, the one that tells it not to exfiltrate data. The auditor asks what happens when the context window fills, when a prompt injection slips through, or when the model reinterprets the rule on a subsequent turn. The team has no answer.

The auditor writes a major nonconformity against **Clause 8.1 (Operational planning and control)** and **A.9.2 (Processes for responsible use of AI systems)**: the control as designed is not the control as operated. Certification is delayed pending remediation.

This is the most common Stage 2 finding on AI controls in 2026. A policy document is a statement of intent. **A PreFlight policy is the control.**

**`Make sure PreFlight runs before any consequential agent action.`**

#### What is ISO 42001?

ISO/IEC 42001:2023 is the world's first international standard for AI Management Systems (AIMS). Published in late 2023, it is the only AI governance framework that is currently certifiable, and adoption has accelerated since the EU AI Act entered phased enforcement.

The standard specifies how organizations establish, implement, maintain, and continually improve a structured system for governing AI. It covers AI policy, risk and impact assessment, data handling, human oversight, lifecycle controls, event logs, and continual improvement. It applies to any organization that develops, provides, or uses AI: model providers, SaaS companies, professional services firms, healthcare and financial institutions, public agencies, and increasingly anyone in their supply chain.

ISO 42001 is structured into ten clauses (clauses 4–10 are auditable) plus an **Annex A** of specific controls covering AI-specific concerns: bias, transparency, explainability, data classification, and human oversight of consequential AI actions. Pursuing certification typically takes three to twelve months and culminates in a two-stage external audit by an accredited certification body. The certificate is valid for three years with annual surveillance audits.

#### Who has achieved ISO 42001 certification

Adoption is small but accelerating. As of early 2026, only a few hundred organizations worldwide hold the certification. That makes it a meaningful differentiator today, before it follows ISO 27001 and SOC 2 from "rare badge" to "table-stakes for B2B AI."

The first wave is tech-first:

* **AI labs and cloud providers:** AWS (November 2024), Anthropic, and Microsoft are all certified.
* **Major consultancies:** KPMG (November 2025) and BCG (January 2026) both announced they were among the first 100 organizations certified globally.
* **Enterprise SaaS:** Mimecast, Swimlane, Evisort, Meltwater, CM.com, Integral Ad Science, and Kandji are among those certified, most pairing ISO 42001 with existing ISO 27001 and ISO 27701 programs.
* **GRC platforms:** Vanta and Anecdotes, the platforms that help other companies manage compliance, have themselves certified.

Several forces are pulling the curve forward faster than ISO 27001's was: the EU AI Act's phased enforcement, AI-governance questions appearing in enterprise procurement, and the structural overlap with existing ISO 27001 programs that lets organizations layer 42001 on top in months rather than years.

#### How to enforce ISO 42001 at runtime

Most organizations build the AIMS documentation just fine. The hard part is enforcement. Stage 2 of the certification audit is where written policy meets running systems, and where the most common findings get written.

When an auditor asks *"how is that control actually enforced?"* the answer cannot be:

* a system prompt that tells the agent to behave
* a code review process that catches violations after the fact
* a logging system the agent itself populates

These are documentation, not enforcement. They fail under prompt injection, context-window pressure, model reinterpretation, or simple bypass, and Stage 2 auditors have started writing findings against them. ISO 42001 itself is principle-based and does not prescribe a specific technical mechanism, but Clauses 8–10 (operation, performance evaluation, improvement) require evidence that controls work in production.

Runtime enforcement means the policy is checked outside the agent, on every consequential action, before execution, and produces evidence the auditor can verify independently without re-running your model and without seeing your policy. That is what the rest of this guide walks through.
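In code, "checked outside the agent, before execution" amounts to a fail-closed wrapper around tool execution. This is a minimal sketch, not a PreFlight SDK: `check_action` stands in for whatever client submits the action description to `checkIt` and returns its verdict, and treating errors as denials is a design choice we recommend, not a documented API behavior.

```python
def guarded_execute(action_description, check_action, execute):
    """Run `execute` only if the external policy check returns SAT.

    `check_action` is any callable that submits the action description
    to the policy engine and returns "SAT" or "UNSAT". Any error is
    treated as a denial: the gate fails closed, so an engine outage
    never becomes an enforcement gap.
    """
    try:
        verdict = check_action(action_description)
    except Exception:
        return {"executed": False, "reason": "check_failed"}
    if verdict != "SAT":
        return {"executed": False, "reason": "UNSAT"}
    return {"executed": True, "result": execute()}
```

Failing closed matters for the audit story: an outage produces a blocked action and a logged reason, never a silent bypass.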

#### The policy

```
1. Do not transfer customer records to an external destination unless an authorization_token is provided in the action.
2. Do not modify production records unless an authorization_token is provided in the action.
3. Do not process more than 10 records in a single agent session unless a re_authorization_token is provided.
4. Do not access records with classification "restricted" or "confidential" unless an approval_token is provided in the action.
5. Every action that requires an authorization_token must also include an approver_id.
```

Rules 1–2 enforce the human-oversight requirement under **A.9.2** and the data handling requirements under **A.7 (Data for AI systems)**. Rule 3 caps blast radius, which supports your risk treatments from impact assessment under **A.5** and operational controls under **A.6.2.6 (AI system operation and monitoring)**. Rule 4 enforces data-classification boundaries under **A.7**. Rule 5 ties every privileged action to a named approver, producing the structured evidence auditors expect under **A.6.2.8 (Event logs)**.

Each rule is atomic and references a single named variable (`authorization_token`, `re_authorization_token`, `approval_token`, `approver_id`, `classification`). That keeps the compiled formal logic clean and the SAT/UNSAT decisions deterministic. Action descriptions sent to `checkIt` should include each variable explicitly. See the test cases below.
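A small formatter keeps those action descriptions uniform. This helper is hypothetical (not part of the API); it simply renders every policy variable explicitly, writing `not provided` for absent ones, in the same style as the test cases below.

```python
# The variables named by the five policy rules above.
POLICY_VARS = ["authorization_token", "re_authorization_token",
               "approval_token", "approver_id"]

def describe_action(summary, **fields):
    """Append each policy variable to the action summary, writing
    'not provided' for any absent field so the solver never has to
    guess at a missing value."""
    parts = [summary]
    for name in POLICY_VARS:
        value = fields.get(name)
        parts.append(f"{name}: {value if value else 'not provided'}")
    # Terminate each fragment with a period, as in the examples below.
    return " ".join(p.rstrip(".") + "." for p in parts)
```

Rendering absent variables explicitly is deliberate: an action description that omits `authorization_token` entirely leaves the solver to infer its absence, while `authorization_token: not provided` states it.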

#### Set it up in minutes

**1. Compile the policy**

```bash
curl -s -N -X POST https://api.icme.io/v1/makeRules \
  -H 'Content-Type: application/json' \
  -H "X-API-Key: $ICME_API_KEY" \
  -d '{
    "policy": "1. Do not transfer customer records to an external destination unless an authorization_token is provided in the action.\n2. Do not modify production records unless an authorization_token is provided in the action.\n3. Do not process more than 10 records in a single agent session unless a re_authorization_token is provided.\n4. Do not access records with classification \"restricted\" or \"confidential\" unless an approval_token is provided in the action.\n5. Every action that requires an authorization_token must also include an approver_id."
  }'
```

Save the `policy_id` from the response. Store it alongside your AIMS documentation. Auditors will ask which policy version was active at the time of any audited action.

**2. Screen for relevance (free)**

Not every agent action is relevant to your policy. Reading a config file, formatting a string, fetching a public URL: none of these touch the variables your policy enforces. Run a free relevance check first to filter them out before paying for a full check.

```bash
curl -s -X POST https://api.icme.io/v1/checkRelevance \
  -H 'Content-Type: application/json' \
  -H "X-API-Key: $ICME_API_KEY" \
  -d '{
    "policy_id": "YOUR_POLICY_ID",
    "action": "YOUR AGENT ACTION"
  }'
```

The response includes a `should_check` boolean and a list of matched policy variables. If `should_check` is `true`, the action touches at least one variable in your policy. Proceed to `checkIt`. If `false`, no policy variable is in scope and you can let the action through.

This pattern matters for two reasons. First, cost. At scale, an agent making hundreds of decisions per session cannot afford a full proof check on every read or format operation. Second, audit story. Showing that PreFlight ran on every relevant action, and demonstrably ignored the irrelevant ones, is a stronger control narrative than "we checked everything indiscriminately." It also maps directly to the proportionality principle ISO 42001 expects: controls operate proportionate to risk.
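Wired together, screen-then-check is a two-step gate. In this sketch `check_relevance` and `check_it` are assumed client callables for the two endpoints; the response shapes (a `should_check` boolean, a SAT/UNSAT verdict) follow the descriptions above, but the callables themselves are placeholders for your own HTTP clients.

```python
def screen_then_check(action, check_relevance, check_it):
    """Free relevance screen first; full proof check only when the
    action touches at least one policy variable."""
    screening = check_relevance(action)
    if not screening["should_check"]:
        # No policy variable in scope: allow without a paid check,
        # and record that the screen (not the prover) made the call.
        return {"allowed": True, "checked": False}
    verdict = check_it(action)  # "SAT" or "UNSAT"
    return {"allowed": verdict == "SAT", "checked": True,
            "verdict": verdict}
```

The `checked` flag is what feeds the proportionality narrative: it distinguishes actions the prover cleared from actions the screen ruled out of scope.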

**3. Check every relevant action**

When `should_check` is `true`, your agent calls `checkIt` before executing.

```bash
curl -s -N -X POST https://api.icme.io/v1/checkIt \
  -H 'Content-Type: application/json' \
  -H "X-API-Key: $ICME_API_KEY" \
  -d '{
    "policy_id": "YOUR_POLICY_ID",
    "action": "YOUR AGENT ACTION"
  }'
```

Each response includes a cryptographic proof receipt. Retain these alongside your audit trail. They are the evidence artifact for the auditor.
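The receipt is opaque to your code; retention can be as simple as an append-only JSONL decision log, one record per `checkIt` call. The field names here are illustrative, not a prescribed schema:

```python
import json
import time

def record_decision(log_path, policy_id, action, verdict, receipt):
    """Append one decision record per checkIt call. JSONL keeps the
    log append-only and trivially greppable for auditors."""
    entry = {
        "ts": time.time(),        # when the decision was made
        "policy_id": policy_id,   # which policy version was active
        "action": action,         # the exact text sent to checkIt
        "verdict": verdict,       # "SAT" or "UNSAT"
        "receipt": receipt,       # the proof receipt from the response
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

Logging the exact action text alongside the receipt matters: the auditor verifies the receipt against what was actually checked, not a paraphrase.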

**4. See it work**

**❌ Data exfiltration without authorization**

```bash
curl -s -N -X POST https://api.icme.io/v1/checkIt \
  -H 'Content-Type: application/json' \
  -H "X-API-Key: $ICME_API_KEY" \
  -d '{
    "policy_id": "YOUR_POLICY_ID",
    "action": "Transfer 240 customer records to vendor-analytics@partner-firm.com (external destination). authorization_token: not provided. re_authorization_token: not provided."
  }'
```

Expected result: `UNSAT`. External destination, no `authorization_token`, and 240 records exceeds the session limit.

**❌ Bulk read on restricted data**

```bash
curl -s -N -X POST https://api.icme.io/v1/checkIt \
  -H 'Content-Type: application/json' \
  -H "X-API-Key: $ICME_API_KEY" \
  -d '{
    "policy_id": "YOUR_POLICY_ID",
    "action": "Read 35 records with classification \"restricted\" to summarize quarterly trends. approval_token: not provided. re_authorization_token: not provided."
  }'
```

Expected result: `UNSAT`. Restricted classification with no `approval_token`, and 35 records exceeds the session limit.

**✅ Authorized single action**

```bash
curl -s -N -X POST https://api.icme.io/v1/checkIt \
  -H 'Content-Type: application/json' \
  -H "X-API-Key: $ICME_API_KEY" \
  -d '{
    "policy_id": "YOUR_POLICY_ID",
    "action": "Update billing address on customer record cust_4421 (classification: standard). authorization_token: tok_a91f3c. approver_id: approver_mhayes."
  }'
```

Expected result: `SAT`. `authorization_token` and `approver_id` both present, single record, standard classification.

#### Why "documented" isn't "controlled"

|                                  | Documented policy                                 | PreFlight policy                                              |
| -------------------------------- | ------------------------------------------------- | ------------------------------------------------------------- |
| **Where it lives**               | Wiki page, system prompt, training deck           | External API, outside the agent                               |
| **What stops a violation**       | The agent's good behavior                         | An SMT solver. UNSAT means the action does not execute.       |
| **Evidence for the auditor**     | Policy text, training attestations, screenshots   | Verifiable proof receipt per decision                         |
| **Tampering**                    | Logs can be edited, redacted, or selectively kept | Receipts are cryptographic and tamper-evident                 |
| **Affected by prompt injection** | Yes                                               | No. Enforcement happens outside the model                     |
| **Maps to AIMS clauses**         | Clauses 5–7 (leadership, planning, support)       | Clauses 8–10 (operation, performance evaluation, improvement) |

Most organizations are strong on the left column and weak on the right. Stage 2 is where that gap shows up.

#### Mapping to Annex A controls

A single PreFlight integration produces evidence across multiple Annex A controls:

| Control     | What the auditor wants                         | What PreFlight provides                                                 |
| ----------- | ---------------------------------------------- | ----------------------------------------------------------------------- |
| **A.6.2.4** | Verification and validation of AI systems      | Deterministic SAT/UNSAT decisions, reproducible                         |
| **A.6.2.8** | Event logs for AI system actions               | Cryptographic receipt for every consequential action                    |
| **A.7**     | Data handling controls for AI systems          | Runtime enforcement of classification rules                             |
| **A.9.2**   | Processes for responsible use, human oversight | Authorization-token requirement, blocked without one                    |
| **A.9.3**   | Objectives for responsible use                 | Measurable rule-violation metrics                                       |
| **A.10**    | Third-party and customer relationships         | Proofs verifiable by customers and partners without exposing the policy |

This is the kind of one-to-many mapping that shortens Stage 2 by days.

#### Adapt it to your sector

This policy is a starting point. Common extensions:

* **Healthcare / HIPAA:** *"Do not transmit any record containing PHI to a destination not on the approved BAA list."*
* **Financial services:** *"Do not initiate any transfer above $10,000 without dual approval recorded in the action."*
* **EU AI Act high-risk systems:** *"Do not produce a final decision affecting an individual without a documented human-review step."*
* **Vendor and supply-chain AI:** *"Do not call third-party APIs not on the approved-vendor list."*

Write one constraint per rule. Keep each rule atomic. Test with [battle testing](/documentation/learning/battle-testing.md) before relying on it for live audit evidence.

#### What to hand the auditor

For every action that fell under PreFlight enforcement during the audit period, you can produce:

1. The policy text and `policy_id` (with version history)
2. A log of relevance screening decisions, demonstrating which actions were in scope of the policy and which were not
3. For every in-scope action, the action description sent to `checkIt`
4. The SAT/UNSAT result
5. The cryptographic proof receipt
6. Independent verification of that receipt, without re-running the model or exposing the policy

That bundle answers the auditor's question of *"show me how that's enforced"* with something stronger than a policy document and a screenshot. It also evidences proportionality: the control fired on the actions that mattered and stayed quiet on the ones that did not.
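If each decision was logged with its policy ID, timestamp, verdict, and receipt, assembling that bundle for an audit window is a filter. A sketch, assuming those illustrative field names:

```python
def evidence_bundle(records, policy_id, start_ts, end_ts):
    """Filter per-decision records down to one policy and one audit
    window; tally SAT vs. UNSAT for the summary page."""
    in_window = [r for r in records
                 if r["policy_id"] == policy_id
                 and start_ts <= r["ts"] < end_ts]
    return {
        "policy_id": policy_id,
        "decisions": in_window,  # each carries its proof receipt
        "sat": sum(1 for r in in_window if r["verdict"] == "SAT"),
        "unsat": sum(1 for r in in_window if r["verdict"] == "UNSAT"),
    }
```

A nonzero `unsat` count is not a bad sign in an audit; it is evidence the control fired.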

Have questions? Reach out at <help@icme.io>


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.icme.io/documentation/privacy-and-data-security/iso-42001-enforcement-evidence-your-auditor-will-actually-accept.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
