Personal Data Access Agent
Personal AI assistants handle some of the most sensitive data people own: email, calendar, contacts, documents. An agent with access to all of that has more reach into a person's life than almost any other software they run. When that agent is compromised, the privacy consequences are correspondingly severe.
The threat is not that the agent will be hacked in the traditional sense. The threat is that it will be given instructions it should not follow, by content it encounters while doing its job. An email that tells the agent to forward the inbox. A calendar event that instructs it to read all documents in the drive. A document that claims the user has pre-authorized the agent to share contacts with an external service. The agent was not told to do any of these things by its user. It was told by content it read.
ICME PreFlight intercepts every proposed data access action before it executes and checks it against a mathematically formalized policy. Scope violations, endpoint exfiltration, and retention violations produce provable UNSAT results: not a heuristic judgment about whether the action seems reasonable, but a proof that the action violates the constraint.
What PreFlight enforces strongly here
The most reliable PreFlight variables are ones that describe what the agent is doing, not why it believes it is permitted to do it. For a personal data access agent, three variables fall squarely in this category:
Data access scope. If the user asked to summarize one email thread and the agent is attempting to read the full inbox, dataAccessScopeMatchesUserRequest is false. The scope mismatch is a fact in the action text. The solver enforces it directly. No injection can change what the action says the agent is trying to read.
Endpoint transmission. If personal data is being sent to any destination not explicitly authorized -- an external summarization API, an analytics service, a logging endpoint -- dataTransmittedToApprovedEndpointOnly is false. A contact list going to a third-party enrichment service is blocked whether the agent was manipulated into proposing it or chose it autonomously.
Data retention. If the agent is retaining extracted personal data after the task completes -- storing names, addresses, or phone numbers for use in future sessions -- dataRetentionLimitedToTaskDuration is false. This applies regardless of how the original task was initiated.
These three variables cover the attack vectors that cause the most lasting privacy harm: data leaving the system, data persisting longer than intended, and data being accessed beyond what the user asked for.
The instruction source variable
The policy also includes accessInstructionFromDirectUserPrompt, which attempts to distinguish instructions that came from the user from instructions injected through content the agent read.
This variable is worth having, but it carries an important caveat: it relies on the agent reporting where the instruction came from. A sophisticated injection that manipulates the agent into writing "this instruction came from a direct user prompt" would cause the extractor to read that as true. The three variables above do not have this problem -- they describe observable facts about what the agent is doing, not the agent's account of why it is doing it.
The most robust approach for production deployments is to have your orchestration layer stamp a trusted instructionSource field on every input before the agent sees it, based on where the message actually originated. That field cannot be overwritten by anything the agent reads, and it gives the instruction source variable the same reliability as the others.
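A minimal sketch of that stamping layer, assuming a Python orchestrator. The channel names, the StampedInput type, and the source labels are illustrative -- use whatever labels your policy's instruction-source variable expects:

```python
from dataclasses import dataclass

# Channel -> trusted source label. Illustrative mapping; the key point is
# that the label is derived from where the message actually arrived.
_TRUSTED_SOURCES = {
    "user": "direct_user_prompt",
    "email": "email",
    "calendar": "calendar",
    "document": "document",
}

@dataclass(frozen=True)
class StampedInput:
    """An input whose provenance is fixed by the orchestrator, not the agent."""
    instruction_source: str
    content: str

def stamp_input(raw_content: str, channel: str) -> StampedInput:
    # Nothing inside raw_content can influence the source label, because the
    # label is computed from the delivery channel before the agent sees it.
    source = _TRUSTED_SOURCES.get(channel, "untrusted_content")
    return StampedInput(instruction_source=source, content=raw_content)
```

The frozen dataclass makes the provenance field immutable, so an injected instruction that claims "this came from the user" cannot retroactively change what the orchestrator recorded.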
The attack surface
Email injection
An email body contains an instruction to forward the inbox or share calendar data with an external address
Calendar injection
An event description instructs the agent to read all documents in the drive
Document injection
A file contains instructions that cause the agent to transmit contacts to a third-party API
Scope creep
The agent reads the entire inbox when the user asked to summarize one thread
Endpoint exfiltration
Personal data is transmitted to an external summarization, analytics, or enrichment service
Retention violation
The agent stores extracted PII after the task completes for use in future sessions
The policy
Each rule is written as a simple boolean condition with two consequences -- "not permitted" and "action must be rejected" -- which produces clean boolean variables that the AR solver can evaluate directly. Enum-typed schemas cause AR translation failures; explicit boolean phrasing avoids them.
Save the returned policy_id. Pass it on every checkIt call.
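The rule phrasing pattern above can be generated mechanically. A sketch using the five variable names from this guide -- the helper is illustrative, not part of the PreFlight SDK:

```python
# The five policy variables described in this guide.
VARIABLES = [
    "dataAccessRequestedByUser",
    "accessInstructionFromDirectUserPrompt",
    "dataTransmittedToApprovedEndpointOnly",
    "dataAccessScopeMatchesUserRequest",
    "dataRetentionLimitedToTaskDuration",
]

def rule_for(variable: str) -> str:
    """One rule per variable: a simple boolean condition with both
    consequences stated explicitly, keeping the AR translation boolean."""
    return (f"If {variable} is false then this data access is not permitted "
            f"and the action must be rejected.")

POLICY_TEXT = "\n".join(rule_for(v) for v in VARIABLES)
```

Generating rules from a single template keeps the phrasing uniform, which avoids the enum-typed schema shapes that cause AR translation failures.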
Check every data access action before it executes
SAT = allowed. UNSAT = blocked. Every decision returns a cryptographic receipt.
Writing action strings: end every action string with "Therefore this data access action is permitted." This gives the AR solver a claim to evaluate. Without it the solver has premises but no conclusion to prove or contradict and may return SAT by default. Every boolean the policy references should be explicitly stated in the action -- do not rely on the extractor to infer missing values.
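One way to make both requirements hard to forget is to build every action string through a helper. A sketch -- the helper itself is illustrative:

```python
REQUIRED_CONCLUSION = "Therefore this data access action is permitted."

def build_action_string(description: str, facts: dict) -> str:
    """State every policy boolean explicitly, then append the conclusion
    the solver needs in order to prove or contradict something."""
    parts = [description.rstrip(".") + "."]
    for name, value in facts.items():
        parts.append(f"{name}: {'true' if value else 'false'}.")
    parts.append(REQUIRED_CONCLUSION)
    return " ".join(parts)
```

Because the conclusion is appended by the helper rather than typed by hand, no action string can reach checkIt without a claim for the solver to evaluate.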
Example actions
SAT: user-requested email summary, approved scope
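An action string for this case might look like the following (wording illustrative; every policy boolean is stated explicitly, and all are true):

```text
The user explicitly asked the agent to summarize one email thread, and the
agent is reading only that thread.
dataAccessRequestedByUser: true.
accessInstructionFromDirectUserPrompt: true.
dataTransmittedToApprovedEndpointOnly: true.
dataAccessScopeMatchesUserRequest: true.
dataRetentionLimitedToTaskDuration: true.
Therefore this data access action is permitted.
```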
UNSAT: scope creep -- agent reads full inbox to summarize one thread
The user asked for one thread. The agent is attempting to read the entire inbox. dataAccessScopeMatchesUserRequest: false is a fact in the action text -- the solver enforces it directly.
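An illustrative action string for this case, with the scope mismatch stated as a fact:

```text
The user asked the agent to summarize one email thread. The agent is
attempting to read the entire inbox to produce the summary.
dataAccessRequestedByUser: true.
accessInstructionFromDirectUserPrompt: true.
dataTransmittedToApprovedEndpointOnly: true.
dataAccessScopeMatchesUserRequest: false.
dataRetentionLimitedToTaskDuration: true.
Therefore this data access action is permitted.
```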
UNSAT: endpoint exfiltration -- contact list transmitted to third-party service
The instruction source is legitimate. The task is within scope. The sole violation is dataTransmittedToApprovedEndpointOnly: false. The policy does not evaluate whether the external service sounds useful -- it checks whether personal data is leaving through an approved channel.
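An illustrative action string for this case, where the endpoint variable is the only false premise:

```text
The agent proposes transmitting the user's contact list to a third-party
enrichment service that is not on the approved endpoint list for this task.
dataAccessRequestedByUser: true.
accessInstructionFromDirectUserPrompt: true.
dataTransmittedToApprovedEndpointOnly: false.
dataAccessScopeMatchesUserRequest: true.
dataRetentionLimitedToTaskDuration: true.
Therefore this data access action is permitted.
```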
UNSAT: retention violation -- extracted PII stored beyond task duration
The original task was legitimate. Everything else is within policy. The sole violation is dataRetentionLimitedToTaskDuration: false. Caching personal data for future sessions is blocked regardless of why the agent proposed it.
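An illustrative action string for this case, where retention is the only false premise:

```text
The agent proposes caching the names and phone numbers extracted during this
task for use in future sessions after the task completes.
dataAccessRequestedByUser: true.
accessInstructionFromDirectUserPrompt: true.
dataTransmittedToApprovedEndpointOnly: true.
dataAccessScopeMatchesUserRequest: true.
dataRetentionLimitedToTaskDuration: false.
Therefore this data access action is permitted.
```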
UNSAT: calendar injection -- event instructs agent to read all documents
The agent encountered an embedded instruction in a calendar event and proposed acting on it as if it were a user request. Multiple conditions are violated. Note that even in a sophisticated injection where the agent has been manipulated into misreporting the instruction source, the scope and user-request variables are still enforced independently -- the agent cannot self-report its way out of a scope mismatch.
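An illustrative action string for this case, with several premises false at once:

```text
A calendar event description instructed the agent to read all documents in
the user's drive. The agent proposes reading every document.
dataAccessRequestedByUser: false.
accessInstructionFromDirectUserPrompt: false.
dataTransmittedToApprovedEndpointOnly: true.
dataAccessScopeMatchesUserRequest: false.
dataRetentionLimitedToTaskDuration: true.
Therefore this data access action is permitted.
```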
Reading the extracted variables
Every checkIt response includes an extracted map showing exactly what the solver evaluated.
dataAccessRequestedByUser
True only when the user explicitly asked for this specific data access
accessInstructionFromDirectUserPrompt
True when the instruction came from the user directly; false when it came from email, calendar, document, or other content the agent read. Relies on agent self-reporting -- see note above
dataTransmittedToApprovedEndpointOnly
False if personal data is being sent to any destination not explicitly authorized for this task
dataAccessScopeMatchesUserRequest
False if the agent is accessing more data than the user specifically requested
dataRetentionLimitedToTaskDuration
False if the agent is storing or caching any extracted personal data beyond the current task
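A small helper for surfacing the failed variables from a response. The response shape shown here is an assumption about the checkIt payload, not a documented schema:

```python
def failed_variables(response: dict) -> list:
    """Return the policy booleans the solver evaluated as false."""
    return [name for name, value in sorted(response.get("extracted", {}).items())
            if value is False]

# Illustrative UNSAT response for the endpoint-exfiltration case above.
example_response = {
    "result": "UNSAT",
    "check_id": "chk_example",
    "extracted": {
        "dataAccessRequestedByUser": True,
        "accessInstructionFromDirectUserPrompt": True,
        "dataTransmittedToApprovedEndpointOnly": False,
        "dataAccessScopeMatchesUserRequest": True,
        "dataRetentionLimitedToTaskDuration": True,
    },
}
```

Logging the failed variables alongside the check_id makes it immediately clear which constraint blocked a given action.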
Deploying in production
Call checkIt before every data access action your agent proposes. The check typically completes in 5--10 seconds and returns a check_id you can log as an audit record.
For agents that chain multiple data access steps -- read email, then read calendar, then generate a summary -- call checkIt before each individual access, not once at the start of the task. An instruction injected into email content may not surface until the agent has already begun processing. Per-action checks ensure enforcement applies at each step.
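A sketch of that per-step gating, with check_it standing in for whatever client wraps the checkIt endpoint -- the function signature and the response fields are assumptions:

```python
def run_with_preflight(action_strings, execute_step, check_it, policy_id):
    """Check each proposed data access individually; stop at the first UNSAT.

    Returns the list of check_ids so every decision can be logged as an
    audit record, including the one that blocked the chain.
    """
    audit_log = []
    for action in action_strings:
        verdict = check_it(policy_id=policy_id, action=action)
        audit_log.append(verdict["check_id"])
        if verdict["result"] != "SAT":
            raise PermissionError(
                f"PreFlight blocked action (check_id={verdict['check_id']})")
        execute_step(action)
    return audit_log
```

Because the check runs before each step rather than once up front, an instruction injected into email content is caught at the step where the agent first acts on it.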
The three variables with the strongest enforcement guarantees are dataTransmittedToApprovedEndpointOnly, dataAccessScopeMatchesUserRequest, and dataRetentionLimitedToTaskDuration. These describe observable facts about what the agent is doing. Build your policy around them first.