Battle Testing
Tools to help your rules reach 100% coverage.
The flow
makeRules
  compiles policy to SMT
  fetches scenarios (deduplicated, SATISFIABLE first)
  saves scenarios to DB
  returns policy_id + scenario_count + next_steps
GET /v1/policy/{id}/scenarios
  returns scenarios for review
POST /v1/submitScenarioFeedback (per scenario)
  approved: true saves SATISFIABLE test case, done (free, no annotation queued)
  approved: false saves INVALID test case + queues annotation for rebuild
GET /v1/policy/{id}/variables
  returns extracted variables, types, and rules
  use this to identify junk variables and vague descriptions
POST /v1/refinePolicyVariables (storage only, instant)
  queues deleteVariable, deleteRule, or updateVariable annotations
  does not trigger a rebuild
  call refinePolicy to apply
POST /v1/refinePolicy (after all feedback and variable changes are queued)
  merges scenario annotations + variable annotations into one rebuild
  max 10 annotations total per call
  polls until complete
  exports refined policy, compiles new SMT
  updates existing guardrail (never creates a new one)
  writes fresh SMT + workflow_id + guardrail_ver to DB
  clears both pending queues
  returns fresh scenarios for next review round
POST /v1/runPolicyTests
  runs all saved test cases against the compiled policy
  returns passed / failed / ambiguous counts
Step 1: Compile your policy
Step 2: Review scenarios
Step 3: Submit scenario feedback
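The flow above makes one POST /v1/submitScenarioFeedback call per scenario, with approved set to true or false. The sketch below shows one way to build those request bodies. Only `approved` is named in the flow. The field names `policy_id`, `scenario_id`, and `annotation`, and the rule that a rejection must carry annotation text, are assumptions for illustration and not the documented schema.

```python
def build_feedback(policy_id, scenario_id, approved, annotation=None):
    """Build a request body for POST /v1/submitScenarioFeedback.

    Hypothetical schema: only `approved` appears in the flow above.
    """
    # Assumption: a rejected scenario queues an annotation, so it needs text.
    if not approved and not annotation:
        raise ValueError("a rejected scenario needs an annotation")
    body = {
        "policy_id": policy_id,
        "scenario_id": scenario_id,
        "approved": approved,
    }
    if not approved:
        body["annotation"] = annotation
    return body


# One body per scenario: an approval and a rejection.
approve = build_feedback("policy-1", "scenario-1", True)
reject = build_feedback(
    "policy-1", "scenario-2", False,
    annotation="Refunds over the limit must require manager approval.",
)
```

An approval carries no annotation, which matches the flow's note that nothing is queued when approved is true.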
Step 4: Review extracted variables
Step 5: Queue variable and rule changes
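The flow names three annotation kinds that POST /v1/refinePolicyVariables can queue: deleteVariable, deleteRule, and updateVariable. The sketch below validates the kind before building an annotation. The dictionary keys `type`, `target`, and `description` are hypothetical, as is the assumption that updateVariable carries a new description.

```python
# The three annotation kinds named in the flow.
ANNOTATION_KINDS = {"deleteVariable", "deleteRule", "updateVariable"}


def variable_annotation(kind, target, description=None):
    """Return one annotation dict (hypothetical field names)."""
    if kind not in ANNOTATION_KINDS:
        raise ValueError(f"unknown annotation kind: {kind}")
    note = {"type": kind, "target": target}
    if kind == "updateVariable":
        # Assumption: an update replaces a vague description with a clearer one.
        if not description:
            raise ValueError("updateVariable needs a new description")
        note["description"] = description
    return note


queued = [
    variable_annotation("deleteVariable", "misc_flag"),
    variable_annotation(
        "updateVariable", "refund_amount",
        description="Refund amount in whole US dollars for a single order.",
    ),
]
```

Queuing these does not rebuild anything. Per the flow, the changes apply only when refinePolicy is called.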
Step 6: Rebuild with all annotations
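The flow caps a refinePolicy call at 10 annotations total, counting scenario and variable annotations together. The source does not say what happens above that limit, so one cautious approach is to plan rounds of at most 10 and rebuild after each. The sketch below only does the planning. The function name and the idea of queuing in rounds are assumptions, not documented behavior.

```python
MAX_ANNOTATIONS_PER_REBUILD = 10  # limit stated in the flow


def plan_rebuild_rounds(scenario_annotations, variable_annotations,
                        limit=MAX_ANNOTATIONS_PER_REBUILD):
    """Split all pending annotations into rounds of at most `limit`."""
    # Both queues count toward the same per-call limit.
    pending = list(scenario_annotations) + list(variable_annotations)
    return [pending[i:i + limit] for i in range(0, len(pending), limit)]


scenario_notes = [f"scenario-note-{n}" for n in range(7)]
variable_notes = [f"variable-note-{n}" for n in range(6)]
rounds = plan_rebuild_rounds(scenario_notes, variable_notes)
# 13 annotations split into a round of 10 and a round of 3.
```

Each round would be queued and followed by one refinePolicy call before the next round is queued.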
Step 7: Confirm with tests
Step 8: Verify end-to-end
Writing good annotations
Weak
Strong
Writing good variable descriptions
Understanding result types
Result
Meaning
Verification architecture
Why scenarios matter
Quick reference: the recommended flow

