Runtime policy enforcement for AI coding agents

Instruction files that actually block.

Your assistant reads CLAUDE.md and ignores it when it matters. Arai turns the instruction files you already have — CLAUDE.md, AGENTS.md, .cursorrules — into enforcement: prohibitions block the tool call before it runs, and a tamper-evident audit trail proves, per rule, whether the model obeyed. Nothing to rewrite. No new format. One command. Local. Zero cost.

$ curl -sSf https://arai.taniwha.ai/install | sh click to copy

Why not just write a CLAUDE.md?

An instruction file aloneWith Arai
Advice the model can skip under pressureProhibitions deny the tool call at the hook
No record of what was ignoredHash-chained audit log; arai audit --verify
You hope it listenedPer-rule obeyed / ignored / unclear verdicts
Rewrite your rules into a new policy formatYour existing files are the policy

Supported Environments

Enforcement strength depends on the assistant's integration surface:

Claude Code Native PreToolUse hooks — full blocking support
Grok TUI Native hooks — full blocking support
Cursor, Windsurf, Cline & others MCP integration — strong advisory enforcement
GitHub Copilot Instruction file ingestion only

Rules fire when they matter — and actually block

You: "Create a new database migration"

PreToolUse: Write migrations/versions/001_add_users.py
→ Arai: deny
  reason: "Alembic never: hand-write migration files"
        [from your rules:12, layer-1 imperative]

Assistant: "I should use alembic revision --autogenerate instead..."

And then prove it — per-rule compliance, locally

$ arai stats --by-rule --since=7d

Per-rule compliance
  fires  obeyed  ignored  ratio  rule
     12      11       1    92%  alembic must_not: hand-write migrations
      9       9       0   100%  cargo always: test before commit
      7       4       3    57%  git must_not: --no-verify  

# PostToolUse correlation produces obeyed / ignored / unclear verdicts per rule.
# Tamper-evident JSONL audit log — SHA-256 hash-chained; `arai audit --verify` proves it.
# No SIEM. No data egress. grep + jq, done.

Built around outcomes

Stop bad actions before they run

Prohibitions deny the tool call at the hook — before the file is written or the command runs.

Block, don’t just advise

Rules with never/forbids/must_not can deny the tool call in Claude Code and Grok TUI (the two assistants with native hook support today). always and prefers still advise. Incremental rollout via ARAI_DENY_MODE=off. Cursor, Windsurf and others get advisory enforcement via MCP.

Intent-aware

"Never hand-write migrations" fires on Write but not Edit. Editing existing migrations is fine.

Code graph

tree-sitter scans your codebase. Writing to migrations/ triggers alembic rules even without "alembic" in the file.

Session tracking

"Never push without tests" silences after cargo test runs. Arai remembers what happened this session.

Rule set stays live

Edit CLAUDE.md and Arai re-scans in the background — the next tool call enforces the new wording, no manual rescan. cd into a monorepo subpackage and matching switches to that project’s rules.

Zero noise

Only fires domain-specific rules. Principles already in your instruction files stay silent.

Know what happened — and prove it

A tamper-evident record of every firing, correlated with what the model actually did.

Hash-chained audit log

Every audit-log line carries prev_hash + hash (SHA-256 over canonical bytes); a per-day sidecar anchors the chain tip. arai audit --verify walks every day-bucket and exits non-zero on any tamper, reorder, or deletion. Owner-only on disk (0700/0600 on Unix; icacls-pinned on Windows). Retention is policy, not accident: arai audit --purge --older=90 sweeps whole day-buckets only, so retained days keep a valid chain.

Compliance tracking

Every PostToolUse is correlated against its PreToolUse firings. Each rule gets an obeyed, ignored, or unclear verdict. arai audit --outcome=ignored tells you which rules the model keeps flouting; filter to a specific rule with --rule.

Per-rule compliance ratios

arai stats rolls up the audit log into fires / obeyed / ignored / ratio per rule. Now you can answer "is this rule actually working?" — not "is it firing?" The ⚠ flag highlights low-ratio rules with enough volume to mean it.

Derivation trace

Every firing carries source file, line, and parser layer. Hook output shows the origin (e.g. [CLAUDE.md:42 layer-1] or [AGENTS.md:42 layer-1]) — no more guessing why a rule fired.

Token economics

Repeat firings of the same rule in a session emit a compact one-liner instead of re-injecting the full payload. arai stats surfaces a calibrated tokens saved estimate from suppressed repeats plus denied-and-honored mistakes — secondary signal, primary mission stays correctness.

Ship rule changes safely

Treat your rule set like code: preview, diff, test, roll out incrementally, let stale rules expire.

Explain before you commit

arai why "git push --force" replays a hypothetical tool call through the live match pipeline. Read-only. Ship new rules with confidence.

Preview before commit

arai lint shows exactly which rules a file produces with their classified intent. Iterate on wording without touching the DB.

Diff before save

arai diff shows what an edit would change in the live rule set — added, removed, moved — before you commit it. Pre-commit-hook fodder via --json.

Rule regression tests

arai test replays synthetic hook payloads through the live match pipeline. Catch rule behaviour drift before a real session does. CI-friendly JSON output.

Capture to replay

arai record turns real firings from the audit log into scenario fixtures. You don’t hand-write regression tests — you capture the ones that matter and pin them.

Per-rule rollout

arai severity alembic block pins one rule to deny while the rest of the set stays in advise. Survives arai scan. Ship the set in advise mode, watch which rules earn the trust, then flip them one at a time.

Self-pruning rules

Annotate a rule with (expires 2026-12-31) or (until 2027-06-30). Arai filters it out after the date automatically — perfect for incident-driven rules that have a shelf life.

One source of truth

arai canonicalize extracts your rules into arai.toml; arai sync writes per-tool instruction files from it — CLAUDE.md, AGENTS.md, .cursorrules stay in lockstep instead of drifting.

Scale to teams

Org-wide policy and centralised evidence on your own infrastructure — opt-in, never by default.

Shared policies

Inherit org-wide rules with one directive: arai:extends https://.... Trusted per URL, HTTPS only, cached locally, with @sha256 content pinning and ed25519 signatures. Private policy endpoints work too: arai trust --add <url> --bearer-env VAR sends a bearer token to that exact URL and nowhere else. No policy service — just a markdown file upstream.

Ship the evidence — to your collector

arai audit --ship sends day-buckets with their chain heads to your own HTTPS endpoint, so the hash chain verifies server-side too. Resume cursor, idempotent re-ship, bearer auth via env var. Explicit opt-in only — local-first stays the default.

Self-hosted telemetry

Want the usage signal on your infrastructure? Point [telemetry] endpoint at your own collector — same anonymous events, your retention rules. Opt-outs (ARAI_TELEMETRY=off, DO_NOT_TRACK=1) win regardless. Payload schema documented.

Agent-authored guards

Runs as an MCP server. The agent can register new rules mid-session and Arai enforces them on the next tool call (where supported). arai_recent_decisions lets the agent self-check recent decisions. MCP is the primary integration for Cursor, Windsurf and similar tools (advisory only). Native blocking hooks are currently available in Claude Code and Grok TUI.

MCP authentication

The agent-facing MCP server supports an optional shared-secret via ARAI_MCP_AUTH_TOKEN. When set, initialize must present a matching token (constant-time compare) before any tool call succeeds. Open by default for backwards compatibility.

Runs on your terms

Local-first, fast, verifiable, and embeddable — nothing to monitor, nothing to vendor-onboard.

Local-first, air-gap friendly

No network on the hook hot path. Enforcement, audit, compliance verdicts, and stats all run against the local SQLite + JSONL. Works offline, in restricted environments, and during outages — nothing to monitor, nothing to vendor-onboard.

~30ms latency

End-to-end per tool call, dominated by binary launch. SQLite lookups on the hook path. No network calls. No LLM calls at runtime.

Supply-chain hardened

Downloads verified against SHA-256 checksums.txt on every install path (curl, npm, cargo). arai:extends upstream policy fetches refuse loopback, RFC1918, link-local, cloud metadata, and redirects — and cached upstream files carry a SHA-256 sidecar so an at-rest tamper is detected before the rules reach the parser. MCP-source rules capped per project to bound a malfunctioning agent.

Embeddable library

Arai builds as a Rust library alongside the CLI: parser layers, rule store, guardrail matching, audit chain, and hook decisions as a crates.io dependency. Wrappers and IDE integrations consume enforcement directly instead of shelling out.

Any LLM enrichment

Classify rules via Claude, Ollama, or any LLM CLI. Or use the built-in sentence transformer.

Audit & compliance — controls aligned with SOC 2 TSC

Arai gives you the evidence trail and the controls your InfoSec / procurement team will ask for. Arai itself is not a certified product — the certification is yours to pursue. The controls are designed to align with the SOC 2 Trust Service Criteria:

Full TSC-mapped feature inventory in the compliance procurement doc.

Install

Script
curl -sSf https://arai.taniwha.ai/install | sh
npm
npm install -g @taniwhaai/arai
Cargo
cargo install arai
Homebrew
brew install taniwhaai/tap/arai
Then cd your-project && arai init

Part of the Taniwha family

Arai is the open-source guardrail core of Kete, Taniwha AI’s runtime reliability platform for AI coding agents. Arai handles per-developer enforcement and audit locally, and ships the self-hosting primitives — private rule sources via authenticated arai:extends, audit shipping to your own collector, a configurable telemetry endpoint — for teams that want to assemble centralisation themselves. Kete is the managed layer on top: rule distribution without running an endpoint, aggregated compliance dashboards across a fleet of developers, semantic enrichment, and impact analysis across callers and transitive dependents. The local audit and verdict pipeline doesn’t change either way. If your instruction files just need enforcing on one machine, Arai is all you need. For the full feature inventory mapped to procurement-review questions, see the compliance inventory.