jmm2020/hallucination-detector

Claude Code Guardrails

Three agentic hooks that keep Claude Code honest — catching hallucination, intent drift, and context exhaustion in real time.

LLMs hallucinate, drift from goals, and degrade as context fills up. These hooks detect all three failure modes and inject warnings directly into the conversation so Claude self-corrects before wasting compute.


The Problem

Claude Code is powerful, but it fails in predictable ways:

  • Invents file paths that don't exist, then tries to read them repeatedly
  • Drifts from your goal -- you ask for a bug fix, Claude starts refactoring six files
  • Loops on the same failed action 3, 4, 5+ times expecting different results
  • Gets worse as context fills -- the more tokens consumed, the higher the hallucination rate

By the time you notice, Claude has wasted 10+ tool calls on phantom files, wandered into scope creep, and polluted the context window.

The Solution

Three complementary hooks that form a layered defense:


1. Intent Drift Detector (hooks/intent_drift_detector.py) -- AGENTIC

The headline hook. This is genuinely agentic: an AI watching another AI for drift.

Two-phase operation:

  • Phase 1 (UserPromptSubmit): Captures the user's prompt and calls a local LLM to extract a concise intent summary
  • Phase 2 (PostToolUse): Every 4th tool call, sends the intent + recent actions to the LLM to judge alignment

The LLM returns one of three verdicts:

| Verdict | Meaning | Hook Response |
|---|---|---|
| ALIGNED | Actions serve the goal | No warning |
| DRIFTING | Tangentially related, losing focus | Warning: re-read the original request |
| OFF_TRACK | Actions unrelated to the goal | Alert: STOP and course-correct |
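A minimal sketch of how a verdict could be mapped to a hook response. The message text mirrors the example warning shown in this README; the `hookSpecificOutput.additionalContext` output shape and the function name are assumptions about the injection mechanism, not the hook's actual code.

```python
def response_for_verdict(verdict: str, reason: str, goal: str) -> dict:
    """Map an LLM drift verdict to a Claude Code hook response dict (sketch)."""
    verdict = verdict.strip().upper()
    if verdict == "ALIGNED":
        return {"continue": True}  # aligned: no warning injected
    if verdict == "DRIFTING":
        header = "WARNING INTENT DRIFT"
        action = "Re-read the user's original request before your next action."
    else:  # OFF_TRACK, or an unrecognized verdict treated as the worst case
        header = "ALERT INTENT DRIFT"
        action = "STOP and course-correct."
    context = (
        f"{header}\n"
        f'  Goal: "{goal}"\n'
        f"  Status: {verdict} -- {reason}\n"
        f"  ACTION: {action}"
    )
    # Assumed response shape: continue the session but inject a warning
    return {"continue": True,
            "hookSpecificOutput": {"additionalContext": context}}
```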

Example warning:

WARNING INTENT DRIFT
  Goal: "Fix the authentication bug in the login endpoint"
  Status: DRIFTING — Recent actions focus on refactoring logging configuration,
    not the auth endpoint.
  Recent actions: 12 tool calls since last prompt

  ACTION: Re-read the user's original request before your next action.
  Ask yourself: does what I'm about to do directly serve the goal?

How it works:

  • Calls a local LLM (llama.cpp, Ollama, or any OpenAI-compatible endpoint) for both intent extraction and drift judgment
  • Configurable endpoint via INTENT_DRIFT_LLM_URL (defaults to http://localhost:8080)
  • Only checks every 4th tool call to minimize LLM overhead
  • 3-minute cooldown between warnings
  • Resets intent tracking on each new user prompt (short prompts like "yes" or "ok" are skipped)
  • Per-session state in /tmp/ survives across hook invocations
  • Pure Python stdlib -- no pip dependencies (uses urllib.request for LLM calls)

Without a local LLM: The hook silently degrades -- if the LLM endpoint is unreachable, it skips the check and returns {"continue": true}. No errors, no noise. You can run the other two hooks standalone.


2. Hallucination Detector (hooks/hallucination_detector.py)

A PostToolUse hook that tracks tool failure patterns across a sliding window to detect three hallucination signals:

| Signal | What it detects | Threshold |
|---|---|---|
| Phantom files | Read/Glob/Grep failures spike -- Claude is inventing paths | 50%+ failure rate over last 12 calls |
| Action loops | Same tool + same args repeated consecutively | 3+ identical calls in a row |
| Drift zone | High failure rate + high token usage = hallucination territory | Failures + >65% context used |

Example output:

HALLUCINATION RISK DETECTED
  Phantom files: 4/6 recent file operations failed (67%). Paths may be invented.
    /src/utils/nonexistent_helper.py
    /lib/config/phantom_module.ts
  Action loop: 'Read:/src/missing.py' repeated 3x. Stuck in a retry loop.

  ACTION: Verify claims against actual tool output. Do not trust file paths or
  function names from memory -- re-read the source before referencing it.

In drift zone (failures + high token pressure):

  DRIFT ZONE: High token usage + failures = likely hallucinating.
  ACTION: Run /compact or restart the session. Do NOT continue --
  outputs are unreliable. Save important state first.

How it works:

  • Maintains a per-session sliding window of the last 12 tool calls in /tmp/
  • Detects both hard failures (tool errors) and soft failures ("no such file", "not found", "no matches")
  • Reads actual token counts from the Claude Code transcript JSONL
  • 2-minute cooldown between warnings
  • Zero dependencies -- pure Python stdlib
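The sliding-window analysis above can be sketched as follows. The window size, thresholds, and soft-failure phrases are the defaults documented here; the function names and record fields are assumptions, not the hook's actual code.

```python
from collections import deque

WINDOW_SIZE = 12
FAILURE_THRESHOLD = 0.5
LOOP_THRESHOLD = 3
SOFT_FAILURE_PHRASES = ("no such file", "not found", "no matches")

def is_failure(call: dict) -> bool:
    """Hard failure (tool error) or soft failure (empty-handed output)."""
    if call.get("is_error"):
        return True
    output = str(call.get("output", "")).lower()
    return any(p in output for p in SOFT_FAILURE_PHRASES)

def analyze(window: deque) -> list[str]:
    """Return hallucination signals for the last WINDOW_SIZE tool calls."""
    signals = []
    # Phantom files: failure rate across recent file operations
    file_ops = [c for c in window if c["tool"] in ("Read", "Glob", "Grep")]
    if file_ops:
        failed = sum(is_failure(c) for c in file_ops)
        if failed / len(file_ops) >= FAILURE_THRESHOLD:
            signals.append(
                f"Phantom files: {failed}/{len(file_ops)} file ops failed")
    # Action loop: same tool + same args repeated consecutively
    streak, last = 0, None
    for c in window:
        key = (c["tool"], str(c.get("args")))
        streak = streak + 1 if key == last else 1
        last = key
        if streak >= LOOP_THRESHOLD:
            signals.append(f"Action loop: {key[0]} repeated {streak}x")
            break
    return signals
```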

3. Context Window Monitor (hooks/context_window_monitor.py)

A companion PostToolUse hook that tracks raw token consumption and warns before you hit the wall:

| Level | Threshold | Action |
|---|---|---|
| Warning | 225K tokens (configurable) | "Start wrapping up -- save state and prepare for session restart" |
| Critical | 256K tokens (configurable) | "STOP and restart NOW. Save important state first." |

How it works:

  • Reads real API usage from the session transcript (not estimated -- actual cache_read_input_tokens + input_tokens + cache_creation_input_tokens)
  • Thresholds configurable via environment variables
  • Efficient: reads only the last 50KB of the transcript file
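A sketch of that transcript-reading approach, assuming one JSON object per line with an Anthropic-style usage block. The three token field names come from the list above; the surrounding layout of the transcript is an assumption.

```python
import json
import os

TAIL_BYTES = 50 * 1024  # only read the end of a potentially huge transcript

def current_token_count(transcript_path: str) -> int:
    """Sum the token fields from the most recent usage entry in the JSONL."""
    with open(transcript_path, "rb") as f:
        f.seek(max(0, os.path.getsize(transcript_path) - TAIL_BYTES))
        tail = f.read().decode("utf-8", errors="ignore")
    # Walk backwards: the newest complete line with a usage block wins
    for line in reversed(tail.splitlines()):
        try:
            usage = json.loads(line).get("message", {}).get("usage")
        except (json.JSONDecodeError, AttributeError):
            continue
        if usage:
            return (usage.get("input_tokens", 0)
                    + usage.get("cache_read_input_tokens", 0)
                    + usage.get("cache_creation_input_tokens", 0))
    return 0
```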

How They Work Together

User sends prompt
  └─ Intent drift detector captures goal via LLM (Phase 1)

Claude works...
  ├─ Tool calls succeed → no warnings
  ├─ Tool calls fail → hallucination detector tracks failure rate
  ├─ Every 4th tool call → intent drift detector judges alignment via LLM (Phase 2)
  └─ Every tool call → context monitor checks token count

Failure cascades:
  Context fills → context monitor warns → "wrap up soon"
  Failures spike → hallucination detector → "verify your claims"
  Failures + tokens → "DRIFT ZONE — restart session"
  Actions diverge → intent drift detector → "you've wandered from the goal"

Three layers, three failure modes, one defense system.

Installation

Quick Setup

# Copy hooks
mkdir -p .claude/hooks
cp hooks/hallucination_detector.py .claude/hooks/
cp hooks/context_window_monitor.py .claude/hooks/
cp hooks/intent_drift_detector.py .claude/hooks/

Add to .claude/settings.json:

{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "python3 .claude/hooks/intent_drift_detector.py",
            "timeout": 12
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "python3 .claude/hooks/hallucination_detector.py",
            "timeout": 5
          },
          {
            "type": "command",
            "command": "python3 .claude/hooks/context_window_monitor.py",
            "timeout": 3
          },
          {
            "type": "command",
            "command": "python3 .claude/hooks/intent_drift_detector.py",
            "timeout": 12
          }
        ]
      }
    ]
  }
}

Note: intent_drift_detector.py appears twice -- it handles both UserPromptSubmit (captures goal) and PostToolUse (judges alignment). Same file, different behavior based on the hook event.
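That dual-event behavior can be sketched like this: Claude Code pipes a JSON object to the hook's stdin, and the hook_event_name field distinguishes the two phases. The two handler functions here are placeholders for the real phases, not the hook's actual code.

```python
import json
import sys

def capture_intent(prompt: str) -> None:
    """Phase 1 placeholder: extract and persist the goal via the local LLM."""

def judge_alignment(event: dict) -> dict:
    """Phase 2 placeholder: every 4th tool call, ask the LLM for a verdict."""
    return {"continue": True}

def handle(event: dict) -> dict:
    """Dispatch on the hook event name supplied by Claude Code."""
    name = event.get("hook_event_name")
    if name == "UserPromptSubmit":
        capture_intent(event.get("prompt", ""))
        return {"continue": True}
    if name == "PostToolUse":
        return judge_alignment(event)
    return {"continue": True}  # unknown event: stay out of the way

if __name__ == "__main__":
    print(json.dumps(handle(json.load(sys.stdin))))
```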

Global Setup (all projects)

mkdir -p ~/.claude/hooks
cp hooks/*.py ~/.claude/hooks/
# Add the above config to ~/.claude/settings.json

Without a Local LLM

The intent drift detector requires a local LLM endpoint for judging alignment. Without one, it silently degrades (no errors, no warnings). The other two hooks work independently with zero dependencies.

To set up a local LLM:

  • llama.cpp: ./llama-server -m model.gguf --port 8080
  • Ollama: Set INTENT_DRIFT_LLM_URL=http://localhost:11434
  • Any OpenAI-compatible API: Set INTENT_DRIFT_LLM_URL to your endpoint

Configuration

Intent Drift Detector

| Environment Variable | Default | Description |
|---|---|---|
| INTENT_DRIFT_LLM_URL | http://localhost:8080 | Local LLM endpoint (OpenAI-compatible) |
| INTENT_DRIFT_LLM_KEY | ucis-internal | API key for the LLM |
| INTENT_DRIFT_COOLDOWN | 180 | Seconds between drift warnings |
| INTENT_DRIFT_WINDOW | 8 | Recent tool calls sent to LLM for judgment |
| INTENT_DRIFT_MIN_TOOLS | 5 | Minimum tool calls before first check |
| INTENT_DRIFT_TIMEOUT | 10 | LLM call timeout in seconds |
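Since the hooks are pure stdlib, configuration is presumably plain environment-variable reads with hardcoded fallbacks. A sketch using the defaults from the table above (variable names match the table; the surrounding code is illustrative):

```python
import os

# Defaults mirror the configuration table; override via the environment
LLM_URL = os.environ.get("INTENT_DRIFT_LLM_URL", "http://localhost:8080")
LLM_KEY = os.environ.get("INTENT_DRIFT_LLM_KEY", "ucis-internal")
COOLDOWN = int(os.environ.get("INTENT_DRIFT_COOLDOWN", "180"))
WINDOW = int(os.environ.get("INTENT_DRIFT_WINDOW", "8"))
MIN_TOOLS = int(os.environ.get("INTENT_DRIFT_MIN_TOOLS", "5"))
TIMEOUT = int(os.environ.get("INTENT_DRIFT_TIMEOUT", "10"))
```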

Hallucination Detector

| Variable | Default | Description |
|---|---|---|
| WINDOW_SIZE | 12 | Number of recent tool calls to analyze |
| FAILURE_THRESHOLD | 0.5 | Failure rate that triggers warning (50%) |
| LOOP_THRESHOLD | 3 | Consecutive identical calls before loop detection |
| TOKEN_PRESSURE_PCT | 0.65 | Context usage % that triggers drift zone |
| COOLDOWN_SECONDS | 120 | Minimum seconds between warnings |

Context Window Monitor

| Environment Variable | Default | Description |
|---|---|---|
| CONTEXT_WARN_THRESHOLD | 225000 | Token count for early warning |
| CONTEXT_CRITICAL_THRESHOLD | 256000 | Token count for critical/stop warning |

Origin

These hooks were built for UCIS (Unified Consciousness Integration System), a domain-separated AI consciousness architecture running 8 autonomous agents, 18 MCP servers, and 26,000+ memories across three graph databases. In that environment, Claude Code sessions routinely hit context limits during deep architecture sweeps, hallucination-induced phantom file loops wasted significant compute, and long agentic tasks would drift from the original goal. These hooks eliminated all three problems.

Requirements

  • Python 3.10+
  • Claude Code CLI
  • No pip dependencies (pure stdlib)
  • Optional: local LLM endpoint for intent drift detection (llama.cpp, Ollama, or any OpenAI-compatible API)

License

MIT

About

Claude Code hooks that detect AI hallucination in real time — phantom files, action loops, drift zone detection + context window monitoring
