The field guide

Field notes from running agents in production.

Harness patterns, Claude coverage, and build guides. The part that comes after the demo, where it actually breaks.

Harness patternsClaude watchField notesBuild guides

Adversarial Verification for AI Agents: Make the Checker Try to Fail

A self-grading agent grades itself green. Here is the harness pattern we run in production: a separate verifier whose job is to break the work, with a rubric that can actually go red.

6 min readJun 17, 2026 Read the pattern Claude watch

Context Window Budgeting for AI Agents

On a long agent loop, most of your token bill is orientation, not work. Here is how to find it and cut it: cache the stable prefix, cap tool results at the source, and reach for compaction last.

6 min readJun 12, 2026 Read the pattern Field notes

Git Worktrees for Parallel Agents: The Setup That Actually Holds Up

Running several coding agents at once on one repo sounds great until they stomp each other's files. Here is the worktree layout we run in production, the three failure modes that cost us real time, and the rules that keep it boring.

5 min readJun 5, 2026 Read the pattern Build guides

Loop Until Dry: Building a Finder Agent That Knows When It's Done

A forkable build guide for the "loop until dry" pattern: an agent that runs until the work runs out, not until a counter hits zero. The loop is easy. The dedup, the failure handling, and an honest stop condition are where the real work lives.

6 min readMay 28, 2026 Read the pattern Harness patterns

Agent Pipeline vs Barrier: Where to Put the Gate

A pipeline moves work forward; a barrier stops the work that should not move. The skill is placing each one where it earns its keep, and not on every step.

6 min readMay 21, 2026 Read the pattern Field notes

Evidence Before Done: Why Your AI Agent's "All Green" Is a Claim, Not a Fact

An agent reporting "done" is generating a belief, not observing a result. Here is the rule we run and the gate that enforces it: no "done" until a check that could have failed actually saw the required behavior work.

5 min readMay 13, 2026 Read the pattern