Multi Agent Cookbook

Since switching from Cursor to Claude Code a month ago, I’ve settled into a workflow that feels like a genuine multiplier: three agents running in parallel, each on its own task, with a review step that catches the mess before it hits GitHub. Last week I shipped three separate features in an afternoon, each of which would have taken a full day on its own. This post is the recipe.

1. Claude Max

If you plan to run multiple agents, a Claude Max subscription is critical. Three agents burn through tokens fast. The Pro plan will hit its rate limit before you finish your first real session and you’ll be sitting idle waiting for it to refresh. On Max the ceiling is high enough that I rarely think about it. The cost pays for itself in the first week.

2. iTerm Setup

Step one is using iTerm instead of the default Mac terminal. Every day I fire up iTerm and set up a single window with four panes visible at once: three Claude Code sessions, with DiffPrism running in the fourth. Having everything visible at a glance matters. I can see when an agent finishes, when one is stuck, and when a review is ready without switching tabs.

3. Task Decomposition

This is the part that takes the most thought and it’s the part that makes everything else work. Not every task is a good fit for parallel agents. The key is finding work that’s independent — tasks that don’t touch the same files or depend on each other’s output.

Good splits:

Agent A adds a new API endpoint, Agent B writes frontend components for an existing endpoint, Agent C writes tests for a different feature
Agent A refactors module X, Agent B builds a new feature in module Y, Agent C updates documentation

Bad splits:

Two agents editing the same file
One agent building something that depends on another agent’s output
Tasks that share state or require coordination

When I start a session I spend the first few minutes looking at my backlog and picking three tasks that won’t collide. That upfront planning saves me from merge conflicts and wasted agent cycles later.
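
The collision check can be made mechanical once you write down which paths each task is expected to touch. Below is a minimal sketch, assuming you can list those paths up front; the task names and paths are hypothetical, and the overlap rule (same path, or one path is a parent directory of the other) is an illustrative choice rather than anything Claude Code provides.

```python
from itertools import combinations
from pathlib import PurePosixPath


def paths_overlap(a: str, b: str) -> bool:
    """True if the paths are equal or one is a parent directory of the other."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


def find_collisions(plan: dict[str, list[str]]) -> list[tuple[str, str, str, str]]:
    """Return (task_a, path_a, task_b, path_b) for every overlapping pair of paths."""
    collisions = []
    for (task_a, paths_a), (task_b, paths_b) in combinations(plan.items(), 2):
        for path_a in paths_a:
            for path_b in paths_b:
                if paths_overlap(path_a, path_b):
                    collisions.append((task_a, path_a, task_b, path_b))
    return collisions


# Hypothetical plan: task names and paths are made up for illustration.
plan = {
    "agent-a": ["server/routes/orders.py"],
    "agent-b": ["client/components/ProfileForm.tsx"],
    "agent-c": ["tests/billing", "server/routes/orders.py"],
}

for task_a, path_a, task_b, path_b in find_collisions(plan):
    print(f"{task_a} ({path_a}) collides with {task_b} ({path_b})")
```

With this plan, the check flags agent-a and agent-c for sharing `server/routes/orders.py`, which is the signal to swap one of those tasks for something else in the backlog.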

4. CLAUDE.md and Settings

Each project gets a CLAUDE.md at the root that describes the codebase — the stack, key patterns, file structure, and any conventions I want the agent to follow. This is the context that persists across sessions. Without it, every agent starts cold and makes its own assumptions about how the code should look.

In settings.json, I enable Claude to run git commands without asking for permission each time. This is an inevitable step once you go to full agent mode. There’s no reason to have it prompt you for every git add and git commit. Claude will still ask if you’re ready to commit — it just won’t need permission to execute the command itself. Removing that friction keeps the agents moving.
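
For reference, here is a sketch of what that permission block might look like, generated from Python so it can be printed and pasted. It assumes a project-level `.claude/settings.json` with a `permissions.allow` list of `Bash(...)` rules; the specific rules listed are an illustrative guess at a reasonable set, not the author's actual configuration, and the rule syntax should be checked against the current Claude Code documentation.

```python
import json

# Assumption: project-level settings live in .claude/settings.json and accept
# a permissions.allow list of tool rules. Verify the rule syntax against the
# current Claude Code documentation before relying on it.
settings = {
    "permissions": {
        "allow": [
            "Bash(git add:*)",
            "Bash(git commit:*)",
            "Bash(git status:*)",
            "Bash(git diff:*)",
            "Bash(git worktree:*)",
        ]
    }
}

# Serialize to the JSON text that would go in the settings file.
settings_json = json.dumps(settings, indent=2)
print(settings_json)
```

Keeping the list narrow (specific git subcommands rather than all shell commands) preserves the prompt for anything riskier.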

5. Git Worktrees

Each agent needs its own isolated copy of the repo. Git worktrees give you that without cloning the project multiple times. Each worktree gets its own working directory and branch, so three agents can edit, build, and test simultaneously without stepping on each other’s files. No stashing, no branch switching, no “I can’t run tests because Agent B has uncommitted changes.” I tell each agent to start a worktree for its task and it’s immediately working in isolation.
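
If you would rather create the worktrees yourself before launching the agents, the underlying command is `git worktree add`. The sketch below builds that command for each task. The directory layout (a sibling `<repo>-worktrees/` folder) and the `agent/<task>` branch naming are assumptions made for illustration, as are the repo path and task names.

```python
import subprocess
from pathlib import Path


def worktree_add_command(repo: Path, task: str) -> list[str]:
    """Build the `git worktree add` command for one task.

    Layout assumption: worktrees live in a sibling directory named
    <repo>-worktrees/<task>, on a new branch named agent/<task>.
    """
    worktree_dir = repo.parent / f"{repo.name}-worktrees" / task
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", f"agent/{task}",
        str(worktree_dir),
    ]


def create_worktrees(repo: Path, tasks: list[str]) -> None:
    """Create one worktree per task; raises if any git command fails."""
    for task in tasks:
        subprocess.run(worktree_add_command(repo, task), check=True)


if __name__ == "__main__":
    # Hypothetical repo path and task names; this prints the commands
    # instead of executing them.
    repo = Path("/path/to/my-app")
    for task in ["orders-endpoint", "profile-form", "billing-tests"]:
        print(" ".join(worktree_add_command(repo, task)))
```

When a task is merged, `git worktree remove <path>` cleans up the directory and `git worktree list` shows what is still checked out.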

6. DiffPrism

This is the glue that holds the whole workflow together. I wrote about this in detail recently, but the short version: when you install DiffPrism globally, it adds a review skill to Claude Code. When an agent finishes a task, it can call /review and a browser tab opens with a full diff view — syntax highlighted, files categorized by risk level, with approve/reject/comment controls.

DiffPrism is how I view each agent’s code changes as each task completes. I see exactly what changed, verify it matches what I asked for, and catch any scope creep before it gets committed.

[Screenshot: DiffPrism sessions view showing three pending reviews from parallel agents]

[Screenshot: DiffPrism code review UI with approve and request changes controls]

The Full Loop

Here’s what a typical parallel session looks like end to end:

Plan: Pick three independent tasks from the backlog. Spend a few minutes confirming they won’t conflict. The more detailed the planning, the better the results seem to be. Planning a roadmap in GitHub issues has worked well for providing longer-term context and keeping feature development rolling.
Worktree: If all the work is in the same repo, each agent spins up its own worktree so it’s working in isolation from the start.
Launch with Plan Mode: Describe the task and let the agent enter plan mode. It explores the codebase, reads the relevant files, and proposes an implementation approach before writing any code. I review each plan and push back where needed, for example: “You’re planning to refactor the auth middleware to add that endpoint. Don’t. Use the existing pattern in the users module.” Time spent here seems to correlate with strong results. Once I approve the plan, the agent starts executing. This takes an extra minute per agent but saves significant rework.
Monitor: Glance across panes as agents work. Timing depends on the size of the plan.
Review: When an agent finishes, it triggers a DiffPrism review. I check the diff in the browser, approve clean work, and send back comments on anything that drifted.
Commit & Merge: Approved changes get committed on the worktree branch, and I merge that branch into main. Because each agent worked in isolation, merges are usually clean. If two agents touched a shared file, I reconcile at merge time rather than discovering it after a push.
Repeat: Assign the next task to the idle agent. The other two are still working. The finished worktree can be cleaned up or reused for the next task.

The result is a pipeline. Plan, worktree, approve, execute, review, commit, merge. There’s almost always an agent working, an agent finishing, and a review waiting. Downtime is minimal.

What I’ve Learned

The bottleneck is review, not writing code. This is the biggest shift in how I think about my work now. The agents write code fast. Really fast. The limiting factor is how quickly I can review what they produce, verify it matches intent, and approve or reject it. Writing code is no longer the hard part — reading it critically is. Every improvement I make to my review speed (better DiffPrism workflows, tighter plans that reduce drift, cleaner task decomposition that produces smaller diffs) directly increases my overall throughput.
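
One way to see why review speed dominates is a toy throughput model: merged work per hour is the smaller of what the agents produce and what one person can review. The function and every rate below are hypothetical, chosen only to show the shape of the limit, not measured from any real session.

```python
def merged_tasks_per_hour(agents: int, tasks_per_agent_hour: float,
                          reviews_per_hour: float) -> float:
    """Steady-state throughput is capped by the slower of production and review."""
    produced = agents * tasks_per_agent_hour
    return min(produced, reviews_per_hour)


# Hypothetical rates: each agent finishes 1 task/hour, the reviewer clears 2.5/hour.
for agents in (1, 2, 3, 6):
    rate = merged_tasks_per_hour(agents, tasks_per_agent_hour=1.0, reviews_per_hour=2.5)
    print(f"{agents} agents -> {rate} tasks/hour")
```

Under these made-up numbers, going from three agents to six changes nothing; only raising the review rate moves the total.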

Spend more time planning, less time fixing. The quality of the work is directly proportional to how well I decompose the tasks upfront. Sloppy splits lead to conflicts. Clean splits lead to clean merges.

Review at the moment of creation. The longer you wait between an agent finishing and you reviewing its output, the more context you lose. DiffPrism opens the diff immediately while the task is still fresh in your mind.

Agents don’t understand boundaries unless you set them. If you say “add input validation to the signup form,” an agent might also restructure your form state, change how errors display, and add a debounce. Explicit instructions about what not to touch are just as important as what to build.
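
A small template can make those boundaries routine. This is a sketch of one possible prompt shape, not a format Claude Code requires; the helper name, the section wording, and the file path are all assumptions for illustration.

```python
def build_task_prompt(goal: str, allowed_paths: list[str], off_limits: list[str]) -> str:
    """Compose a task description with explicit in-scope and out-of-scope sections."""
    lines = [f"Task: {goal}", "", "Only modify:"]
    lines += [f"- {path}" for path in allowed_paths]
    lines += ["", "Do not change:"]
    lines += [f"- {item}" for item in off_limits]
    lines += ["", "If the task seems to require touching anything else, stop and ask first."]
    return "\n".join(lines)


prompt = build_task_prompt(
    goal="Add input validation to the signup form",
    allowed_paths=["client/components/SignupForm.tsx"],  # hypothetical path
    off_limits=["form state structure", "error display", "debounce or timing behavior"],
)
print(prompt)
```

Writing the "do not change" list forces you to name the likely drift before it happens, which also makes the diff quicker to review.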

Running multiple agents is a skill you develop. Three agents is where I’m at now. It’s not something I jumped to on day one. You have to build the muscle for decomposing tasks cleanly, reviewing diffs quickly, and keeping the mental model of what each agent is doing. It’s like juggling — you start with two and add more as the coordination becomes second nature. I’m still getting better at it.

What’s Next: Subpackages and Sub-Agents

Here’s the realization I’ve been sitting with lately. Claude Code can spawn sub-agents within a single session — task agents that work in parallel on different parts of the codebase while the parent session orchestrates. That means three iTerm sessions isn’t the ceiling. If your project is structured right, each session can fan out into multiple sub-agents. Three sessions could become six, nine, or more parallel workers.

The key is project structure. If your repo is organized into subpackages — say a client/, server/, and shared/ directory — each with its own CLAUDE.md describing that package’s conventions, dependencies, and boundaries, then a sub-agent can be scoped to a single package with all the context it needs. The parent agent delegates “add the new endpoint in server/” to one sub-agent and “build the form component in client/” to another. Each sub-agent reads its package-level CLAUDE.md, understands its slice of the codebase, and works without interfering with the other.
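
Before delegating by package, it helps to confirm each package actually has its own CLAUDE.md. The sketch below checks for that file under each package directory. The helper name is hypothetical, and the demo builds a throwaway directory tree that mirrors the client/, server/, and shared/ example rather than touching a real repo.

```python
import tempfile
from pathlib import Path


def packages_missing_claude_md(root: Path, packages: list[str]) -> list[str]:
    """Return the packages under root that have no CLAUDE.md of their own."""
    return [pkg for pkg in packages if not (root / pkg / "CLAUDE.md").is_file()]


# Demo on a temporary tree: client/ and server/ have a CLAUDE.md, shared/ does not.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for pkg in ("client", "server", "shared"):
        (root / pkg).mkdir()
    (root / "client" / "CLAUDE.md").write_text("# client conventions\n")
    (root / "server" / "CLAUDE.md").write_text("# server conventions\n")
    missing = packages_missing_claude_md(root, ["client", "server", "shared"])

print(missing)  # ['shared']
```

Any package that shows up in the result is one where a sub-agent would have to guess at conventions and boundaries.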

I’ve done this within a single session and it works. One parent agent coordinating two or three sub-agents across different packages, each producing clean, scoped changes. But I haven’t organized around this workflow enough to run it across all three iTerm sessions simultaneously. That’s the next level — three sessions, each spawning their own sub-agents, all working in parallel. The project structure and CLAUDE.md files need to be tight for that to work without chaos. I’m not there yet, but the pieces are in place and I can see where it’s headed.

Building DiffPrism

DiffPrism is something I started building about a week ago to solve my own review bottleneck. It’s still early but I’ve been able to move fast on it using the same multi-agent workflow described in this post. Fittingly, the tool I use to review agent code is itself being built by agents. It’s open source and on GitHub if you want to check it out.

If you’re already using Claude Code, try splitting your next feature across two or three sessions. Start with tasks that clearly don’t overlap. Add DiffPrism to catch the drift. And if your project has natural package boundaries, start adding CLAUDE.md files at that level — you’ll want them when you’re ready to go deeper.

Backlinks

Review Before You Push — deep dive on local code review with DiffPrism
Thinking About AI Tooling — context, accuracy, and multi-agent workflows
Context Compaction — managing context windows across long sessions
Experimenting with Codex — agents making unintended changes
DiffPrism — local-first code review for agent workflows

Jones Codes
