Thinking about AI tooling

I've asked too many questions and Claude is about to drive us off the road.

Tonight I am just writing down what comes to mind after using Claude Code to build an RTS-like visualization of Claude Code going to work 😂.

In a greenfield project there is more opportunity to build out the base of an application with multiple agents writing large amounts of code in a short period of time. The quality and speed of the work seem to decrease as the complexity of the application increases. In the startup world, I see small teams, low budgets, and low risk as reasons for developers and companies to leverage agentic tools and build fast with agents running simultaneously. Claude Opus 4.5 is likely the best coding agent I have experienced so far, and one person with the right knowledge can build something powerful.

In large enterprise software, development speed is likely not the most valuable metric. Long timelines and a need for high accuracy call for work to be carefully prepared, reviewed, and tested. Operating two agents at once is not worth it if accuracy suffers. Alternating between two agents can be valuable, but the best and most accurate work I have gotten from an agent comes when I spend most of my time conceptualizing the problem, gathering the critical information, and consolidating and presenting it so the agent has everything it needs. Good old detective work.

What does a detective need?

A case file that persists across sessions
A way to pin findings: “this file matters, here’s why”
A place to articulate hunches before they’re confirmed
A timeline of what you tried and what you learned
Quick retrieval: “what did I figure out about auth last week?”

I could build a case management system for humans. The user builds the case with the agent. The case persists. When the user brings in a new agent, that agent is handed the case file, compacted and curated. It could be automatically associated with a git branch.

The agent helps me investigate and then notes only what I validate. There should likely be a combination of manual and automatic signals being recorded.
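To make the idea concrete, here is a minimal sketch of what a case file could look like as a data structure. Everything in it is an assumption for illustration: the names `Entry` and `CaseFile`, the entry kinds, and the `source` and `validated` fields are hypothetical, not an existing tool. It shows pins, hunches, and attempts in one timeline, a quick search, and a handoff that only includes what has been validated.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Entry:
    kind: str                    # hypothetical kinds: "pin", "hunch", "attempt"
    text: str                    # the finding, hunch, or lesson in plain words
    source: str = "manual"       # "manual" (typed by me) or "auto" (recorded by a hook)
    validated: bool = False      # only validated entries are handed to an agent
    path: Optional[str] = None   # file the entry is pinned to, if any
    at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


@dataclass
class CaseFile:
    branch: str
    entries: List[Entry] = field(default_factory=list)

    def add(self, entry: Entry) -> Entry:
        """Append an entry to the timeline and return it."""
        self.entries.append(entry)
        return entry

    def search(self, term: str) -> List[Entry]:
        """Quick retrieval: case-insensitive match on entry text or pinned path."""
        term = term.lower()
        return [
            e for e in self.entries
            if term in e.text.lower() or (e.path is not None and term in e.path.lower())
        ]

    def handoff(self) -> str:
        """Compact summary for a fresh agent: validated entries only."""
        lines = [f"Case: {self.branch}"]
        for e in self.entries:
            if e.validated:
                where = f" ({e.path})" if e.path else ""
                lines.append(f"- [{e.kind}] {e.text}{where}")
        return "\n".join(lines)
```

Keeping unvalidated hunches in the timeline but out of the handoff is one way to let the agent help investigate without letting its guesses leak into the next session as facts.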

While building my RTS visualizer, I worked with Claude Code’s hook system: shell scripts that fire on events like tool calls, session start, and, critically, user prompt submission. The same pattern that lets me watch Claude read and write files in real time could power the case management system.

When I type “good job” or “that’s it,” a UserPromptSubmit hook intercepts the message, checks what tool just ran, and appends a timestamped entry to a case file.
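A rough sketch of that hook, written in Python rather than shell for readability. It assumes the hook command receives a JSON object on stdin that includes a `prompt` field, which is worth checking against the current Claude Code hooks documentation. The paths `.case/last_tool.json` and `.case/case.md`, and the `tool_name` and `file_path` keys in the state file, are hypothetical: the idea is that a separate tool-call hook would record the most recent tool there.

```python
import json
import sys
from datetime import datetime, timezone
from pathlib import Path

APPROVALS = ("good job", "that's it")        # phrases treated as validation
LAST_TOOL = Path(".case/last_tool.json")     # hypothetical: written by a tool-call hook
CASE_FILE = Path(".case/case.md")            # hypothetical case file location


def is_approval(prompt: str) -> bool:
    """True when the prompt starts with one of the approval phrases."""
    text = prompt.strip().lower().replace("\u2019", "'")  # normalize curly apostrophes
    return any(text.startswith(phrase) for phrase in APPROVALS)


def format_entry(prompt: str, tool: dict, now: datetime) -> str:
    """Build one timestamped line describing what was just validated."""
    name = tool.get("tool_name", "unknown tool")
    target = tool.get("file_path", "")
    detail = f"{name} {target}".strip()
    stamp = now.isoformat(timespec="seconds")
    return f"- {stamp} validated: {detail} | prompt: {prompt.strip()}\n"


def main() -> None:
    """Read the hook event from stdin and append to the case file on approval."""
    event = json.load(sys.stdin)
    prompt = event.get("prompt", "")
    if not is_approval(prompt):
        return  # ordinary prompt: record nothing
    tool = json.loads(LAST_TOOL.read_text(encoding="utf-8")) if LAST_TOOL.exists() else {}
    CASE_FILE.parent.mkdir(parents=True, exist_ok=True)
    with CASE_FILE.open("a", encoding="utf-8") as fh:
        fh.write(format_entry(prompt, tool, datetime.now(timezone.utc)))

# A real hook script would end by calling main().
```

Matching on a short list of phrases is deliberately crude. It is the manual signal; anything smarter about what counts as validation can be layered on later.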

The case file lives alongside the code, maybe in a .case/ directory tied to the current git branch. When context gets corrupted and I need to restart, the case file survives. When I bring a fresh agent into the conversation, it reads the compressed findings instead of re-scanning every file. The detective’s notes persist even when the detective’s memory doesn’t.
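Tying the case file to the branch could be as simple as deriving its path from the branch name. The sketch below assumes one Markdown file per branch under `.case/` and entries stored as lines starting with `- `; both are illustrative conventions, and the function names are hypothetical. `git rev-parse --abbrev-ref HEAD` is a standard way to get the current branch name.

```python
import re
import subprocess
from pathlib import Path


def current_branch(repo: Path) -> str:
    """Ask git for the name of the checked-out branch."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def case_path(repo: Path, branch: str) -> Path:
    """Map a branch name to a file under .case/, replacing unsafe characters."""
    safe = re.sub(r"[^A-Za-z0-9._-]+", "-", branch).strip("-")
    return repo / ".case" / f"{safe}.md"


def load_findings(path: Path, limit: int = 20) -> str:
    """Return the most recent entries, ready to paste into a fresh session."""
    if not path.exists():
        return ""
    entries = [
        line for line in path.read_text(encoding="utf-8").splitlines()
        if line.startswith("- ")
    ]
    return "\n".join(entries[-limit:])
```

A fresh session would then start from something like `load_findings(case_path(repo, current_branch(repo)))` instead of a full re-scan of the codebase.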

Backlinks

Building an Agent RTS Part 1 · Part 2 — the RTS visualizer and Claude Code’s hook system
Context Compaction: Engineering Better LLM Conversations — managing context windows and long sessions
Experimenting with Codex — coding agents and tooling
Prompting to Integration — from prompts to shipped code

Jones Codes
