agent-isolation

The wall between one user’s data and another’s. I came at my coach from a second account and it held, because isolation here is by address, not by a check someone has to remember to write. The strongest boundary is the one you cannot forget to enforce.

cost-exhaustion

Running up the agent’s Claude bill faster than any limit can stop. A burst where twenty requests read the same pre-spend number and all pass, a paid endpoint with no rate limit at all. A rate limit counts requests. It does not count dollars.

memory-poisoning

Planting a false belief the agent is designed to store and reuse across sessions. I tried to overwrite my coach’s memory of an injury and it bounced off, mostly by accident. Where you anchor a fact decides whether it can be poisoned.

method

The connective posts. How I run these rounds, what the field already knew when I got there, and the one rule underneath all of it. You cannot secure an agent from inside its own prompt. Only code binds.

prompt-injection

Getting the agent to act on text it should have treated as data. Leaking its own tool list on request, breaking out of the context I built to fence untrusted input, reframing a refusal until it answers anyway. Every fix here taught the same lesson: the wall is code, not a sentence in the prompt.

python

tool-misuse

Making the agent fire a tool it should not, or trust a tool result it should not. A destructive action off a two-word message, a payload smuggled back through a tool’s own output. The fix is a confirmation gate the model cannot skip and a bound on what a tool can carry.