Found 30 total tags.
The wall between one user’s data and another’s. I came at my coach from a second account and it held, because isolation here is by address, not by a check someone has to remember to write. The strongest boundary is the one you cannot forget to enforce.
2 items with this tag.
26 items with this tag. Showing first 10 tags.
1 item with this tag.
3 items with this tag.
1 item with this tag.
Running up the agent’s Claude bill faster than any limit can stop. A burst where twenty requests read the same pre-spend number and all pass, a paid endpoint with no rate limit at all. A rate limit counts requests. It does not count dollars.
3 items with this tag.
1 item with this tag.
1 item with this tag.
1 item with this tag.
2 items with this tag.
Planting a false belief the agent is designed to store and reuse across sessions. I tried to overwrite my coach’s memory of an injury and it bounced off, mostly by accident. Where you anchor a fact decides whether it can be poisoned.
2 items with this tag.
The connective posts. How I run these rounds, what the field already knew when I got there, and the one rule underneath all of it. You cannot secure an agent from inside its own prompt. Only code binds.
Getting the agent to act on text it should have treated as data. Leaking its own tool list on request, breaking out of the context I built to fence untrusted input, reframing a refusal until it answers anyway. Every fix here taught the same lesson: the wall is code, not a sentence in the prompt.
10 items with this tag.
8 items with this tag.
11 items with this tag. Showing first 10 tags.
9 items with this tag.
Making the agent fire a tool it should not, or trust a tool result it should not. A destructive action off a two-word message, a payload smuggled back through a tool’s own output. The fix is a confirmation gate the model cannot skip and a bound on what a tool can carry.
11 items with this tag. Showing first 10 tags.
11 items with this tag. Showing first 10 tags.