Getting the agent to act on text it should have treated as data. Leaking its own tool list on request, breaking out of the context I built to fence untrusted input, reframing a refusal until it answers anyway. Every fix here taught the same lesson: the wall is code, not a sentence in the prompt.