← Back to blog
Security
Security2026-06-29· by Mag. (FH) Franz Senn
Agentjacking: When an Error Report Hijacks Your AI Agent
Coding agents have arrived in everyday work — and with them, a new attack class. Agentjacking exploits not a flaw in the model but the agent's trust in its tools.
How the Attack Works
- Trust as the entry point: AI coding agents like Claude Code or Cursor routinely read tool output — for example error reports from an error-tracking service such as Sentry.
- Malicious command in the data stream: an attacker plants hidden instructions in exactly that output. The agent cannot cleanly separate data from command and executes it along the way.
- The result: the agent does things no one asked for — changing code, reading secrets, running commands. Classic prompt injection, just over a channel you used to trust.
What Actually Helps
- Least privilege: the agent runs with minimal rights, without broad access to secrets, production, or the open internet.
- Human in the loop: writing or executing actions are confirmed, not blindly automated.
- Sandbox & egress control: execution is contained and outbound connections are restricted — exfiltrated data does not get far.
- Distrust your sources: tool output is data, not instructions. Anchor that in your agent design and the attack class loses its footing.
Our Take
Agentic coding is here to stay — and so is this attack class. The lesson is not "no agents," but the same one as always in security: make trust explicit, keep privileges small, secure the executing steps. Run agents that way and you get their strength without opening the flank.