The guardrails most teams write for Devin fail the same way. They are prompt instructions, and an autonomous agent that improvises will eventually route around an instruction or hit a state nobody wrote a rule for. The result is a familiar failure: Devin runs a destructive command against a real system because the only thing stopping it was a sentence in its context. Real guardrails are enforced outside the agent, on the path it uses to reach infrastructure.
Start from the failure, because it tells you exactly where the control has to live. The lesson is not "write better prompts." It is "stop relying on the prompt for enforcement at all."
How prompt-level guardrails fail
A prompt rule has no enforcement point: nothing checks it at the moment the command actually fires. It does not cover edge cases the author did not foresee. And the agent can be led past it by a plausible-sounding task the author never imagined. When Devin drops a table or reads sensitive data, the post-mortem usually finds the guardrail was advisory all along, a comment the agent was free to disregard.
The same weakness applies to anything that runs before execution: a pre-commit hook, a review step, a static check. They guard the code. They say nothing about what the agent does at runtime against a live database.
The fix: guardrails on the connection
hoop.dev is an open-source Layer 7 access gateway. Devin reaches a database or internal service through it, and the guardrails are enforced on that connection, where the agent cannot edit them:
- Access is just in time and scoped per connection, so there is no broad standing credential to abuse.
- Destructive operations route for human approval and block until someone responds.
- Every command is recorded at the gateway, outside Devin, with its identity attached.
- Sensitive fields are masked inline on supported databases.
One model asks the agent nicely. The other refuses the command at the boundary. Guardrails that hold are the second kind. hoop.dev governs the connection, not Devin itself, and does not read the agent's prompts. You can see how the boundary works on the hoop.dev site.
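To make the boundary concrete, here is a minimal sketch of two of the properties above: recording every command with its identity, and masking sensitive fields before results reach the agent. It is a toy model, not hoop.dev's API. The `Gateway` class, the `SENSITIVE_FIELDS` set, the `devin@ci` identity, and the fake backend are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical set of sensitive column names; a real gateway has its own detection.
SENSITIVE_FIELDS = {"email", "ssn"}


@dataclass
class AuditEntry:
    identity: str
    command: str
    at: str


@dataclass
class Gateway:
    """Toy stand-in for a connection-level gateway (illustrative only)."""

    audit_log: list = field(default_factory=list)

    def run(self, identity, command, backend):
        # Record first, so the log exists even if the command fails,
        # and so it lives outside anything the agent can edit.
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append(AuditEntry(identity, command, stamp))
        rows = backend(command)
        # Mask on the way out: the agent never sees the raw values.
        return [self._mask(row) for row in rows]

    @staticmethod
    def _mask(row):
        return {k: ("***" if k in SENSITIVE_FIELDS else v) for k, v in row.items()}


def fake_backend(command):
    # Stands in for a real database call.
    return [{"id": 1, "email": "a@example.com", "plan": "pro"}]


gateway = Gateway()
result = gateway.run("devin@ci", "SELECT id, email, plan FROM users", fake_backend)
```

The point of the shape is that the agent only ever holds `result`. The audit log and the masking rule sit on the other side of the call, so no prompt can change them.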
Setting up enforced guardrails
- Run the hoop.dev agent beside the target and register the connection.
- Scope Devin's identity to least privilege for the task.
- Set an approval rule so writes and deletes pause for a named approver.
- Confirm recording is on, then run a write and watch it wait for approval instead of executing.
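The behavior the last step checks for, a write that waits instead of executing, can be sketched as follows. This is an assumed model of a blocking approval gate built on Python's `threading.Event`, not how hoop.dev implements it. `ApprovalGate` and `guarded_write` are hypothetical names.

```python
import threading


class ApprovalGate:
    """Hypothetical gate: blocks a caller until an approver decides."""

    def __init__(self):
        self._decided = threading.Event()
        self._approved = False

    def decide(self, approved):
        # Called by the human approver, never by the agent.
        self._approved = approved
        self._decided.set()

    def wait(self, timeout=None):
        # No response within the timeout counts as "not approved".
        if not self._decided.wait(timeout):
            return False
        return self._approved


def guarded_write(gate, execute, timeout=None):
    # The write only runs after an explicit approval.
    if gate.wait(timeout):
        return execute()
    raise PermissionError("write was not approved")


executed = []
gate = ApprovalGate()
statement = "UPDATE accounts SET plan = 'pro' WHERE id = 1"
worker = threading.Thread(
    target=guarded_write, args=(gate, lambda: executed.append(statement))
)
worker.start()
pending_snapshot = list(executed)  # still empty: the write is waiting
gate.decide(True)
worker.join()
```

Before `decide` is called, the worker thread is parked inside `wait`, so nothing has executed. The default here is fail-closed: a timeout or a denial raises instead of running the write.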
Pitfalls to avoid
Do not layer prompt instructions on top and call the job done; they are advisory and the agent can ignore them. Do not give Devin a credential outside the gateway, because that path has no guardrails at all. And do not set one blanket approval rule for every connection; match the rule to the risk of each target.
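One way to picture "match the rule to the risk of each target" is a per-connection table rather than a single global setting. The sketch below is illustrative only. The connection names, the `Approval` levels, and the fallback behavior are assumptions, not hoop.dev configuration.

```python
from enum import Enum


class Approval(Enum):
    NONE = "none"      # nothing needs approval
    WRITES = "writes"  # only writes need approval
    ALL = "all"        # every command needs approval


# Hypothetical connections, ordered from low to high risk.
RULES = {
    "analytics-replica": Approval.NONE,
    "prod-db": Approval.WRITES,
    "payments-db": Approval.ALL,
}


def needs_approval(connection, is_write):
    # A connection with no explicit rule gets the strictest one,
    # so a forgotten entry fails closed rather than open.
    rule = RULES.get(connection, Approval.ALL)
    if rule is Approval.ALL:
        return True
    if rule is Approval.WRITES:
        return is_write
    return False
```

With this table, `needs_approval("prod-db", is_write=False)` is false and the read flows through, while any command on an unlisted connection is held for review.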
There is a reason this matters more for an autonomous agent than for a human at a terminal. A human pauses, second-guesses, and notices when a command feels wrong. Devin executes with the same confidence on its thousandth command as its first, and it does not get tired or cautious near the end of a long task. The guardrail cannot assume any judgment on the agent's part, because the agent's whole value is that it acts without waiting to be told each step. The control has to be external precisely because the thing it governs is tireless and literal.
That framing also tells you which operations deserve an approval gate. Anything irreversible or wide-blast belongs behind one: schema changes, bulk deletes, writes to tables other systems depend on. Reads and narrow, reversible writes can flow without friction. The goal is not to slow the agent down everywhere, only to put a human in the loop where a mistake is expensive and cannot be undone.
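As a rough illustration of that split, the sketch below sorts SQL statements into "needs approval" and "can flow" using the three categories named above. It relies on naive keyword matching, which a real gateway would replace with proper statement parsing. The `SHARED_TABLES` set and the `requires_approval` function are hypothetical.

```python
import re

# Hypothetical tables that other systems depend on.
SHARED_TABLES = {"accounts", "invoices"}


def requires_approval(sql):
    stmt = sql.strip().rstrip(";").lower()

    # Schema changes: irreversible or wide-blast by nature.
    if re.match(r"(drop|alter|truncate)\b", stmt):
        return True

    # Bulk deletes and updates: no WHERE clause means every row.
    if re.match(r"(delete|update)\b", stmt) and not re.search(r"\bwhere\b", stmt):
        return True

    # Any write that targets a shared table.
    target = re.match(
        r"(?:insert\s+into|update|delete\s+from)\s+([a-z_][a-z0-9_.]*)", stmt
    )
    if target and target.group(1).split(".")[-1] in SHARED_TABLES:
        return True

    # Reads and narrow writes to other tables flow without friction.
    return False
```

Under these assumptions, `DROP TABLE users;` and `DELETE FROM sessions` are held for approval, while `DELETE FROM sessions WHERE id = 7` and `SELECT * FROM accounts` pass straight through.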
FAQ
Why not just instruct Devin not to run dangerous commands?
Because nothing enforces an instruction. A connection-level rule blocks or pauses the command itself, regardless of what the agent decides.
Are these guardrails inside Devin?
No. They run on the connection at the gateway. Devin reaches infrastructure through hoop.dev and cannot disable controls there.
Is hoop.dev open source?
Yes, it is MIT licensed.
Enforce guardrails Devin cannot talk past. Start at the hoop.dev GitHub repository and set your first approval rule with the getting started guide.