Approval workflows for AI agents on EKS

An AI agent decides the fastest fix for a failing rollout is to delete the namespace and let the controller rebuild it. Technically defensible. Operationally a disaster if the agent is wrong about what is in that namespace. The question is not whether the agent can reason; it is whether a destructive action on production should ever execute without a human seeing it first. Approval workflows answer that: the risky command pauses at the boundary and waits for a person.

Approval workflows for AI agents on EKS put a human decision in front of the operations that deserve one, without slowing the routine ones to a crawl.

Not every command, just the ones that matter

Gating everything an agent does would make it useless, so the goal is to be selective. Read operations and routine restarts flow through. Destructive or high-blast-radius operations (deleting resources, scaling to zero, exec into a sensitive pod) stop and wait for approval. The control has to recognize the difference and pause only where it counts. Gate too broadly and you train your approvers to rubber-stamp; gate too narrowly and the dangerous commands slip past.
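To make the selective rule concrete, here is a minimal sketch of a classifier that decides whether a command should pause. It is not hoop.dev's policy format or API; `Command`, `requires_approval`, the verb list, and the namespace names are all hypothetical stand-ins for whatever your gateway policy actually expresses.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy values for illustration; not hoop.dev configuration.
DESTRUCTIVE_VERBS = {"delete", "exec"}
PRODUCTION_NAMESPACES = {"production", "payments"}


@dataclass(frozen=True)
class Command:
    verb: str                       # kubectl verb, e.g. "get", "delete", "scale"
    namespace: str                  # namespace the command targets
    replicas: Optional[int] = None  # only meaningful for "scale"


def requires_approval(cmd: Command) -> bool:
    """Return True when the command should pause for a human."""
    if cmd.namespace not in PRODUCTION_NAMESPACES:
        return False
    if cmd.verb in DESTRUCTIVE_VERBS:
        return True
    # Scaling to zero takes the workload down; other scale targets pass.
    return cmd.verb == "scale" and cmd.replicas == 0
```

Under these assumptions, `Command("get", "production")` and `Command("scale", "production", 3)` flow through, while `Command("delete", "production")` and `Command("scale", "production", 0)` wait for approval. The point of keeping the rule this small is that approvers only ever see the requests that deserve their attention.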

Where the approval gate sits

hoop.dev is an open-source Layer 7 access gateway. Its kubernetes-eks connector proxies kubectl and exec to the cluster through a network-resident hoop.dev agent that assumes a configured IAM role, the EKS_ROLE_ARN, mapped to Kubernetes RBAC. Because every command crosses the gateway before it reaches the cluster, the gateway is the right place to hold a risky one for approval. The AI agent cannot route around it, because the gateway is the only path the AI agent has to EKS.

Building the workflow

  1. Configure the EKS connection with cluster URL, region, cluster name, and the scoped IAM role ARN.
  2. Define which operations require approval, for example any delete or exec in production namespaces.
  3. Set the approver group and how many approvals a request needs.
  4. Point the agent at the gateway. When it issues a gated command, the gateway pauses and notifies approvers.
  5. Verify by having the agent attempt kubectl delete namespace test and confirming it blocks pending approval, then runs only after a human approves.

request: agent-oncall wants
  kubectl delete namespace payments-staging
status:  PENDING APPROVAL -> approver: sre-oncall
on approve: command dispatched and recorded
on deny:    command never reaches the cluster

The deny path is as important as the approve path. When a human declines, the command simply never reaches EKS. There is no half-executed state to clean up, because the gateway held the request before dispatching it to the cluster, not after.
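The hold-before-dispatch behavior can be sketched in a few lines. This is an illustration of the pattern, not hoop.dev's implementation: `HeldCommand`, `Status`, and the `dispatch` callable (a stand-in for forwarding the command to the cluster) are hypothetical names.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DENIED = "denied"


class HeldCommand:
    """A command held at the gate; dispatch runs only after approval."""

    def __init__(self, command: str, dispatch: Callable[[str], None]):
        self.command = command
        self.status = Status.PENDING
        self._dispatch = dispatch  # stand-in for forwarding to the cluster

    def approve(self) -> None:
        if self.status is not Status.PENDING:
            raise RuntimeError("request already decided")
        self.status = Status.APPROVED
        self._dispatch(self.command)

    def deny(self) -> None:
        if self.status is not Status.PENDING:
            raise RuntimeError("request already decided")
        self.status = Status.DENIED  # dispatch is never called


# Usage: `sent` plays the role of the cluster and records what arrives.
sent: List[str] = []
denied = HeldCommand("kubectl delete namespace payments-staging", sent.append)
denied.deny()
approved = HeldCommand("kubectl delete namespace test", sent.append)
approved.approve()
```

After this runs, `sent` contains only the approved command. The denied one leaves no trace on the cluster side because the decision happens before the single call that would forward it.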

The approval itself becomes part of the record, which changes the conversation after the fact. When a destructive command runs, you do not just know the agent issued it; you know which person signed off and when. That turns a vague "the agent deleted the namespace" into "the agent requested the delete, the on-call SRE approved it at 02:14, and here is the recorded session." Accountability lands on a human decision at the exact point the risk was real, instead of dissolving into the agent's autonomy. That is what an approval gate buys beyond simply slowing a command down.
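As a rough picture of what such a record might hold, the sketch below builds one audit entry that ties the command to the human decision. The field names and the `approval_record` helper are assumptions for illustration, not the format hoop.dev writes, and the date is made up to match the 02:14 example.

```python
import json
from datetime import datetime, timezone


def approval_record(requester, command, approver, decision, decided_at):
    """Build one audit entry tying a command to the human decision."""
    return {
        "requester": requester,
        "command": command,
        "approver": approver,
        "decision": decision,
        "decided_at": decided_at.isoformat(),
    }


record = approval_record(
    requester="agent-oncall",
    command="kubectl delete namespace payments-staging",
    approver="sre-oncall",
    decision="approved",
    # Made-up date; only the 02:14 time comes from the example above.
    decided_at=datetime(2024, 1, 1, 2, 14, tzinfo=timezone.utc),
)
line = json.dumps(record, sort_keys=True)  # one line per decision
```

Writing the requester and the approver as separate fields is the part that matters: it is what lets a later reader distinguish who asked from who signed off.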

Pitfalls

  • Do not gate everything. Approval fatigue trains people to rubber-stamp, which is worse than no gate.
  • Do not let the agent be its own approver. The approver has to be an authority separate from the requester.
  • Do not forget to record the decision. Who approved a destructive action is part of the audit trail, not a footnote.
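The second pitfall is easy to enforce mechanically. Here is a minimal sketch, assuming a hypothetical `ApprovalRequest` type with an approver group and a required approval count; none of these names come from hoop.dev.

```python
class ApprovalRequest:
    """Counts distinct approvals and rejects self-approval."""

    def __init__(self, requester: str, approver_group: set, required: int):
        self.requester = requester
        self.approver_group = set(approver_group)
        self.required = required
        self.approvals = set()

    def approve(self, approver: str) -> bool:
        # The requester is rejected even if it is listed in the group.
        if approver == self.requester:
            raise PermissionError("requester cannot approve its own request")
        if approver not in self.approver_group:
            raise PermissionError(f"{approver} is not in the approver group")
        self.approvals.add(approver)  # a set, so repeat votes count once
        return self.is_approved()

    def is_approved(self) -> bool:
        return len(self.approvals) >= self.required
```

With `required=2`, one person approving twice does not satisfy the request, and the requesting agent is refused even if someone mistakenly added it to the approver group.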

The argument

One model trusts the agent's judgment on every action equally. The other lets routine actions flow and stops the destructive ones for a human. Approval workflows are the second, and hoop.dev puts the gate on the one path the agent has to the cluster. Start with the getting started guide and read how approvals compose with recording and scoping.

FAQ

Do approval workflows block every command? No. You define which operations require approval; routine commands pass through.

Can the agent bypass the gate? No. The gateway is the agent's only path to EKS, so a gated command cannot route around it.

Is the approval recorded? Yes. The request, the decision, and the approver are captured with the session.

hoop.dev is open source. Get the code at github.com/hoophq/hoop and put a human in front of the EKS operations that warrant one.
