Before an AI agent touches a production Kubernetes cluster, decide where the production access control lives, because the wrong answer is "inside the agent." An agent holding a production kubeconfig is a single process with standing access to your most important workloads and sole custody of the record of its own behavior. This is a setup guide for the other model: production access control enforced at a boundary the agent connects through, so access is brokered per task and recorded where the agent cannot reach it.
What production access control means for an agent
Production access control is the set of rules that decide whether a given identity may run a given command against a given production resource, right now, and that record what happened. For an agent on Kubernetes the resources are the API server, pods, secrets, and deployments. The control has to do four things at once: tie access to the agent's identity, scope it to the task, broker it per session instead of leaving it standing, and record the session outside the agent. A static kubeconfig does none of these.
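To make the first three properties concrete, here is a minimal sketch of a per-task grant as a boundary might model it. Everything in it (the `TaskGrant` type, the `is_permitted` function, the agent and namespace names, the 15-minute window) is a hypothetical illustration, not hoop.dev's API or a Kubernetes object. The fourth property, recording, is left out here because it belongs to the boundary's request path, not to the grant.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TaskGrant:
    """Hypothetical per-task grant held by the boundary, never by the agent."""
    agent_id: str          # ties access to one agent identity
    namespace: str         # scopes access to one namespace
    verbs: frozenset       # narrowest set of verbs the task needs
    resources: frozenset   # narrowest set of resources the task needs
    expires_at: datetime   # the grant ends on its own


def is_permitted(grant, agent_id, verb, resource, namespace, now):
    """Return True only if identity, scope, and time window all match."""
    return (
        grant.agent_id == agent_id
        and grant.namespace == namespace
        and verb in grant.verbs
        and resource in grant.resources
        and now < grant.expires_at
    )


# Example values are assumptions chosen for illustration.
issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
grant = TaskGrant(
    agent_id="agent-deploy-01",
    namespace="payments",
    verbs=frozenset({"get", "list"}),
    resources=frozenset({"pods", "deployments"}),
    expires_at=issued + timedelta(minutes=15),
)
```

A static kubeconfig corresponds to a grant with no expiry, no task scope, and no check between the agent and the cluster, which is why it satisfies none of the four requirements.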
Setup, step by step
- Put a boundary in front of the cluster. Stop giving agents direct kubeconfigs. Place an access boundary in front of the Kubernetes API and route agent traffic through it. The agent connects to the boundary; the boundary connects to the cluster. hoop.dev is built for this: it is a Layer 7 access gateway and identity-aware proxy that sits in front of the cluster's access path.
- Give each agent its own identity. Register a distinct identity per agent at the boundary, not a shared service account. Production access control depends on knowing exactly which agent is asking.
- Scope access to the task, not the cluster. Define the narrowest set of verbs and resources each agent's tasks require, scoped to a namespace where possible. The grant should cover the task and stop there.
- Broker access per task instead of leaving it standing. Configure access to be granted on request, for the task, and to end on its own. Between tasks the agent holds nothing, so there is no resting credential to steal.
- Gate the risky verbs. Route high-risk actions (namespace deletes, production exec, secret reads) through an approval before they proceed, while low-risk reads pass automatically.
- Record every session outside the agent. Because traffic runs through the gateway, the full command sequence is captured on the gateway side, outside the agent process, and stored where the agent has no write path.
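The gating and recording steps above can be sketched as a small decision function that runs on the boundary. This is an illustrative sketch only: the names `HIGH_RISK`, `decide`, and `handle` are hypothetical, the in-memory list stands in for whatever store the gateway actually writes to, and sending unclassified actions to approval is an assumption, not something this guide or hoop.dev prescribes. The verb and resource pairs follow Kubernetes RBAC conventions, where exec is a `create` on the `pods/exec` subresource.

```python
# Actions that must wait for a human approval (assumed list, per the steps above).
HIGH_RISK = {
    ("delete", "namespaces"),   # namespace deletes
    ("create", "pods/exec"),    # exec into a running pod
    ("get", "secrets"),         # secret reads
    ("list", "secrets"),
    ("watch", "secrets"),
}

# Read-only verbs that pass automatically when the target is not high risk.
LOW_RISK_VERBS = {"get", "list", "watch"}


def decide(verb, resource):
    """Classify one request as 'allow' or 'needs_approval'."""
    if (verb, resource) in HIGH_RISK:
        return "needs_approval"
    if verb in LOW_RISK_VERBS:
        return "allow"
    # Assumption: anything unclassified is gated rather than waved through.
    return "needs_approval"


def handle(audit_log, agent_id, verb, resource):
    """Decide, then record the request on the boundary side."""
    decision = decide(verb, resource)
    # The agent never receives a reference to audit_log, so it has no write path.
    audit_log.append(
        {"agent": agent_id, "verb": verb, "resource": resource, "decision": decision}
    )
    return decision
```

Because `handle` appends to the log before returning, a request is recorded whether it is allowed or held for approval, and the record exists even if the agent process crashes or misreports what it did.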
The getting-started docs for connecting a Kubernetes cluster walk through registering the connection, and the learn pages on per-task access and recording cover scoping and approvals in depth.
The verification step, which is where most setups are actually tested
A configuration you have not tried to break is a configuration you do not understand. Run three checks.
