Putting access controls around ChatGPT: guardrails for AI coding agents (on-prem)

A recently offboarded contractor left behind an automated AI coding agent that could still invoke the on‑prem ChatGPT service to generate and run code. Without guardrails, the organization had no control over the agent's behavior. The team discovered that the agent was able to pull secrets from a configuration store and launch commands on production VMs, all without any human oversight. The incident highlighted a gap: the AI assistant had unrestricted access to internal resources, and there was no way to see what it did or to stop dangerous actions.

When an AI model is used as a coding assistant inside a private network, the same risk applies to every deployment. The model can be prompted to retrieve data, modify databases, or start processes that affect availability and confidentiality. Without a control plane that inspects each request, organizations cannot guarantee that the model respects policy, that sensitive output is hidden, or that every action is recorded for later review.

Why guardrails matter for ChatGPT agents

Guardrails are the set of runtime policies that enforce least‑privilege, data‑masking, command approval, and audit at the moment an AI request reaches a target system. They protect three core concerns:

Data leakage prevention: The model’s responses may contain passwords, API keys, or personally identifiable information. Inline masking removes those fields before they reach downstream services.
Command safety: A generated script could include destructive commands. Real‑time blocking stops those commands from ever executing.
Visibility and accountability: Every interaction is recorded so that security auditors can replay the session and verify compliance.
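As a rough illustration of inline masking, the sketch below replaces values that look like secrets with a placeholder before text is passed on. The patterns and the `mask_secrets` name are assumptions made for this example; they are not a product rule set, and a real deployment would tune them to its own secret formats.

```python
import re

# Hypothetical patterns for illustration only.
# Each entry is (compiled pattern, replacement).
MASK_RULES = [
    # key=value or key: value pairs whose key name suggests a secret
    (
        re.compile(r"\b(password|passwd|api[_-]?key|token)\b(\s*[:=]\s*)\S+", re.IGNORECASE),
        r"\1\2[MASKED]",
    ),
    # strings shaped like an AWS access key ID (AKIA + 16 uppercase alphanumerics)
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[MASKED]"),
]


def mask_secrets(text: str) -> str:
    """Return text with every match of a masking rule replaced by a placeholder."""
    for pattern, replacement in MASK_RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(mask_secrets("password=hunter2 host=db1"))
```

Pattern matching of this kind only catches secrets whose shape is known in advance, which is one reason masking is paired with command blocking and audit rather than used alone.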

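Implementing guardrails requires three distinct layers: a dedicated identity for the agent, a gateway on the data path, and policy enforcement at that gateway.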

Setup: identity and least‑privilege for AI agents

The first layer is identity provisioning. The AI coding agent should not use a human credential. Instead, create a dedicated service account in the organization’s identity provider (Okta, Azure AD, Google Workspace, etc.). The account receives an OIDC token that encodes the agent’s purpose and the groups it belongs to. By scoping the token to a narrow set of roles, the system ensures the agent can only request the resources it truly needs. This step establishes who is making the request, but on its own it does not enforce any protection.
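To make the scoping idea concrete, here is a minimal sketch that inspects the claims of an already verified token and reports anything that breaks least-privilege. It assumes signature verification happened earlier, that `aud` is a single string, and that the identity provider emits a `groups` claim (common, but not part of core OIDC). The audience value, group name, and `check_agent_claims` function are hypothetical.

```python
import time
from typing import List, Optional

EXPECTED_AUDIENCE = "ai-gateway"          # hypothetical audience value
ALLOWED_GROUPS = {"ai-agents-readonly"}   # hypothetical IdP group for the agent


def check_agent_claims(claims: dict, now: Optional[float] = None) -> List[str]:
    """Return a list of problems; an empty list means the claims look acceptable.

    Assumes the token signature was already verified by the caller.
    """
    now = time.time() if now is None else now
    problems = []

    if claims.get("aud") != EXPECTED_AUDIENCE:
        problems.append("unexpected audience")

    if claims.get("exp", 0) <= now:
        problems.append("token expired")

    groups = set(claims.get("groups", []))
    if not groups:
        problems.append("no groups in token")

    # Any group outside the allow list means the account is broader than intended.
    extra = groups - ALLOWED_GROUPS
    if extra:
        problems.append("over-privileged groups: " + ", ".join(sorted(extra)))

    return problems
```

A check like this only describes the caller. Whether a given action is then allowed or blocked is decided further along the data path.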

The data path: placing a gateway between the agent and infrastructure

The enforcement point must sit on the network path that carries the ChatGPT traffic. A Layer 7 gateway intercepts the protocol, inspects the payload, and applies policy before the request reaches the target database, container runtime, or SSH host. Because the gateway sits outside the agent’s process, the agent cannot tamper with the checks.

At this stage the architecture looks like:

The ChatGPT coding agent authenticates to the gateway using its OIDC token.
The gateway validates the token, extracts group membership, and determines the allowed actions.
All traffic then flows through the gateway before reaching the internal resource.
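The three steps above can be sketched as a small decision function: take the groups from the validated token, look up the actions they permit, and forward the request only when the requested action is among them. The policy table, action names, and the `Request` and `handle` names are hypothetical and exist only for this example. They do not reflect any gateway's actual configuration format.

```python
from dataclasses import dataclass
from typing import Callable, List, Set

# Hypothetical policy table: IdP group -> actions permitted through the gateway.
POLICY = {
    "ai-agents-readonly": {"db:select"},
    "ai-agents-deploy": {"db:select", "ssh:exec"},
}


@dataclass
class Request:
    groups: List[str]  # groups extracted from the validated token
    action: str        # e.g. "db:select"
    target: str        # e.g. "orders-db"


def allowed_actions(groups: List[str]) -> Set[str]:
    """Union of the actions granted by each group; unknown groups grant nothing."""
    actions: Set[str] = set()
    for group in groups:
        actions |= POLICY.get(group, set())
    return actions


def handle(request: Request, forward: Callable[[Request], str]) -> dict:
    """Forward the request to the target only if policy permits the action."""
    if request.action not in allowed_actions(request.groups):
        return {"status": "denied", "reason": "action not permitted: " + request.action}
    return {"status": "ok", "result": forward(request)}
```

Because `forward` is called only after the policy check passes, a denied request never reaches the target.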

Enforcement outcomes delivered by hoop.dev

hoop.dev implements the data‑path gateway described above. It records each session, masks sensitive fields in responses, blocks disallowed commands, and routes risky operations to a human approver. Because hoop.dev is the active component in the path, these guardrails apply only to traffic that passes through it.

Specifically, hoop.dev:

captures a complete audit log of every ChatGPT request and response, enabling replay and forensic analysis;
applies inline masking rules so that secrets never leave the gateway in clear text;
evaluates command patterns against a deny list and aborts execution when a violation is detected;
offers just‑in‑time approval workflows, pausing a request until an authorized reviewer grants consent.
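The deny-list and approval behaviors listed above can be pictured as a three-way verdict on each command. The sketch below is a generic illustration, not hoop.dev's policy syntax: the patterns, the `Verdict` enum, and the `evaluate` function are assumptions chosen for the example.

```python
import re
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    NEEDS_APPROVAL = "needs_approval"


# Hypothetical deny list: commands that are never executed.
DENY_PATTERNS = [
    re.compile(r"\brm\s+-rf\s+/"),
    re.compile(r"\bDROP\s+(TABLE|DATABASE)\b", re.IGNORECASE),
]

# Hypothetical review list: commands paused until a human approves them.
REVIEW_PATTERNS = [
    re.compile(r"\bDELETE\s+FROM\b", re.IGNORECASE),
    re.compile(r"\bsystemctl\s+(stop|restart)\b"),
]


def evaluate(command: str) -> Verdict:
    """Classify a command; the deny list takes precedence over the review list."""
    if any(p.search(command) for p in DENY_PATTERNS):
        return Verdict.BLOCK
    if any(p.search(command) for p in REVIEW_PATTERNS):
        return Verdict.NEEDS_APPROVAL
    return Verdict.ALLOW
```

Checking the deny list first means a command that matches both lists is blocked outright rather than offered to a reviewer.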

All of these outcomes are enforced at the gateway level, meaning the AI agent never sees the underlying credential or the unmasked data. The agent’s view is limited to what the policy permits.

Getting started with hoop.dev for ChatGPT guardrails

To deploy guardrails for an on‑prem ChatGPT coding agent, follow the high‑level steps outlined in the getting‑started guide. The guide walks you through deploying the gateway container, configuring the OIDC identity provider, and registering the ChatGPT service as a connection. Once the gateway is running, define masking policies and command deny lists by following the learning portal, which provides detailed policy syntax and examples.

The entire solution is open source, and the source code, issue tracker, and contribution guidelines live in the official GitHub repository. Review the repository to understand the architecture, submit custom policies, or contribute improvements.

Explore the GitHub repository to clone the project, read the documentation, and start the deployment process.

FAQ

What if the AI model tries to exfiltrate a secret?

hoop.dev’s inline masking removes any field that matches a secret pattern before the response leaves the gateway, so the secret never reaches the model’s output stream.

Can I audit who approved a risky operation?

Every approval request and its outcome are logged by hoop.dev. The audit log includes the approving user, timestamp, and the exact request that was approved.
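For a sense of what such a log entry might hold, the sketch below builds one JSON record containing the three items named above. The field names and the `approval_record` function are hypothetical and are not hoop.dev's log schema.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def approval_record(approver: str, request_text: str, outcome: str,
                    when: Optional[datetime] = None) -> str:
    """Serialize one approval decision as a single JSON line (hypothetical schema)."""
    when = when or datetime.now(timezone.utc)
    record = {
        "event": "approval",
        "approver": approver,             # the approving user
        "outcome": outcome,               # e.g. "approved" or "rejected"
        "timestamp": when.isoformat(),    # ISO 8601, timezone-aware
        "request": request_text,          # the exact request that was reviewed
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(approval_record("alice@example.com", "DELETE FROM users WHERE id = 1", "approved"))
```

Writing each decision as one self-contained line keeps the log easy to search by approver or time range during a later review.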

Do I need to change my existing ChatGPT client?

No. The client continues to connect to the same host and port; the gateway transparently proxies the traffic while applying guardrails.