Putting access controls around ChatGPT: database access for AI coding agents (on CI/CD pipelines)

A CI/CD pipeline that lets ChatGPT generate code and run queries against production databases, yet leaves no trace of who approved each query, is a recipe for silent data breaches. In the ideal setup every AI‑driven coding step is granted just‑in‑time database access, each statement is recorded, sensitive columns are masked in real time, and any risky command is paused for human review before it reaches the database. That picture eliminates accidental data exfiltration, satisfies audit requirements, and keeps the blast radius of a mis‑behaving model tightly bounded.

Teams that rush to enable AI coding agents often fall into three common traps. First, they store a static database password in the repository or in a secret manager that every pipeline job can read. Second, they assign a broad IAM role to the CI runner, giving the job unrestricted read‑write rights across all schemas. Third, they rely on the database’s native logging alone, which captures only the fact that a query ran, not who ran it or why, and does nothing to hide personally identifiable information that may appear in query results. The combination of permanent credentials, over‑privileged roles, and missing audit trails creates a blind spot that can be exploited without triggering any alert.

Why database access matters for ChatGPT agents

ChatGPT can synthesize SQL on the fly, but it does so without context about the organization’s data‑privacy policies. When an AI‑generated statement reaches a production instance, it may inadvertently request credit‑card numbers, health records, or internal configuration values. Without a guard that can inspect the payload, mask protected columns, and enforce approval, the model becomes a conduit for data leakage. Moreover, regulatory frameworks such as SOC 2 require evidence that every privileged operation is authorized and traceable. A pipeline that hands out perpetual credentials cannot produce the granular evidence auditors expect.

Common pitfalls when granting database access to AI coding agents

  • Static secrets in code. Embedding a password or a service‑account key in the pipeline definition means every clone of the repo inherits full access.
  • All‑or‑nothing IAM roles. Granting the CI runner a role that can drop tables or export entire databases defeats the principle of least privilege.
  • No real‑time data protection. Relying on post‑query logs leaves sensitive fields visible in transit and in storage.
  • Missing approval workflow. A model can issue a destructive command before anyone notices, because the pipeline proceeds automatically.
  • Absence of session replay. If a query causes unexpected side effects, there is no way to replay the exact interaction for forensic analysis.

Addressing these gaps requires a control point that sits between the CI job and the database, where policies can be applied to each request.

How hoop.dev enforces database access for ChatGPT

The enforcement point is a Layer 7 gateway that proxies every database connection. The gateway holds the database credential, so the CI job never receives the password. Identity is established by an OIDC token that the pipeline obtains from its cloud provider or CI system; the token attests to which job is running and which group it belongs to. The gateway consumes that identity information and decides whether the request may proceed.
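As a rough illustration of the identity step, the sketch below turns OIDC claims into a job identity. It is not hoop.dev code. It assumes the token signature has already been verified by a JWT library, and the `groups` claim name, the `JobIdentity` type, and the `identity_from_claims` helper are hypothetical names chosen for this example (`sub`, `aud`, and `exp` are standard claims).

```python
from dataclasses import dataclass
from typing import Optional
import time


@dataclass(frozen=True)
class JobIdentity:
    subject: str        # which job is running
    groups: frozenset   # which groups the job belongs to


def identity_from_claims(claims: dict, expected_audience: str,
                         now: Optional[float] = None) -> JobIdentity:
    """Map already-verified OIDC claims to an identity, or raise PermissionError.

    Assumption: signature verification happened before this function is called.
    """
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        raise PermissionError("token expired")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if expected_audience not in audiences:
        raise PermissionError("token was not issued for this gateway")
    if not claims.get("sub"):
        raise PermissionError("token has no subject")
    # "groups" is a hypothetical claim name; providers differ on this.
    return JobIdentity(subject=claims["sub"],
                       groups=frozenset(claims.get("groups", [])))


example_claims = {
    "sub": "repo:acme/app:job:migrate",
    "aud": "db-gateway",
    "exp": 2_000,
    "groups": ["migration"],
}
identity = identity_from_claims(example_claims, "db-gateway", now=1_000)
```

Passing `now` explicitly keeps the expiry check easy to test; a real deployment would rely on the current clock.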

Once the request reaches the gateway, hoop.dev inspects the SQL payload at the protocol level. If the statement contains a column marked as sensitive, hoop.dev masks the value before it is returned to the caller. If the command matches a risk pattern, such as a DROP DATABASE statement or a bulk export, hoop.dev can block the command outright or route it to a human approver. Every interaction is recorded, and the recording can be replayed later for audit or forensic purposes.
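To make the risk check concrete, here is a minimal sketch that sorts a statement into block, review, or allow. The patterns and the `classify` function are assumptions made for illustration, not hoop.dev's rule set, and regular expressions are a simplification: a gateway working at the protocol level would parse statements rather than pattern-match text.

```python
import re

# Hypothetical rules: which statements are blocked outright and which are
# routed to a human approver. The split is an assumption for this sketch.
BLOCK_PATTERNS = [
    re.compile(r"^\s*DROP\s+DATABASE\b", re.IGNORECASE),
]
REVIEW_PATTERNS = [
    re.compile(r"^\s*(DROP|ALTER|TRUNCATE)\b", re.IGNORECASE),
    # Bulk export, e.g. PostgreSQL's COPY ... TO
    re.compile(r"^\s*COPY\b.*\bTO\b", re.IGNORECASE | re.DOTALL),
    # DELETE with no WHERE clause
    re.compile(r"^\s*DELETE\s+FROM\s+\S+\s*;?\s*$", re.IGNORECASE),
]


def classify(sql: str) -> str:
    """Return 'block', 'review', or 'allow' for a single SQL statement."""
    if any(p.search(sql) for p in BLOCK_PATTERNS):
        return "block"
    if any(p.search(sql) for p in REVIEW_PATTERNS):
        return "review"
    return "allow"
```

Checking the block list first matters here, because `DROP DATABASE` would otherwise also match the broader `DROP` review rule.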

Because the gateway is the only place where the traffic is visible, every enforcement outcome (masking, blocking, just‑in‑time approval, session recording) exists only because hoop.dev sits in the data path. Removing the gateway would eliminate those protections at once, even though the identity token and IAM role remain unchanged.

For CI/CD pipelines, the typical flow looks like this:

  1. The pipeline starts and requests a short‑lived OIDC token that represents the job’s service identity.
  2. The token is presented to hoop.dev when the job opens a database client such as psql or mysql.
  3. hoop.dev validates the token, extracts group membership, and checks the policy that governs which databases the job may touch.
  4. If the policy allows access, hoop.dev establishes a connection to the target database using its own stored credential.
  5. Every SQL statement passes through hoop.dev, where masking, risk checks, and optional approval steps are applied.
  6. The gateway streams the filtered response back to the job, while simultaneously writing a session log that can be replayed later.

This architecture combines three ingredients: identity proves which job is running, the gateway is the choke point that can see and modify traffic, and the policy engine enforces the desired guardrails.
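The numbered flow above can be sketched end to end with a toy gateway. Everything here is hypothetical: `ToyGateway` is not hoop.dev's implementation, an in-memory SQLite database stands in for the target database, and the policy is reduced to a map from group name to allowed SQL verbs. Token validation (steps 1 to 3) is assumed to have produced the `groups` argument already.

```python
import sqlite3


class ToyGateway:
    """Hypothetical stand-in for a gateway; sqlite3 plays the target database."""

    def __init__(self, policy, sensitive_columns):
        self._db = sqlite3.connect(":memory:")  # step 4: only the gateway holds this
        self._policy = policy                    # group name -> set of allowed SQL verbs
        self._sensitive = set(sensitive_columns)
        self.session_log = []                    # step 6: chronological record

    def seed(self, script):
        """Load demo data; not part of the request path."""
        self._db.executescript(script)

    def execute(self, groups, sql):
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        allowed = set()
        for group in groups:                     # step 3: policy lookup by group
            allowed |= self._policy.get(group, set())
        if verb not in allowed:
            self.session_log.append((sorted(groups), sql, "denied"))
            raise PermissionError(f"{verb or 'empty statement'} is not permitted")
        cursor = self._db.execute(sql)           # step 5: statement passes through
        columns = [d[0] for d in cursor.description or []]
        rows = [
            tuple("***" if c in self._sensitive else v for c, v in zip(columns, row))
            for row in cursor.fetchall()
        ]
        self.session_log.append((sorted(groups), sql, "allowed"))
        return columns, rows


gateway = ToyGateway({"feature-test": {"SELECT"}}, {"card_number"})
gateway.seed(
    "CREATE TABLE customers (name TEXT, card_number TEXT);"
    "INSERT INTO customers VALUES ('Ada', '4111111111111111');"
)
columns, rows = gateway.execute({"feature-test"}, "SELECT name, card_number FROM customers")
```

Masking by result column name is the weakest part of this sketch, since a query that aliases `card_number` would slip past it; it is included only to show where in the flow the filtering happens.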

Putting it together in CI/CD pipelines

Start by configuring your CI system to request short‑lived OIDC tokens for each job. Most modern platforms support this out of the box and allow you to map tokens to groups that reflect the job’s purpose (e.g., migration or feature‑test). Next, deploy hoop.dev’s gateway inside the same network segment as the target database. The quick‑start guide walks you through a Docker‑Compose deployment that includes the gateway, an agent that runs close to the database, and the OIDC verification component.

When you register a new database connection in hoop.dev, you provide the host, port, and a credential that the gateway will use. The credential never leaves the gateway, so the CI job does not need to know it. In the connection definition you also declare which columns are considered sensitive; hoop.dev will automatically redact those fields in query results.
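One way to picture such a connection definition is as a record that carries the credential and the list of sensitive columns together. The sketch below is illustrative only: `ConnectionDefinition`, its field names, and the `[REDACTED]` placeholder are assumptions and do not reflect hoop.dev's actual configuration format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ConnectionDefinition:
    """Hypothetical shape of a registered connection."""
    name: str
    host: str
    port: int
    credential: str = field(repr=False)      # kept out of repr() and therefore logs
    sensitive_columns: frozenset = frozenset()

    def redact(self, rows: list) -> list:
        """Replace values of sensitive columns in dict-shaped result rows."""
        return [
            {col: ("[REDACTED]" if col in self.sensitive_columns else val)
             for col, val in row.items()}
            for row in rows
        ]


orders_db = ConnectionDefinition(
    name="orders",
    host="db.internal.example",
    port=5432,
    credential="s3cret",
    sensitive_columns=frozenset({"email", "card_number"}),
)
safe_rows = orders_db.redact([{"id": 1, "email": "ada@example.com"}])
```

Marking the credential with `repr=False` is a small guard against it leaking through debug output, in the same spirit as keeping it inside the gateway.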

Define policies that bind OIDC groups to allowed operations. For example, a feature‑test group may read but not write, while a migration group may write but must obtain explicit approval for any DROP or ALTER statement. Policies can also set time bounds, so that a job can hold a connection for only a few minutes before its access is automatically revoked.
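The two example groups translate naturally into a small policy table. The sketch below is a hypothetical model of that idea, not hoop.dev's policy syntax: `GroupPolicy`, `POLICIES`, `decide`, and the 300-second default are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupPolicy:
    allowed: frozenset                      # SQL verbs the group may run
    needs_approval: frozenset = frozenset() # verbs that pause for a reviewer
    max_session_seconds: int = 300          # assumed time bound for a session


POLICIES = {
    "feature-test": GroupPolicy(allowed=frozenset({"SELECT"})),
    "migration": GroupPolicy(
        allowed=frozenset({"SELECT", "INSERT", "UPDATE", "CREATE", "ALTER", "DROP"}),
        needs_approval=frozenset({"ALTER", "DROP"}),
    ),
}


def decide(group: str, sql: str, session_age_seconds: float) -> str:
    """Return 'allow', 'needs_approval', 'deny', or 'expired'."""
    policy = POLICIES.get(group)
    if policy is None:
        return "deny"                       # unknown groups get nothing
    if session_age_seconds > policy.max_session_seconds:
        return "expired"
    parts = sql.split()
    verb = parts[0].upper() if parts else ""
    if verb not in policy.allowed:
        return "deny"
    if verb in policy.needs_approval:
        return "needs_approval"
    return "allow"
```

Denying by default for unknown groups and unknown verbs keeps the table an allowlist, which matches the least-privilege goal described earlier.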

With the gateway in place, any attempt by ChatGPT‑driven code to run a query will be funneled through hoop.dev. If the query tries to access a masked column, the response will contain a placeholder instead of the real value. If the query matches a high‑risk pattern, the request will pause and a designated reviewer will receive an approval prompt. All of these actions are recorded in a replayable session log, giving you the evidence needed for compliance audits.

Because the enforcement happens at the gateway rather than inside the application, you do not need to instrument the application code or modify the AI model. The same pipeline can be used for unit tests, integration tests, and production deployments, each with its own fine‑grained policy.

Next steps

Review the getting‑started guide to spin up the gateway and register a PostgreSQL connection. The learn section contains deeper discussions of masking policies, approval workflows, and session replay. When you are ready to try it yourself, clone the open‑source repository at github.com/hoophq/hoop and follow the deployment instructions.

FAQ

  • Do I still need database‑level users? The gateway uses its own credential, so the database sees only a single service account. Application‑level users are not required for the AI‑driven jobs.
  • Can I mask columns in a NoSQL store? Yes. hoop.dev supports MongoDB and other NoSQL connectors, and you can declare sensitive fields in the same way as for relational databases.
  • How does replay work for compliance? Each session is stored as a chronological log of request and response packets. Auditors can replay the exact interaction to verify that masking and approvals were applied correctly.
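The replay answer above can be illustrated with a tiny recorder that stores each request and response as one JSON line and reads them back in order. This is a sketch of the general idea under stated assumptions, not hoop.dev's storage format: `SessionRecorder`, `replay`, and the field names `seq`, `ts`, `dir`, and `payload` are hypothetical.

```python
import json


class SessionRecorder:
    """Append-only, chronological log of one session (hypothetical format)."""

    def __init__(self):
        self._lines = []

    def record(self, direction: str, payload: str, timestamp: float) -> None:
        event = {"seq": len(self._lines), "ts": timestamp,
                 "dir": direction, "payload": payload}
        self._lines.append(json.dumps(event, sort_keys=True))

    def dump(self) -> str:
        return "\n".join(self._lines)


def replay(log_text: str) -> list:
    """Return (direction, payload) pairs in order, rejecting gaps or reordering."""
    events = [json.loads(line) for line in log_text.splitlines() if line]
    for expected_seq, event in enumerate(events):
        if event["seq"] != expected_seq:
            raise ValueError("log has a gap or is out of order")
    return [(event["dir"], event["payload"]) for event in events]


recorder = SessionRecorder()
recorder.record("request", "SELECT name, card_number FROM customers", 1700000000.0)
recorder.record("response", "Ada, ***", 1700000000.2)
interaction = replay(recorder.dump())
```

The sequence check only detects missing or reordered lines; it does not make the log tamper-evident, which would need signing or hashing on top.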