Putting access controls around ChatGPT: data masking for AI coding agents (on Postgres)

It is tempting to assume that limiting a language model’s API key is enough to keep database data safe, but token control only decides whether the model can connect, not what it sees once connected. A model that can run arbitrary SQL against a production PostgreSQL instance will see raw rows unless something explicitly filters the result set.

When engineering teams hand over a shared credential to an AI coding agent, the model inherits the same unrestricted view as any human operator. The agent can retrieve customer PII, financial figures, or internal secrets with a single SELECT. Because the request travels directly from the model to the database, there is no point where the response can be inspected, redacted, or logged. The result is a blind spot: the organization cannot prove what data left the database, nor can it prevent accidental exposure of sensitive fields.

This lack of visibility is the starting state for many AI‑assisted development pipelines. The workflow looks like this: a developer writes a prompt, the prompt is sent to ChatGPT, the model constructs a SQL query, the query is executed against Postgres using a static user, and the raw result streams back to the model. No audit trail, no inline redaction, no approval step. The organization hopes that the model will behave, but the technical controls simply aren’t there.

Why data masking matters for AI coding agents

Data masking substitutes or removes sensitive values in a response before they reach the consumer. For an AI coding agent, masking protects three critical assets:

  • Customer privacy. Fields such as email, SSN, or credit‑card numbers must never be exposed to a model that could inadvertently store them in its context.
  • Intellectual property. Proprietary schema details or configuration secrets can be reverse‑engineered from raw query output.
  • Compliance evidence. Regulations often require that any access to protected data be logged and that the data be filtered when used for non‑production purposes.

Without a dedicated enforcement point, teams cannot guarantee that these protections are applied. The model’s request still reaches the database directly, and any attempt to mask data after the fact would require modifying the model itself – an impractical and fragile approach.

Introducing a gateway for data masking

To close the gap, the enforcement must sit on the data path – the exact place where the SQL traffic flows. hoop.dev provides a Layer 7 gateway that inspects the protocol, applies policies, and then forwards the request. By positioning hoop.dev between the AI agent and PostgreSQL, the system gains three decisive capabilities:

  • hoop.dev intercepts every query and response, ensuring that no data bypasses the masking logic.
  • hoop.dev applies configurable masking rules to column values before the result is returned to the model.
  • hoop.dev records the full session, providing an immutable audit trail for every AI‑initiated query.

All three capabilities depend on hoop.dev sitting on the data path. The identity system that authenticates the model – typically by issuing an OIDC token – decides who may start a session, but it does not enforce what the session can see. Because every query and response passes through the gateway, hoop.dev is the one component in this design positioned to apply masking consistently.

How hoop.dev enforces data masking

When an AI coding agent initiates a connection, it presents an OIDC token that identifies the requestor. hoop.dev validates the token, extracts group membership, and checks the request against a policy that lists which columns are considered sensitive for the target database. If the policy marks a column as sensitive, hoop.dev rewrites the response, replacing the original value with a placeholder such as three asterisks. The rewrite occurs before the data ever reaches the model, so the model never sees the raw value.
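The rewrite step described above can be illustrated with a minimal Python sketch. It is not hoop.dev's implementation or configuration format: the column names, the group names, and the idea of an exempt group are assumptions made for the example. It only shows the general shape of "look up the policy, then replace sensitive values with a placeholder before returning rows".

```python
from typing import Any

# Hypothetical policy data, for illustration only (not hoop.dev's schema).
SENSITIVE_COLUMNS = {"email", "ssn", "card_number"}
UNMASKED_GROUPS = {"dba"}  # assumed: groups allowed to see raw values
PLACEHOLDER = "***"


def mask_rows(rows: list[dict[str, Any]], groups: list[str]) -> list[dict[str, Any]]:
    """Return copies of the rows with sensitive columns replaced by a placeholder.

    Callers in an exempt group get unmodified copies; everyone else,
    including an AI agent's group, gets masked values.
    """
    if UNMASKED_GROUPS & set(groups):
        return [dict(row) for row in rows]
    return [
        {col: (PLACEHOLDER if col in SENSITIVE_COLUMNS else val) for col, val in row.items()}
        for row in rows
    ]


if __name__ == "__main__":
    result = [{"id": 1, "email": "a@example.com", "plan": "pro"}]
    print(mask_rows(result, groups=["ai-agents"]))
```

The function returns new dictionaries rather than mutating its input, so the raw rows never share state with the masked copies handed to the consumer.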

Because hoop.dev operates at the PostgreSQL wire‑protocol level, it can apply masking to any client – psql, an ORM, or a custom script – without requiring code changes. You define the masking rules once in hoop.dev’s configuration and the gateway enforces them uniformly for every request that passes through.

Just‑in‑time approval and session replay

In addition to masking, hoop.dev can require a human approver for queries that touch high‑risk tables. When a request matches a “requires approval” rule, hoop.dev pauses execution and routes the request to an approval workflow. Once approved, the query proceeds and the response is masked as described. hoop.dev stores every step – the approval decision, the original query, the masked response – in a replayable session log. Auditors can later replay the exact interaction to verify that the masking policy was applied correctly.
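As a rough illustration of that flow, the Python sketch below gates a query behind an approval callback and records each step in an in-memory log. Everything here is hypothetical: the table names, the `approver` and `execute_and_mask` callbacks, and the naive token-based table check stand in for whatever hoop.dev actually does, and a real gateway would parse SQL rather than split on whitespace.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical set of tables that trigger the approval rule.
HIGH_RISK_TABLES = {"payments", "customers"}


@dataclass
class SessionLog:
    """Append-only list of (event kind, detail) pairs for later replay."""
    events: list[tuple[str, str]] = field(default_factory=list)

    def record(self, kind: str, detail: str) -> None:
        self.events.append((kind, detail))


def requires_approval(query: str) -> bool:
    # Naive check: does any whitespace-separated token name a high-risk table?
    tokens = {tok.strip('",;()').lower() for tok in query.split()}
    return bool(tokens & HIGH_RISK_TABLES)


def run_gated(
    query: str,
    approver: Callable[[str], bool],
    execute_and_mask: Callable[[str], str],
    log: SessionLog,
) -> Optional[str]:
    """Record the query, pause for approval if needed, then run and record the masked result."""
    log.record("query", query)
    if requires_approval(query):
        approved = approver(query)
        log.record("approval", "approved" if approved else "denied")
        if not approved:
            return None  # the query never reaches the database
    response = execute_and_mask(query)
    log.record("response", response)
    return response
```

Because the query, the decision, and the masked response are appended in order, replaying a session amounts to reading `log.events` from start to finish.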

Deploying the solution

Setting up hoop.dev involves three logical steps:

  1. Deploy the hoop.dev container or Kubernetes pod near the PostgreSQL instance. The deployment bundle includes a network‑resident agent that holds the database credentials, so no user ever sees them.
  2. Register the PostgreSQL target in hoop.dev, defining the host, port, and the credential that hoop.dev will use to connect.
  3. Configure OIDC authentication and create masking policies that identify the columns to be redacted for each data‑sensitivity classification.

The official getting‑started guide walks you through the required manifests, the OIDC provider setup, and the policy language for masking. Because hoop.dev is open source, you can also inspect the implementation on GitHub to confirm that it meets your internal security standards.

Benefits at a glance

  • Consistent data masking. Every query response is filtered before it reaches the AI model.
  • Full audit trail. hoop.dev records each session, enabling replay and forensic analysis.
  • Just‑in‑time approval. High‑risk queries are gated behind a human decision.
  • Zero credential exposure. The AI agent never receives the database password; hoop.dev handles authentication.

These benefits flow directly from placing the enforcement in the data path. Without that placement, the same policies could be described, but they would never be enforced because the request would bypass the control point.

Frequently asked questions

Does hoop.dev add latency to database queries?

Some. hoop.dev inspects traffic at the protocol level, so each query and response takes one extra hop and a pass through the masking rules. That overhead is small and predictable, and for most workloads it is negligible next to the masking and audit guarantees it buys.

Can I use hoop.dev with existing CI/CD pipelines?

Yes. Because hoop.dev presents a standard PostgreSQL endpoint, any tool that can connect to PostgreSQL can be pointed at hoop.dev without code changes. Your pipeline simply updates the connection string to the hoop.dev address.
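If the pipeline builds its connection string in code, the change can be as small as swapping the host and port. The Python sketch below rewrites a PostgreSQL connection URL with the standard library; the gateway hostname and port are placeholders, and how the pipeline authenticates to the gateway is outside what this example covers.

```python
from urllib.parse import urlsplit, urlunsplit


def point_at_gateway(dsn: str, gateway_host: str, gateway_port: int) -> str:
    """Return the same connection URL with host and port replaced by the gateway's.

    User info, database name, and query parameters are preserved.
    """
    parts = urlsplit(dsn)
    # Keep "user@" (if present) and replace only the host:port portion.
    userinfo, sep, _ = parts.netloc.rpartition("@")
    netloc = f"{userinfo}{sep}{gateway_host}:{gateway_port}"
    return urlunsplit(parts._replace(netloc=netloc))


if __name__ == "__main__":
    # "gateway.example.internal" is a made-up address for illustration.
    print(point_at_gateway(
        "postgresql://ci_user@db.internal:5432/orders?sslmode=require",
        "gateway.example.internal",
        5432,
    ))
```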

Is the masking logic configurable per table or per column?

Masking policies are expressed in a rule language that lets you target specific schemas, tables, and columns. You can combine static placeholders with tokenization or hashing, depending on the sensitivity of the data.
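To make the difference between those three strategies concrete, here is a small Python sketch. It does not use hoop.dev's rule language, which this post does not show; the function and class names are invented for the example, and the keyed hash (HMAC) is one reasonable way to hash low-entropy values such as SSNs, not necessarily the one the gateway uses.

```python
import hashlib
import hmac
import secrets


def static_mask(_value: str) -> str:
    """Static placeholder: the consumer learns nothing about the value."""
    return "***"


def keyed_hash(value: str, key: bytes) -> str:
    """Deterministic keyed hash: equal inputs give equal outputs, so joins
    and grouping still work, but the value cannot be read back."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


class TokenVault:
    """In-memory tokenization: random tokens that only the vault can reverse."""

    def __init__(self) -> None:
        self._by_value: dict[str, str] = {}
        self._by_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        if value not in self._by_value:
            token = "tok_" + secrets.token_hex(8)
            self._by_value[value] = token
            self._by_token[token] = value
        return self._by_value[value]

    def detokenize(self, token: str) -> str:
        return self._by_token[token]
```

A static placeholder suits values the model never needs, a keyed hash suits columns the model must correlate but not read, and tokenization suits cases where an authorized system must later recover the original.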

Next steps

Start by reviewing the getting‑started guide to deploy hoop.dev in your environment. Then explore the feature documentation for details on defining masking rules and approval workflows. When you are ready to see the code, the full open‑source repository is available on GitHub:

Explore the hoop.dev repository

Hoop is MIT-licensed infrastructure for controlling how AI agents reach production data. Star hoophq/hoop so you can inspect it, deploy it, or share it when your team starts governing agent access.
