Putting access controls around GitHub Copilot: data masking for AI coding agents (on Postgres)

Many teams assume that giving an AI coding assistant the same database credentials as a developer is safe because the assistant only runs code suggestions. In reality, the assistant can retrieve full query results, and without data masking those results travel back to the model unfiltered.

Today most organizations let GitHub Copilot connect directly to a PostgreSQL instance using a static service account or a shared password stored in CI pipelines. The connection bypasses any central enforcement point, so every query executes with full privileges and the raw response returns to the AI model. No one sees a log of which tables were read, no fields are redacted, and no human ever approves a risky data‑exfiltration request. The result is a blind spot: a powerful AI agent can unintentionally expose or leak sensitive data simply by answering a developer’s prompt.

Introducing a non‑human identity for the AI agent and scoping that identity to the minimum set of tables is a step forward. The token can be minted via OIDC, and the agent can be granted read‑only access to a specific schema. However, the request still travels straight to PostgreSQL, bypassing any gate that could inspect the payload, mask columns, or record the session. Without a data‑path control, the organization still lacks visibility, cannot enforce column‑level redaction, and cannot require a human approval before a query that touches regulated data runs.
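
A least-privilege grant for such an identity can be sketched as follows. This is only an illustration: the role name `copilot_agent` and the schema name `analytics` are hypothetical, and the helper simply produces standard PostgreSQL `GRANT` statements for an administrator to review and run.

```python
import re

# Conservative identifier check so names can be interpolated into SQL safely.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def read_only_grants(role: str, schema: str) -> list[str]:
    """Build the statements that give `role` read-only access to one schema."""
    for name in (role, schema):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return [
        # Allow the role to resolve objects inside the schema.
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
        # Read access to the tables that exist today.
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role};",
        # Read access to tables created later by the role running this statement.
        f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} GRANT SELECT ON TABLES TO {role};",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for statement in read_only_grants("copilot_agent", "analytics"):
        print(statement)
```

Scoping like this limits which tables the agent can read, but it does nothing about what happens to the rows once they leave the database.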

hoop.dev solves this gap by inserting a Layer 7 gateway between the AI agent and PostgreSQL. The gateway acts as an identity‑aware proxy: it validates the OIDC token, determines the agent’s permissions, and then forwards the query to the database. Because the gateway sits in the data path, it can apply data masking to any response before it reaches the Copilot model. Sensitive columns such as ssn, credit_card_number, or internal API keys are replaced with placeholder values or omitted entirely, ensuring the AI never sees raw secrets.
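
To make the idea concrete, here is a minimal sketch of response masking in a proxy. It is not hoop.dev's implementation: the function name, the placeholder string, and the `api_key` column name are assumptions chosen for illustration.

```python
from typing import Any, Sequence

# Hypothetical list of sensitive columns; a real gateway would load this from policy.
SENSITIVE_COLUMNS = frozenset({"ssn", "credit_card_number", "api_key"})
PLACEHOLDER = "***MASKED***"


def mask_rows(
    columns: Sequence[str],
    rows: Sequence[Sequence[Any]],
    sensitive: frozenset = SENSITIVE_COLUMNS,
    drop: bool = False,
) -> tuple[list[str], list[tuple]]:
    """Replace sensitive values with a placeholder, or omit those columns if drop=True."""
    # Indexes of the columns that remain in the output.
    keep = [i for i, name in enumerate(columns) if not (drop and name.lower() in sensitive)]
    out_columns = [columns[i] for i in keep]
    out_rows = [
        tuple(PLACEHOLDER if columns[i].lower() in sensitive else row[i] for i in keep)
        for row in rows
    ]
    return out_columns, out_rows


if __name__ == "__main__":
    cols = ["id", "email", "ssn"]
    data = [(1, "a@example.com", "123-45-6789")]
    print(mask_rows(cols, data))             # ssn value replaced by the placeholder
    print(mask_rows(cols, data, drop=True))  # ssn column omitted entirely
```

The two modes correspond to the two options described above: substituting a placeholder keeps the result shape stable for the client, while dropping the column removes any hint that the field exists.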

Why data masking matters for GitHub Copilot

Data masking is one of the most reliable ways to keep a large language model from absorbing or leaking production data. When a query returns a row containing personal identifiers, those values enter the model's context, and the model could reproduce them or their patterns in later completions. By masking at the gateway, the organization ensures that only sanitized data reaches the model, which supports privacy and regulatory compliance.

How the gateway enforces masking and audit

When an engineer invokes Copilot inside their IDE, the IDE sends the generated SQL to the AI service, which then forwards the request through hoop.dev. The gateway checks the user’s OIDC token, confirms the agent’s role, and then applies a policy that specifies which columns must be masked for that role. The policy runs in real time; the gateway rewrites the result set, stripping or tokenizing the protected fields before the response is handed back to the AI. Because the gateway records every request and response, it builds a complete audit trail automatically. Security teams can replay any session, see exactly which tables were queried, and verify that masking rules were applied correctly.
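
The flow described above (look up the role's policy, rewrite the result set, record the request) can be sketched as below. The policy table, role names, and audit record fields are hypothetical and do not reflect hoop.dev's actual policy format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical mapping of role -> columns that must be masked for that role.
MASKING_POLICY = {
    "copilot-agent": {"ssn", "credit_card_number"},
    "support-agent": {"credit_card_number"},
}


@dataclass
class AuditLog:
    """In-memory stand-in for the gateway's audit trail."""
    entries: list = field(default_factory=list)

    def record(self, role: str, query: str, masked_columns: list) -> dict:
        entry = {
            "time": datetime.now(timezone.utc).isoformat(),
            "role": role,
            "query": query,
            "masked_columns": masked_columns,
        }
        self.entries.append(entry)
        return entry


def handle_result(role, query, columns, rows, audit):
    """Apply the role's masking policy to a result set and log what was masked."""
    if role not in MASKING_POLICY:
        # Deny by default: a role without a policy gets no data.
        raise PermissionError(f"no masking policy for role {role!r}")
    protected = MASKING_POLICY[role]
    masked = sorted(c for c in columns if c in protected)
    sanitized = [
        tuple("<masked>" if c in protected else v for c, v in zip(columns, row))
        for row in rows
    ]
    audit.record(role, query, masked)
    return sanitized


if __name__ == "__main__":
    log = AuditLog()
    rows = handle_result(
        "copilot-agent",
        "SELECT id, ssn FROM customers",
        ["id", "ssn"],
        [(1, "123-45-6789")],
        log,
    )
    print(rows)
    print(log.entries[0]["masked_columns"])
```

Recording which columns were masked alongside the query is what lets a reviewer later verify that the rule was applied, not just that the query ran.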

Just‑in‑time access and approvals

Beyond masking, hoop.dev can require human approval for queries that touch high‑risk tables. When the AI attempts to read from a regulated schema, the gateway pauses the request and routes it to an approver. Only after the approver grants permission does the gateway forward the query, apply masking, and return the result. This just‑in‑time workflow ensures that even privileged AI agents cannot run dangerous queries without oversight.
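
A simplified version of that gate might look like the sketch below. The schema names and the `approved_by` parameter are assumptions, and the regular expression is a deliberately naive stand-in for the SQL parsing a real gateway would need.

```python
import re
from enum import Enum
from typing import Optional

# Hypothetical schemas that require a human approval before any read.
HIGH_RISK_SCHEMAS = {"billing", "pii"}

# Naive match for "FROM schema.table" / "JOIN schema.table"; not a real SQL parser.
_SCHEMA_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][A-Za-z0-9_]*)\.[A-Za-z_]", re.IGNORECASE)


class Decision(Enum):
    FORWARD = "forward"
    NEEDS_APPROVAL = "needs_approval"


def referenced_schemas(sql: str) -> set:
    """Return the lower-cased schema names that appear after FROM or JOIN."""
    return {match.group(1).lower() for match in _SCHEMA_REF.finditer(sql)}


def gate(sql: str, approved_by: Optional[str] = None) -> Decision:
    """Hold queries that touch a high-risk schema until an approver is recorded."""
    risky = referenced_schemas(sql) & HIGH_RISK_SCHEMAS
    if risky and approved_by is None:
        return Decision.NEEDS_APPROVAL
    return Decision.FORWARD


if __name__ == "__main__":
    print(gate("SELECT * FROM pii.customers"))
    print(gate("SELECT * FROM pii.customers", approved_by="alice"))
    print(gate("SELECT id FROM public.orders"))
```

In this sketch the approval is just a name passed by the caller; in practice the gateway would hold the request open and resume it only when the approver responds.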

Benefits of the gateway approach

All PostgreSQL traffic from Copilot funnels through a single control point.
Column‑level data masking is enforced consistently, regardless of the client.
Every session is recorded, providing evidence for audits and investigations.
Just‑in‑time approvals reduce the blast radius of accidental data exposure.
The solution is open source, MIT‑licensed, and can be self‑hosted behind your own network.

To try this architecture, start with hoop.dev’s quick‑start deployment. The documentation walks you through installing the gateway, configuring OIDC, defining masking policies for PostgreSQL, and connecting GitHub Copilot via the standard client. All of the heavy lifting (credential storage, token validation, and policy enforcement) happens inside the gateway, keeping the AI agent and your developers out of the trust loop.

For detailed steps, see the getting‑started guide. The full source code and contribution guidelines are available on GitHub. Once deployed, you can refine masking rules in the learn section and integrate the gateway with your existing identity provider.

FAQ

Q: Does hoop.dev store my PostgreSQL credentials?
A: The gateway holds the credential in memory only for the duration of a session. Users and AI agents never see the raw password, and the secret is never written to disk.

Q: Can I mask different columns for different AI roles?
A: Yes. Masking policies are tied to the identity token, so you can define fine‑grained rules per role, per schema, or per table.

Q: How does this help with compliance audits?
A: Because the gateway records every query and response, you have a complete, searchable audit log that shows who asked for what data, when, and whether masking was applied. This evidence can support many regulatory audit requirements without additional tooling.
