Non-human identity: what it means for your data exfiltration (on Postgres)

Many teams assume that protecting a service account’s password is enough to stop data exfiltration. The reality is that the identity that runs a workload, whether a CI job, an AI‑driven agent, or a scheduled batch process, can be the vector that carries data out of a database. When a non‑human identity connects directly to Postgres, it inherits the same unrestricted network path as a human user, and the platform often lacks any visibility into what that identity is doing.

How teams typically connect today

In most environments, a service account is created once, its credentials are baked into configuration files or environment variables, and the same secret is reused by every automation job that needs database access. The connection goes straight from the host where the job runs to the Postgres instance over the internal network. Because nothing sits between the job and the database, there is no place to enforce query‑level policies, mask column values, or record the exact statements that were executed. Auditors therefore see only that the service account was used, not the content of the queries or the volume of data that left the system.
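To make the pattern concrete, here is a minimal Python sketch of what it tends to look like in job code. The environment variable names, host, and job names are hypothetical, chosen only for illustration; the point is that unrelated jobs end up with an identical long‑lived credential and the same direct path to Postgres.

```python
from urllib.parse import quote


def build_direct_dsn(env: dict) -> str:
    """Build a libpq-style connection URI from a shared, static secret.

    The variable names are hypothetical. Every job that calls this reads
    the same long-lived credential and connects straight to Postgres.
    """
    user = quote(env["DB_USER"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")
    host = env["DB_HOST"]
    name = env["DB_NAME"]
    return f"postgresql://{user}:{password}@{host}:5432/{name}"


# Illustrative values only: two unrelated jobs share one configuration.
shared_env = {
    "DB_USER": "svc_automation",
    "DB_PASSWORD": "s3cret/long-lived",
    "DB_HOST": "pg.internal",
    "DB_NAME": "orders",
}
nightly_export_dsn = build_direct_dsn(shared_env)
ci_migration_dsn = build_direct_dsn(shared_env)
```

From the database's point of view the two jobs are indistinguishable, which is why the audit record can say little more than "the service account connected".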

What non‑human identity changes, and what it leaves untouched

Introducing non‑human identities, such as short‑lived OIDC tokens for CI pipelines or dedicated AI agent profiles, is a step toward least‑privilege access. The token can be scoped to a single database role, limiting the commands that can be run. However, the request still travels directly to Postgres. Without an intervening control plane, the database sees the request as if it came from a trusted client. No inline masking occurs, no command is blocked, and no approval workflow can intervene before a bulk export runs. The core problem, lack of an enforceable data path, remains.

Why a Layer 7 gateway is the missing piece

Placing a Layer 7, protocol‑aware gateway between the non‑human identity and the Postgres server creates a single enforcement surface. The gateway inspects each SQL statement as it passes through, applies policy rules, and records the full session for later replay. Because the gateway holds the database credential, the client never sees it, eliminating credential leakage at the edge.

Setup: identity and token provisioning

First, define the non‑human identity in your identity provider (Okta, Azure AD, Google Workspace, etc.). Issue short‑lived OIDC tokens that are mapped to a specific Postgres role. The token proves who is making the request, but it does not grant direct network access; the request must still be routed through the gateway.
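The mapping from token to role can be sketched as follows. This is an illustration under stated assumptions, not hoop.dev's implementation: the `groups` claim, the group names, and the role names are hypothetical, and signature verification is assumed to have happened before this function is called. Only `sub` and `exp` are standard OIDC claims.

```python
import time
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from identity-provider group to Postgres role.
GROUP_TO_ROLE = {
    "ci-pipelines": "ci_readonly",
    "ai-agents": "agent_restricted",
}


@dataclass(frozen=True)
class Grant:
    subject: str      # the "sub" claim: which workload is asking
    role: str         # the single Postgres role the token maps to
    expires_at: int   # the "exp" claim, seconds since the epoch


def resolve_grant(claims: dict, now: Optional[float] = None) -> Grant:
    """Map already-verified OIDC claims to exactly one Postgres role.

    Assumes the token signature was verified upstream; this sketch only
    checks expiry and group membership.
    """
    now = time.time() if now is None else now
    if claims["exp"] <= now:
        raise PermissionError("token expired")
    roles = [GROUP_TO_ROLE[g] for g in claims.get("groups", []) if g in GROUP_TO_ROLE]
    if len(roles) != 1:
        raise PermissionError("token must map to exactly one database role")
    return Grant(subject=claims["sub"], role=roles[0], expires_at=claims["exp"])
```

Rejecting tokens that map to zero roles or to more than one keeps the grant unambiguous, which is one way to read "mapped to a specific Postgres role".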

The data path: the gateway as the only place enforcement can happen

When the client presents its token, the gateway validates it, extracts the group membership, and determines the allowed SQL actions. Every statement, whether a SELECT, an INSERT, a COPY, or the traffic generated by a tool such as pg_dump, passes through this point. The gateway can:

  • Mask sensitive columns, for example replacing credit‑card numbers with asterisks before they reach the client.
  • Block commands that attempt to write large result sets to external storage unless an explicit approval is recorded.
  • Require just‑in‑time approval for bulk export operations, routing the request to a human reviewer.
  • Record the full query text, timestamps, and the identity that issued it, creating an audit trail.
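The first three controls in this list can be sketched in a few lines of Python. This is a simplified illustration, not the gateway's actual policy engine: the function names and decision strings are hypothetical, and the regular expressions are a naive stand‑in for the SQL parsing a protocol‑aware gateway would need in practice.

```python
import re

# Naive detector for bulk export: COPY ... TO (file, program, or STDOUT).
BULK_EXPORT = re.compile(r"^\s*COPY\b.*\bTO\b", re.IGNORECASE | re.DOTALL)
# Sixteen digits in groups of four, optionally separated by space or dash.
CARD_NUMBER = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def decide(sql: str, approved: bool = False) -> str:
    """Hold bulk exports until an approval is recorded; allow the rest."""
    if BULK_EXPORT.match(sql) and not approved:
        return "needs_approval"
    return "allow"


def mask_row(row: dict, sensitive: set) -> dict:
    """Replace card-number digits with asterisks in the named columns."""
    masked = {}
    for column, value in row.items():
        if column in sensitive and isinstance(value, str):
            masked[column] = CARD_NUMBER.sub(
                lambda m: re.sub(r"\d", "*", m.group(0)), value
            )
        else:
            masked[column] = value
    return masked
```

For example, `decide("COPY customers TO STDOUT")` is held for approval, while the same statement with `approved=True` passes. Both functions run on the gateway side of the connection, so the client only ever receives the masked row.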

All of these outcomes are possible only because the gateway, hoop.dev in this design, sits in the data path. Without it, the non‑human identity would communicate directly with Postgres, and none of these controls could be applied.

Enforcement outcomes: stopping data exfiltration at the source

hoop.dev records each session, so any suspicious export can be traced back to the exact token and job that initiated it. Inline masking ensures that even if a compromised CI job runs a SELECT on a table containing personal data, the response never contains raw values. Command blocking prevents a malicious script from invoking a bulk copy without prior approval, effectively cutting off the most common pathway for large‑scale data theft. Finally, just‑in‑time approval adds a human decision point for high‑risk operations, turning an automated request into a controlled workflow.
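To illustrate how a recorded session supports that kind of tracing, here is a minimal sketch of an audit log kept as JSON lines. The field names (`subject`, `token_id`, `decision`) are assumptions made for the example and do not reflect hoop.dev's actual record format.

```python
import json
from datetime import datetime, timezone


def audit_record(subject: str, token_id: str, sql: str,
                 decision: str, at: datetime) -> str:
    """Serialize one statement as a JSON line; field names are hypothetical."""
    return json.dumps({
        "ts": at.astimezone(timezone.utc).isoformat(),
        "subject": subject,
        "token_id": token_id,
        "sql": sql,
        "decision": decision,
    }, sort_keys=True)


def trace_exports(lines: list) -> list:
    """Return (subject, token_id) for every recorded COPY statement."""
    hits = []
    for line in lines:
        record = json.loads(line)
        if record["sql"].lstrip().upper().startswith("COPY"):
            hits.append((record["subject"], record["token_id"]))
    return hits
```

Because each record carries both the statement and the identity behind it, a reviewer can answer "which job ran this export" from the log alone, which the shared service account model cannot do.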

Getting started

To see these controls in action, follow the getting started guide and explore the learn page for deeper details on masking, audit, and approval policies.

FAQ

Q: Does using a short‑lived token eliminate the need for a gateway?
A: No. The token limits what the identity can do, but without a gateway there is no place to enforce query‑level policies, mask data, or record the activity.

Q: Can hoop.dev prevent an attacker from exfiltrating data if they compromise a CI runner?
A: Yes. The gateway can block bulk export commands, require human approval for large result sets, and mask sensitive columns before any data reaches the client.

Q: How does the audit trail help with compliance?
A: Because hoop.dev records every statement together with the identity that issued it, auditors can demonstrate who accessed what data and when, satisfying evidence requirements for many standards.

Explore the source code, contribute improvements, and adapt the gateway to your own compliance needs on GitHub.

Open source

Save the open-source gateway for agent data access

Hoop is MIT-licensed infrastructure for controlling how AI agents reach production data. Star hoophq/hoop so you can inspect it, deploy it, or share it when your team starts governing agent access.
