Non-human identity: what it means for data exfiltration (on-prem)

Every non‑human service account that can pull data out of an on‑prem system is a potential data exfiltration vector.

In many organizations the same API key or SSH key is baked into dozens of automation jobs, CI pipelines, and batch scripts. The credential lives in source control, on shared disks, or in environment variables that multiple engineers can read. Because the identity is not tied to a single person, there is no natural audit trail that shows who initiated a request, what data was returned, or whether the response was later forwarded to an external destination.

When a breach occurs, investigators often find that a service account was used to dump tables, archive log files, or copy configuration snapshots to an attacker‑controlled host. Without per‑request visibility, it is very hard to prove whether the activity was legitimate automation or malicious exfiltration. The risk is amplified on‑prem, where network segmentation is often weaker and data frequently resides in legacy databases that lack built‑in activity logging.

Why non‑human identities are a blind spot

Setup mechanisms such as OIDC federation or static service‑account tokens establish *who* is making a request, but they do not constrain *what* the request does. A service account may be granted read‑only access to a database, yet nothing stops the holder from issuing SELECT * FROM users and piping the result to a remote server. The identity system alone cannot block that behavior because there is no enforcement point between the credential and the data.

Most teams rely on perimeter firewalls or host‑based controls, assuming that because the credential is “internal” it cannot be abused. That assumption fails when a compromised build server or a malicious insider reuses the same token to reach downstream systems. The data path between the credential holder and the target resource is left open, with no place to inspect, approve, or redact traffic.

The missing enforcement layer

The precondition for a secure environment is simple: you must still use non‑human identities for automation, but every request must pass through a point where policy can be applied. Without that point, the request reaches the database, Kubernetes API, or SSH daemon directly, leaving three gaps:

  • No real‑time audit of the exact query or command that was executed.
  • No ability to mask sensitive columns or fields before they leave the target.
  • No workflow to require human approval for high‑risk operations such as bulk exports.

These gaps exist even when the setup stage correctly scopes the service account to the minimum required permissions. The enforcement outcomes (session recording, inline masking, just‑in‑time approval, and command blocking) cannot be achieved without a dedicated gateway in the data path.
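To make the idea of an enforcement point concrete, here is a minimal Python sketch of a wrapper that every command must pass through before it reaches the target. It is not hoop.dev's implementation: the names `enforce`, `AuditEvent`, and `BLOCKED_PATTERNS`, as well as the deny rules themselves, are hypothetical and chosen only for illustration. The sketch covers just two of the outcomes, per‑command audit and command blocking.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List

# Hypothetical deny rules; a real policy would be far more nuanced.
BLOCKED_PATTERNS = [
    re.compile(r"^\s*drop\s+table", re.IGNORECASE),
    re.compile(r"\binto\s+outfile\b", re.IGNORECASE),
]


@dataclass
class AuditEvent:
    timestamp: str   # UTC, ISO 8601
    identity: str    # the service account making the request
    command: str     # the exact command text that was submitted
    allowed: bool    # whether policy let it through


def enforce(identity: str, command: str,
            execute: Callable[[str], object],
            audit_log: List[AuditEvent]) -> object:
    """Record the command, then run it only if no deny rule matches."""
    allowed = not any(p.search(command) for p in BLOCKED_PATTERNS)
    # The event is written before execution, so blocked attempts are audited too.
    audit_log.append(AuditEvent(
        timestamp=datetime.now(timezone.utc).isoformat(),
        identity=identity,
        command=command,
        allowed=allowed,
    ))
    if not allowed:
        raise PermissionError(f"blocked by policy: {command!r}")
    return execute(command)
```

The point of the design is that `execute` is only reachable through `enforce`. If the service account can call the target directly, neither the audit entry nor the block applies.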

How hoop.dev stops data exfiltration

hoop.dev is a Layer 7 gateway that sits between the identity system and the on‑prem resource. It is the only component that can inspect protocol‑level traffic, apply policies, and produce evidence for auditors. Because the gateway terminates the connection, it becomes the authoritative place for enforcement.

When a service account initiates a connection, hoop.dev validates the OIDC or SAML token, extracts group membership, and then forwards the request to the target only after applying the configured guardrails. The gateway records each session, so an auditor can replay the exact sequence of commands that led to a data export. If a query returns columns marked as sensitive, hoop.dev masks those fields in‑flight, ensuring that downstream logs or screenshots never contain raw values.
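As a rough illustration of in‑flight masking, the Python sketch below rewrites result rows as they stream back to the client. It assumes rows arrive as dictionaries and that policy supplies a set of sensitive column names. `mask_rows`, `SENSITIVE_COLUMNS`, and the `****` placeholder are hypothetical names, not hoop.dev's actual API or masking format.

```python
from typing import Dict, Iterable, Iterator, Set

# Hypothetical set of column names that policy has flagged as sensitive.
SENSITIVE_COLUMNS: Set[str] = {"email", "ssn"}
MASK = "****"


def mask_rows(rows: Iterable[Dict[str, object]],
              sensitive: Set[str] = SENSITIVE_COLUMNS
              ) -> Iterator[Dict[str, object]]:
    """Yield a copy of each row with sensitive, non-null values masked."""
    for row in rows:
        # Build a new dict so the source row (and the database) stay untouched.
        yield {
            col: (MASK if col in sensitive and val is not None else val)
            for col, val in row.items()
        }
```

Because the function is a generator, rows are masked one at a time as they pass through, so the full result set never needs to be held in memory. For example, `list(mask_rows([{"id": 1, "email": "a@example.com"}]))` yields a row whose `email` value is `"****"`, while `id` is left as is.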

For operations that exceed a predefined risk threshold, such as extracting more than 10 GB of data or accessing tables that contain personally identifiable information, hoop.dev routes the request to a human approver. The approver can grant a one‑time token, after which the gateway lets the request continue. If the request is disallowed, hoop.dev blocks the command before it reaches the database, preventing the exfiltration attempt entirely.
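The routing logic described above can be sketched as a small decision function. This is an illustrative Python example under stated assumptions, not hoop.dev's policy engine. `Request`, `Decision`, `decide`, the table sets, and the idea that the gateway has an estimated result size available are all hypothetical. The 10 GB threshold is the example figure from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    BLOCK = "block"


@dataclass(frozen=True)
class Request:
    identity: str
    tables: FrozenSet[str]
    estimated_bytes: int  # assumed to be known or estimated by the gateway


# Hypothetical policy inputs.
PII_TABLES = frozenset({"users", "payments"})
DENIED_TABLES = frozenset({"secrets"})
MAX_UNAPPROVED_BYTES = 10 * 10**9  # the 10 GB example threshold


def decide(req: Request, approved: bool = False) -> Decision:
    """Classify a request as allowed, needing human approval, or blocked."""
    # Denied tables are blocked outright; approval cannot override this.
    if req.tables & DENIED_TABLES:
        return Decision.BLOCK
    high_risk = (req.estimated_bytes > MAX_UNAPPROVED_BYTES
                 or bool(req.tables & PII_TABLES))
    if high_risk and not approved:
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW
```

In this sketch the `approved` flag stands in for the one‑time grant from a human approver. A high‑risk request is held until that flag is set, and a request against a denied table is refused regardless.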

All of these enforcement outcomes happen because hoop.dev is positioned in the data path; the service account never talks directly to the target. The setup stage still decides *who* can request access, but hoop.dev is the only place that decides *what* they are allowed to do.

Getting started with a secure non‑human identity flow

To adopt this model, begin with the getting‑started guide. Deploy the gateway near your on‑prem assets, register each service account as a connection, and define masking and approval policies that match your risk appetite. The learn page provides deeper examples of policy syntax and replay workflows.

Because hoop.dev is open source, you can inspect the code, contribute improvements, or run it behind your own firewall. The repository includes Helm charts and Docker Compose files that simplify deployment in a variety of environments.

FAQ

Does hoop.dev replace existing IAM policies? No. IAM or OIDC configuration still determines which service accounts are allowed to request a connection. hoop.dev augments that decision by enforcing what those accounts can actually do once the request reaches the data path.

Can hoop.dev mask data without affecting application logic? Yes. Masking is applied only to the response stream that passes through the gateway, leaving the underlying database unchanged. Clients that connect through the gateway receive the masked view, while the original values stay untouched in storage.

What evidence does hoop.dev provide for auditors? Each session is recorded with timestamps, user identity, and the exact commands issued. Approvals and masking actions are logged alongside the session, giving an audit trail that ties each data access to an identity, a command, and a decision.

Explore the open‑source repository on GitHub to see the implementation details and contribute to the project.

