Sensitive Data Discovery for Non-Human Identities: A Practical Guide

How can you reliably perform sensitive data discovery for service accounts and other non‑human identities?

Most teams treat a service account like a static password: they create a long‑lived credential, embed it in a CI pipeline, and never look at what that identity reads or writes. The result is a blind spot. Sensitive values such as API keys, customer PII, and internal secrets can slip into logs, configuration files, or database query results without anyone noticing. Because the identity is non‑human, there is no natural audit trail, and traditional user‑centric discovery tools simply ignore it.

This lack of visibility makes compliance audits painful and increases the risk of accidental data leakage. Without a systematic way to surface the data that a non‑human identity touches, you cannot apply masking, alerting, or remediation consistently.

Sensitive data discovery for non‑human identities

Current practices rely on three pillars:

  • Setup. Engineers provision service accounts in an identity provider, assign them IAM roles or database permissions, and trust the credential to stay secret.
  • Direct connections. The service account connects straight to the target (a database, a Kubernetes API, or an SSH host) without an intervening proxy.
  • Ad‑hoc scans. Periodic scripts query logs or run data‑loss‑prevention tools, but they run outside the request flow and miss real‑time activity.

Setup determines who can start a session, but it does not enforce what that session can see. Direct connections bypass any enforcement point, so there is no guarantee that sensitive fields are masked or that a command is approved before execution. Ad‑hoc scans happen after the fact; they cannot block a dangerous query in the moment.
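To make the limitation of ad‑hoc scans concrete, here is a minimal sketch of an after‑the‑fact log scan. The patterns, the log format, and the function name are assumptions made for illustration; real data‑loss‑prevention tools use far richer detectors. The point is that the scan only reports what has already been written.

```python
import re

# Hypothetical detectors, chosen only for this illustration.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\bapi_key=([A-Za-z0-9_\-]{16,})"),
}


def scan_log_lines(lines):
    """Return (line_number, pattern_name) for each match in already-written logs."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings


# Made-up log lines in an assumed format.
sample = [
    "2024-01-01 svc-ci SELECT name FROM users",
    "2024-01-01 svc-ci result ssn=123-45-6789",
    "2024-01-01 svc-ci GET /v1/items?api_key=abcd1234efgh5678ijkl",
]
findings = scan_log_lines(sample)
```

By the time this scan flags lines 2 and 3, the values have already left the database and landed in a log file, which is exactly the gap described above.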

What a data‑path gateway must provide

To close the gap, the enforcement layer must sit on the data path, with every connection routed through it. That is the only point where you can guarantee that every request is checked before it reaches the target and every response is examined before it reaches the caller.

The gateway’s responsibilities include:

  • Recording each session so you have an immutable audit trail.
  • Masking fields that contain sensitive data in real time.
  • Requiring just‑in‑time approval for high‑risk commands.
  • Blocking commands that violate policy before they are sent to the target.

Only a component that sits in the data path can provide all of these outcomes simultaneously.
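As a rough illustration of the blocking and approval responsibilities, the sketch below classifies a command before it is forwarded. The `Policy` class, the pattern lists, and the `decide` function are hypothetical names invented for this example, not the rule syntax of any particular gateway.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Policy:
    # Hypothetical rules: commands that are never forwarded.
    blocked: list = field(default_factory=lambda: [
        re.compile(r"^\s*TRUNCATE\b", re.IGNORECASE),
    ])
    # Hypothetical rules: commands that wait for a human approver.
    needs_approval: list = field(default_factory=lambda: [
        re.compile(r"^\s*DROP\s+DATABASE\b", re.IGNORECASE),
        re.compile(r"^\s*kubectl\s+delete\s+namespace\b", re.IGNORECASE),
    ])


def decide(policy, command):
    """Return 'block', 'needs_approval', or 'allow' for a single command."""
    if any(p.search(command) for p in policy.blocked):
        return "block"
    if any(p.search(command) for p in policy.needs_approval):
        return "needs_approval"
    return "allow"


policy = Policy()
```

Because the decision runs before the command is sent, a "block" result means the target never sees the statement at all. Blocked rules are checked first so that an approval can never override them.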

Introducing hoop.dev as the enforcement layer

hoop.dev fulfills the role of that data‑path gateway. It sits between non‑human identities and the infrastructure they access. Because hoop.dev proxies the connection, every request and response passes through it, allowing the platform to apply the controls listed above.

Identity setup, such as OIDC tokens or service‑account roles, still decides which identity is allowed to start a session, but hoop.dev is the only place where enforcement happens. hoop.dev records each session, masks sensitive fields on the fly, and can pause a command for a human approver before it reaches the target. The result is a complete, real‑time view of what a non‑human identity is doing and the ability to intervene instantly.

How it works for non‑human identities

When a CI job or automation script initiates a connection, it presents an OIDC token that identifies the service account. hoop.dev validates the token, extracts group membership, and checks the policy associated with that identity. If the policy allows the requested operation, hoop.dev opens a proxied channel to the target, be it a PostgreSQL database, a Kubernetes API server, or an SSH host.
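The policy check in that flow can be sketched as a lookup from group membership to allowed operations. This is a simplified illustration, not hoop.dev's implementation: it assumes the token signature has already been verified elsewhere, that group names arrive in a `groups` claim (a common convention, not part of the core OIDC claims), and that `POLICIES` and `is_allowed` are hypothetical names.

```python
# Hypothetical policy table: group name -> set of (target, operation) pairs.
POLICIES = {
    "ci-readers": {("postgres", "read")},
    "deployers": {("kubernetes", "read"), ("kubernetes", "write")},
}


def is_allowed(claims, target, operation):
    """Check already-validated token claims against the policy table.

    Signature and expiry validation are assumed to have happened earlier
    and are out of scope for this sketch.
    """
    groups = claims.get("groups", [])
    return any((target, operation) in POLICIES.get(group, set()) for group in groups)


# Example claims for a made-up CI service account.
claims = {"sub": "svc-ci-build", "groups": ["ci-readers"]}
```

An identity with no matching group is denied by default, so a missing or empty `groups` claim never opens a channel.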

During the session, hoop.dev inspects each response. If a column named api_key or ssn appears, hoop.dev replaces the value with a placeholder, ensuring that downstream tools never see the raw secret. For commands that match a high‑risk pattern, such as DROP DATABASE or kubectl delete namespace, hoop.dev can trigger a just‑in‑time approval workflow, requiring a human to approve before the command proceeds.
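Column‑based masking of a result set can be sketched in a few lines. The column names come from the example above; the placeholder string, the row format (a list of column names plus tuples of values), and the function name are assumptions for illustration only.

```python
SENSITIVE_COLUMNS = {"api_key", "ssn"}
PLACEHOLDER = "****"  # assumed placeholder; any fixed token would do


def mask_rows(columns, rows):
    """Replace values in sensitive columns with a placeholder.

    `columns` is the list of column names in the response; `rows` is a
    list of tuples in the same order.
    """
    sensitive_idx = {
        i for i, name in enumerate(columns) if name.lower() in SENSITIVE_COLUMNS
    }
    return [
        tuple(PLACEHOLDER if i in sensitive_idx else value
              for i, value in enumerate(row))
        for row in rows
    ]


columns = ["id", "email", "ssn"]
rows = [(1, "a@example.com", "123-45-6789"), (2, "b@example.com", "987-65-4321")]
masked = mask_rows(columns, rows)
```

Matching on column names is the simplest approach and misses sensitive values stored under unexpected names, so a real deployment would likely combine it with content‑based detection.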

All activity is logged. The audit record includes the identity, the exact command, the time, and whether any masking or approval took place. Because the gateway holds the credential, the service account never sees the underlying password or key, reducing the blast radius if the automation environment is compromised.
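One possible shape for such an audit record is sketched below. The field names, the approval states, and the JSON‑lines serialization are assumptions chosen for illustration; the source only specifies that the record captures the identity, the command, the time, and any masking or approval.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: a record cannot be modified after creation
class AuditRecord:
    identity: str
    command: str
    timestamp: str          # ISO 8601, UTC
    masked_fields: tuple    # names of fields masked in the response
    approval: str           # assumed states: "not_required", "approved", "denied"


def make_record(identity, command, masked_fields=(), approval="not_required", now=None):
    """Build a record; `now` can be injected to make tests deterministic."""
    now = now or datetime.now(timezone.utc)
    return AuditRecord(identity, command, now.isoformat(), tuple(masked_fields), approval)


def to_json_line(record):
    """Serialize a record as one JSON line, suitable for append-only storage."""
    return json.dumps(asdict(record), sort_keys=True)


record = make_record(
    "svc-ci-build",
    "SELECT id, ssn FROM users",
    masked_fields=["ssn"],
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
line = to_json_line(record)
```

A frozen dataclass only prevents in‑process mutation; making the stored trail tamper‑resistant is a property of the storage layer, not of this structure.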

Getting started

To try this approach, follow the getting started guide. The documentation walks you through deploying the gateway, registering a non‑human identity, and configuring a simple policy that masks credit‑card numbers in database responses. For deeper details on masking, approval workflows, and audit storage, see the learn section.
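To give a feel for what a credit‑card masking rule does conceptually, here is a generic sketch that finds candidate card numbers in text, confirms them with the Luhn checksum, and keeps only the last four digits. This is not hoop.dev's configuration format or detection logic; the regular expression and function names are assumptions for illustration.

```python
import re

# Candidate: 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        d = int(char)
        if position % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_card_numbers(text):
    """Mask Luhn-valid card numbers, keeping only the last four digits."""
    def replace(match):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            return "*" * (len(digits) - 4) + digits[-4:]
        return match.group()  # leave non-card digit runs untouched
    return CANDIDATE.sub(replace, text)


masked_text = mask_card_numbers("card 4111 1111 1111 1111 ok")
```

The Luhn check keeps order IDs and other long digit runs from being masked by accident, at the cost of missing numbers that were mistyped before storage.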

FAQ

Does hoop.dev replace my existing IAM system?

No. hoop.dev relies on your existing identity provider for authentication. It adds a layer of enforcement on the data path, but it does not manage roles or permissions itself.

Can I use hoop.dev with any database?

hoop.dev supports the major relational and NoSQL databases listed in its documentation. The gateway works the same way for each: it proxies the protocol, inspects the payload, and applies masking or approval as defined by your policy.

Is there a performance impact?

Because hoop.dev operates at layer 7, it parses each request and response rather than just forwarding bytes, which adds a modest latency overhead. In practice the trade‑off is acceptable for the security and audit benefits it provides, and you can tune the policy to balance performance and protection.

Explore the open‑source repository on GitHub to see the code, contribute, or file an issue: github.com/hoophq/hoop.

Hoop is MIT-licensed infrastructure for controlling how AI agents reach production data. Star hoophq/hoop so you can inspect it, deploy it, or share it when your team starts governing agent access.
