How to Implement Non-Human Identities for RAG

An offboarded contractor’s CI job continues to call a retrieval‑augmented generation (RAG) pipeline using a hard‑coded service token that was never revoked. The token gives the job unrestricted read and write access to the vector store, the LLM endpoint, and any downstream APIs. Because the job runs under a non-human identity that is static and over‑privileged, there is no record of what was asked, no way to block a dangerous prompt, and no audit trail for compliance.

Why static credentials break RAG pipelines

Static secrets are attractive because they require no extra plumbing, but they create three hidden risks. First, the same credential is reused across environments, so a breach in one place compromises every RAG instance. Second, the credential carries more privileges than the job actually needs, inflating the blast radius of a mistake. Third, because the request bypasses any enforcement layer, there is no real‑time visibility into which prompts were sent or which responses contained sensitive data.

Non-human identity as a prerequisite

Replacing static secrets with a managed non-human identity addresses the first two risks. An identity provider (IdP) can issue a short‑lived OIDC token to each CI job, scoped to the exact RAG resources the job requires. This setup establishes who is making the request and whether it may start, but it does not by itself enforce policy on what the request does. Without a gateway, the request would still travel directly to the vector store and LLM, leaving the third risk, the lack of auditing and masking, unaddressed.
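To make "short‑lived and scoped" concrete, here is a minimal sketch of the checks a receiver might run on a token's claims. It assumes the token signature has already been verified by a proper JWT library, that scopes arrive as a space-separated `scope` claim, and that the scope name `rag:vectorstore:read` and the 15-minute lifetime cap are illustrative choices, not values taken from any particular IdP.

```python
import time


def check_claims(claims, required_scope, max_lifetime_s=900, now=None):
    """Return (ok, reason) for claims whose signature was verified elsewhere.

    `required_scope` and `max_lifetime_s` are hypothetical policy inputs.
    """
    now = time.time() if now is None else now
    exp = claims.get("exp")
    iat = claims.get("iat")
    if exp is None or iat is None:
        return False, "missing exp or iat"
    if now >= exp:
        return False, "token expired"
    # Reject tokens that were minted with an overly long lifetime.
    if exp - iat > max_lifetime_s:
        return False, "token lifetime too long"
    scopes = set(claims.get("scope", "").split())
    if required_scope not in scopes:
        return False, "missing scope"
    return True, "ok"
```

A token scoped only for reads would pass `check_claims(claims, "rag:vectorstore:read")` and fail the same call with a write scope, which is the narrowing that a static, over-privileged secret cannot provide.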

Putting hoop.dev in the data path

hoop.dev acts as a Layer 7 gateway that sits between the non-human identity and the RAG backend. Because hoop.dev proxies the connection, it is the only place enforcement can happen. hoop.dev verifies the OIDC token, checks the job’s group membership, and then applies just‑in‑time approval, inline masking of personally identifiable information, and session recording before the request reaches the vector store or LLM. The enforcement outcomes – audit logs, masked responses, and replayable sessions – exist only because hoop.dev sits in the data path.
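The idea of masking and recording in the data path can be illustrated with a small sketch. This is not hoop.dev's implementation: the regular expressions, the `handle` function, and the audit record fields are all hypothetical, and the two PII patterns are deliberately simple.

```python
import re

# Illustrative patterns only; real PII detection needs broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_pii(text):
    """Replace email addresses and US SSN-shaped strings with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return SSN.sub("[SSN]", text)


def handle(identity, prompt, backend):
    """Forward the prompt, mask the response, and build an audit record.

    `backend` is any callable that takes a prompt and returns text.
    """
    response = backend(prompt)
    masked = mask_pii(response)
    record = {
        "identity": identity,
        "prompt": prompt,
        "response_was_masked": masked != response,
    }
    return masked, record
```

Because the caller only ever receives the return value of `handle`, the unmasked response never leaves the proxy, and every call yields a record tied to the calling identity.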

Designing policies for non-human identity

When you register a RAG resource in hoop.dev, you attach a policy document that references the non-human identity’s groups. Policies can require manual approval for write operations, automatically mask fields that match PII patterns, and reject any request that exceeds a defined token‑usage quota. Because the policy lives in the gateway, it cannot be bypassed by a compromised CI job; the job’s token is validated on each request and the policy is re‑evaluated in real time.
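A policy of this shape can be sketched as a small decision function. The `Policy` fields, the group names, and the three outcomes below are assumptions made for illustration; they do not reflect hoop.dev's actual policy format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy attached to one RAG resource."""
    allowed_groups: frozenset
    approval_required_ops: frozenset = frozenset({"write"})
    token_quota: int = 10_000


def evaluate(policy, groups, operation, tokens_used, tokens_requested):
    """Return 'deny', 'needs_approval', or 'allow' for a single request."""
    # The identity must belong to at least one group the policy references.
    if not policy.allowed_groups & set(groups):
        return "deny"
    # Reject requests that would push usage past the quota.
    if tokens_used + tokens_requested > policy.token_quota:
        return "deny"
    # Write operations wait for manual approval.
    if operation in policy.approval_required_ops:
        return "needs_approval"
    return "allow"
```

Calling `evaluate` on every request, rather than once per session, mirrors the point above: a compromised job gains nothing from a decision that was made earlier.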

High‑level implementation steps

  • Define a service account or machine‑user in your IdP for each CI job that will run a RAG workflow.
  • Configure the IdP to issue short‑lived OIDC tokens scoped to the specific RAG endpoints the job needs.
  • Deploy the hoop.dev gateway in the same network segment as your vector store and LLM. The gateway runs a network‑resident agent that holds the backend credentials; the CI job never sees them.
  • Register the RAG resources in hoop.dev, linking each to the appropriate credential and enabling policies such as "require approval for write operations" and "mask fields matching PII patterns".
  • Update your CI pipeline to connect to the RAG service through hoop.dev using the standard client libraries (e.g., the HTTP client for the vector store or the OpenAI‑compatible API for the LLM). The pipeline presents its OIDC token, and hoop.dev enforces the policies before forwarding the request.
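The last step can be sketched from the pipeline's side. The example below assumes the gateway accepts the OIDC token as a standard `Authorization: Bearer` header and that the vector store is queried with a JSON body containing `query` and `top_k`; the URL, the payload shape, and the function name are hypothetical and should be replaced with whatever your deployment exposes.

```python
import json
import urllib.request


def build_query_request(gateway_url, oidc_token, query, top_k=5):
    """Build (but do not send) a POST request routed through the gateway."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        gateway_url,
        data=body,
        method="POST",
        headers={
            # The job presents its short-lived token, never a backend secret.
            "Authorization": f"Bearer {oidc_token}",
            "Content-Type": "application/json",
        },
    )
```

In a CI job, the token would typically come from the runner's environment at run time, and the request would be sent with `urllib.request.urlopen(req)`. The only address the job knows is the gateway's, so the backend credentials stay with the gateway.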

For a step‑by‑step walkthrough, see the getting‑started guide and the broader learn page for policy design tips.

FAQ

Why should I use a non-human identity for a RAG pipeline?

Non-human identities let you issue short‑lived, scoped tokens instead of long‑lived secrets. This reduces the blast radius of a compromised credential and gives you a clear audit trail from each token to the requests made with it.

How does hoop.dev enforce least‑privilege for a CI job?

hoop.dev validates the OIDC token, checks the job’s assigned groups, and only forwards requests that match the policy attached to that identity. If a job tries to access an unauthorized endpoint, hoop.dev blocks the request before it reaches the backend.

What audit evidence does hoop.dev provide for compliance?

hoop.dev records every session, captures the exact prompts sent to the LLM, masks any sensitive fields in the response, and stores a replayable log that can be queried by auditors. The logs are tied to the non-human identity that initiated the request, creating a complete chain of custody.

Explore the open‑source repository on GitHub to see the full configuration examples: https://github.com/hoophq/hoop.

Open source

Save the open-source gateway for agent data access

Hoop is MIT-licensed infrastructure for controlling how AI agents reach production data. Star hoophq/hoop so you can inspect it, deploy it, or share it when your team starts governing agent access.
