A Guide to RBAC in Embeddings

When every embedding request comes from a user who is explicitly authorized through RBAC, the risk of data leakage and model misuse drops sharply.

In many organizations the first version of an embedding service is a single endpoint protected by a static API key or a shared service account. Engineers call the model from notebooks, CI pipelines, and micro‑services using that same credential. The key is checked into repositories, duplicated across environments, and rarely rotated. No individual identity is attached to a request, and the platform does not record who asked for which vector.

This approach creates three concrete problems. First, any compromised secret grants unrestricted access to the model, allowing an attacker to exfiltrate proprietary data or generate malicious content. Second, because the request carries no user context, auditors cannot determine who produced a particular embedding, so compliance cannot be demonstrated. Third, the lack of granularity prevents teams from limiting high‑risk prompts to a small set of experts.

Why RBAC matters for embedding services

Role‑based access control (RBAC) solves the identity problem by assigning permissions to roles rather than to individual users. For an embedding service, that means defining roles such as data scientist, model ops, or audit viewer and mapping each role to a set of allowed operations – for example, read‑only access to public vectors versus write‑only access to private customer data.
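
The role-to-operation mapping described above can be sketched in a few lines of Python. This is only an illustration: the role names come from the paragraph, but the operation names, the `ROLE_PERMISSIONS` table, and the `is_allowed` helper are hypothetical and do not reflect hoop.dev's policy format.

```python
from enum import Enum


class Operation(Enum):
    # Hypothetical operation names, chosen for illustration only.
    READ_PUBLIC_VECTORS = "read_public_vectors"
    WRITE_PRIVATE_VECTORS = "write_private_vectors"
    VIEW_AUDIT_LOG = "view_audit_log"


# Permissions are attached to roles, never to individual users.
ROLE_PERMISSIONS = {
    "data_scientist": frozenset({Operation.READ_PUBLIC_VECTORS}),
    "model_ops": frozenset(
        {Operation.READ_PUBLIC_VECTORS, Operation.WRITE_PRIVATE_VECTORS}
    ),
    "audit_viewer": frozenset({Operation.VIEW_AUDIT_LOG}),
}


def is_allowed(roles, operation):
    """Return True if any of the user's roles grants the operation.

    Unknown roles grant nothing, so the check is deny-by-default.
    """
    return any(
        operation in ROLE_PERMISSIONS.get(role, frozenset()) for role in roles
    )
```

A user holding only the data scientist role would pass `is_allowed(["data_scientist"], Operation.READ_PUBLIC_VECTORS)` and fail the same check for `Operation.WRITE_PRIVATE_VECTORS`.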

Applying RBAC alone, however, does not close the remaining gaps. Even when a policy engine checks a user’s role before forwarding a request, the call still travels directly to the model server. The gateway that sits between the user and the model is missing, so the system still lacks request‑level audit, real‑time masking of sensitive inputs, and just‑in‑time approval for risky prompts.

hoop.dev as the data‑path enforcement point

hoop.dev provides the missing layer. It acts as an identity‑aware proxy that sits in the data path between the caller and the embedding endpoint. When a user presents an OIDC token, hoop.dev validates the token, extracts group membership, and translates that membership into RBAC decisions. The gateway then either allows the request, blocks it, or routes it for manual approval based on the defined role policies.
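
As a rough sketch of that decision step, the snippet below turns group membership into one of the three outcomes named above. It assumes the token has already been validated and decoded into a `claims` dictionary, and that groups arrive in a `groups` claim; the claim name, group names, operation names, and the "most permissive group wins" rule are all assumptions made for illustration, not hoop.dev behavior.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REQUIRE_APPROVAL = "require_approval"


# Hypothetical policy table: group -> operation -> decision.
GROUP_POLICIES = {
    "embedding-readers": {"embed_public": Decision.ALLOW},
    "embedding-experts": {
        "embed_public": Decision.ALLOW,
        "embed_customer_data": Decision.REQUIRE_APPROVAL,
    },
}

# Ordering used to combine decisions when a user is in several groups.
_RANK = {Decision.BLOCK: 0, Decision.REQUIRE_APPROVAL: 1, Decision.ALLOW: 2}


def decide(claims, operation):
    """Map validated token claims to a decision for one operation.

    Anything not explicitly granted is blocked (deny-by-default).
    """
    best = Decision.BLOCK
    for group in claims.get("groups", []):
        decision = GROUP_POLICIES.get(group, {}).get(operation, Decision.BLOCK)
        if _RANK[decision] > _RANK[best]:
            best = decision
    return best
```

With this table, a member of `embedding-experts` asking for `embed_customer_data` is routed to approval, while a member of `embedding-readers` making the same request is blocked.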

Because hoop.dev is the only component that sees the traffic, it can enforce several outcomes that are impossible with a bare API key. hoop.dev records each embedding request, preserving a replayable audit trail that ties every vector to a user and a timestamp. It can mask sensitive fields in the request or response, ensuring that personally identifiable information never leaves the controlled environment. For high‑risk prompts, hoop.dev can require just‑in‑time approval from a designated reviewer before the model processes the input. Finally, it blocks disallowed operations such as attempts to extract model weights or to generate content that violates policy.
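
To make the masking and audit ideas concrete, here is a minimal sketch of what a proxy in the data path could do before forwarding an input. It is not hoop.dev's implementation: the email-only pattern, the `[MASKED_EMAIL]` placeholder, and the record fields are assumptions, and a real deployment would need far broader PII detection.

```python
import hashlib
import re
from datetime import datetime, timezone

# Deliberately simple email pattern; illustrative, not exhaustive.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_text(text):
    """Replace email addresses with a fixed placeholder."""
    return EMAIL_RE.sub("[MASKED_EMAIL]", text)


def audit_record(user, text, now=None):
    """Build an audit entry tying a masked input to a user and a timestamp.

    Only the masked text and its hash are stored, so the record itself
    does not contain the original sensitive value.
    """
    masked = mask_text(text)
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return {
        "user": user,
        "timestamp": timestamp,
        "masked_input": masked,
        "input_sha256": hashlib.sha256(masked.encode("utf-8")).hexdigest(),
    }
```

Because masking happens before the record is built, the same masked string can be forwarded to the model and written to the log, which keeps the two views consistent.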

Setting up the enforcement chain

The setup begins with an identity provider that issues OIDC tokens – any provider that supports standard OIDC flows works. Administrators create groups that correspond to the RBAC roles required for the embedding service. hoop.dev is then configured with a connection to the target model endpoint; the gateway holds the service credentials, so callers never see them. Once the connection is registered, policies map groups to allowed operations, and the gateway starts mediating traffic.
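
The credential-isolation point can be illustrated with a small sketch of what a gateway might do to request headers before calling the model endpoint. The function name and the use of a bearer token upstream are assumptions for the example; the source does not describe how hoop.dev stores or injects credentials.

```python
def build_upstream_headers(caller_headers, service_credential):
    """Swap the caller's identity token for the gateway-held credential.

    The caller's Authorization header (matched case-insensitively) is
    dropped, so the user token never reaches the model endpoint and the
    service credential never reaches the user.
    """
    headers = {
        name: value
        for name, value in caller_headers.items()
        if name.lower() != "authorization"
    }
    # Assumption: the upstream endpoint accepts a bearer credential.
    headers["Authorization"] = f"Bearer {service_credential}"
    return headers
```

For example, passing `{"authorization": "Bearer user-token", "Content-Type": "application/json"}` returns headers that keep `Content-Type` and carry only the service credential in `Authorization`.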

From a security standpoint, this architecture supports the principle of least privilege. Users receive only the permissions they need, and every action is recorded in a central log managed by hoop.dev.

Benefits for compliance and operations

Granular RBAC enforcement at the protocol layer.
Audit records that support internal reviews and external audits.
Inline masking of sensitive inputs and outputs.
Just‑in‑time approval workflow for high‑risk embeddings.
Credential management isolated from end users.

Teams that adopt this pattern can demonstrate that they control who accesses embedding models, how they access them, and what data flows in and out. The evidence generated by hoop.dev can be supplied to auditors without exposing raw secrets.

Getting started

To explore this architecture, start with the getting‑started guide that walks you through deploying the gateway and registering an embedding endpoint. The learn section contains deeper discussions of RBAC policy design and audit‑log retrieval.

For a full view of the open‑source implementation, visit the repository on GitHub:

https://github.com/hoophq/hoop

FAQ

Do I need to change my existing embedding client code?
No. hoop.dev accepts standard client connections (HTTP, gRPC, etc.). You point your client at the gateway address instead of the raw model endpoint.

Can hoop.dev enforce RBAC for third‑party hosted models?
Yes. As long as the model is reachable from the network where the gateway runs, hoop.dev can proxy the traffic and apply the same policies.

What happens if the gateway is unavailable?
Requests are blocked until the gateway recovers, preventing any direct bypass of the enforcement layer.