Embedding services can expose raw user data to external models, which makes in‑transit data governance an immediate, practical concern.
Why in-transit data governance matters for embeddings
Embeddings turn text, images, or code into high‑dimensional vectors. The input payload often contains personally identifiable information (PII), trade secrets, or regulated content. Once that payload leaves the originating system, the organization loses visibility into who saw it, when it was sent, and whether it was altered. In‑transit data governance is the set of policies and controls that protect data while it travels over the network, ensuring that sensitive fields are masked, that each request is authorized, and that an immutable audit trail is created.
Typical deployments send data directly from an application to a cloud‑hosted embedding endpoint. The connection uses a static API key or service account credential that is baked into code or configuration files. This approach has three glaring gaps:
- No real‑time masking: PII travels in clear text to the provider.
- No request‑level approval: Any developer or automated job can fire off unlimited queries.
- No audit of what was sent or received: Logs are often limited to generic HTTP status codes, making forensic analysis difficult.
What identity‑based authentication fixes, and what it still leaves open
Introducing a strict identity provider and short‑lived tokens solves the credential‑sprawl problem. Engineers now authenticate via OIDC, and the service only accepts tokens issued for a specific role. This step ensures that the request originates from a known identity, satisfying the “who” part of the equation.
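As a rough illustration of that "who" check, the sketch below validates the claims of a token whose signature is assumed to have been verified already by an OIDC library. The `exp` claim is the standard JWT expiry field; the `roles` claim name and the `is_request_authorized` helper are hypothetical, since identity providers expose role information under different claim names.

```python
import time
from typing import Optional


def is_request_authorized(claims: dict, required_role: str,
                          now: Optional[float] = None) -> bool:
    """Return True if verified token claims are unexpired and carry the role.

    Assumes the token signature was already verified upstream.
    """
    now = time.time() if now is None else now

    # "exp" is the standard JWT expiry claim (seconds since the epoch).
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return False

    # "roles" is a hypothetical claim name; adjust it to your identity provider.
    return required_role in claims.get("roles", [])
```

A check like this confirms identity and role only. It never looks at the payload, which is the gap the rest of this section addresses.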
However, the request still travels directly to the embedding service. The data path remains uncontrolled: there is no place to inspect the payload, strip sensitive fields, or require an approver before a high‑risk query is sent. In‑transit data governance is still missing because the enforcement point is absent.
hoop.dev as the data‑path enforcement layer
hoop.dev provides the missing layer by acting as an identity‑aware proxy that sits between the client and the embedding endpoint. The gateway intercepts every HTTP request, parses the JSON body, and can apply the following controls:
- Inline masking: hoop.dev removes or redacts PII from the request before it reaches the external model.
- Just‑in‑time approval: If a query contains a high‑risk pattern, the gateway routes it to a human approver instead of sending it automatically.
- Session recording: Each request and response is logged with the originating identity, creating a complete audit trail for compliance checks.
- Command blocking: The gateway can reject requests that exceed defined rate limits or contain disallowed content.
As long as all traffic to the embedding endpoint is routed through hoop.dev, these controls apply to every request. The underlying identity system (OIDC/SAML) still decides who can start a session, while hoop.dev enforces policy on the data path itself, which is where in‑transit data governance has to happen.
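To make the enforcement outcomes concrete, here is a minimal, generic sketch of the decision logic such a proxy might run on each request. It is not hoop.dev's implementation or configuration format: the `Gateway` and `Decision` classes, the regular expressions, and the action names are all hypothetical, and rate limiting is omitted for brevity.

```python
import re
from dataclasses import dataclass, field

# Illustrative patterns only; real PII detection needs far broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
HIGH_RISK = re.compile(r"\b(confidential|trade secret)\b", re.IGNORECASE)
DISALLOWED = re.compile(r"BEGIN (RSA )?PRIVATE KEY")


@dataclass
class Decision:
    action: str  # "forward", "hold_for_approval", or "block"
    text: str    # masked payload; empty when the request is blocked


@dataclass
class Gateway:
    audit_log: list = field(default_factory=list)

    def handle(self, identity: str, text: str) -> Decision:
        if DISALLOWED.search(text):
            # Blocking: disallowed content never leaves the gateway.
            decision = Decision("block", "")
        else:
            # Inline masking: redact PII before anything is forwarded.
            masked = EMAIL.sub("[EMAIL]", text)
            masked = US_SSN.sub("[SSN]", masked)
            # Just-in-time approval: high-risk payloads wait for a human.
            action = "hold_for_approval" if HIGH_RISK.search(masked) else "forward"
            decision = Decision(action, masked)

        # Recording: every decision is logged with the caller's identity,
        # and only the masked text is stored.
        self.audit_log.append(
            {"identity": identity, "action": decision.action, "text": decision.text}
        )
        return decision
```

For example, `Gateway().handle("alice", "Contact bob@example.com, SSN 123-45-6789")` returns a `forward` decision whose text is `Contact [EMAIL], SSN [SSN]`. Because the audit entry is written after masking, the log itself does not become a second copy of the sensitive data.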
