An offboarded data-science contractor still has a personal access token that can call the company's embedding service. The company never revoked the token, so the contractor can continue to submit raw user text and receive vector representations. When a data subject later asks for the removal of their personal information under Brazil's General Data Protection Law (LGPD), the organization struggles to prove whether the contractor's calls ever touched that data.
LGPD expects controllers to keep a clear record of every processing activity that involves personal data. The law requires a demonstrable legal basis such as consent, purpose limitation, data minimization, and the ability to audit who accessed what and when. For machine-learning pipelines that generate embeddings, the challenge is twofold: the raw text often contains identifiers, and analysts can trace the resulting vectors back to the source if proper controls are missing. Without a reliable audit trail, an organization cannot answer regulator questions about data lineage or prove that it honored deletion requests.
In practice, teams rely on ad-hoc logging, manual ticketing, or custom scripts that write to separate stores. Those approaches leave gaps. Logs are rotated out before a regulator asks for them, masking is applied inconsistently, and approvals are handled outside the data path, so a privileged user can still bypass controls. The result is a compliance posture that looks good on paper but collapses under scrutiny.
Why the data path must enforce LGPD controls
LGPD compliance is not achieved by identity checks alone. The component that actually moves the data, the gateway that proxies the request, has to enforce masking, capture approvals, and record each session. When the enforcement point sits inside the application or the client, a malicious insider could alter or delete logs before they persist. hoop.dev is a dedicated layer that sits between the caller and the embedding service, so that every request and response is observed, sensitive fields are redacted in real time, and a tamper-evident record is stored.
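To make "tamper-evident" concrete, here is a minimal Python sketch of one common technique: a hash-chained log in which every entry commits to the hash of the entry before it. This illustrates the general idea and is not a description of how hoop.dev stores its records; the names `AuditLog`, `append`, `verify`, and `GENESIS` are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # stand-in "previous hash" for the very first entry


def _digest(prev_hash: str, record: dict) -> str:
    # Canonical JSON (sorted keys) so the same record always hashes identically.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def append(self, record: dict) -> str:
        """Add a record and chain it to the previous entry's hash."""
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        entry_hash = _digest(prev_hash, record)
        self.entries.append(
            {"record": record, "prev": prev_hash, "hash": entry_hash}
        )
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(prev_hash, entry["record"]):
                return False
            prev_hash = entry["hash"]
        return True
```

A chain like this detects edits and reordering, but not the removal of the most recent entries. Detecting truncation would also require anchoring the latest hash somewhere the log writer cannot change.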
How hoop.dev provides the required evidence
hoop.dev acts as a layer‑7 gateway for the embedding endpoint. It authenticates callers via OIDC, then inspects each request before it reaches the model. The gateway can:
- Record the full request and response payload, preserving the raw text and the generated vector for later replay.
- Apply inline masking to any personal identifiers found in the input before the model sees them, satisfying data‑minimization.
- Require a just‑in‑time approval workflow for queries that match high‑risk patterns, ensuring purpose limitation.
- Store an immutable audit log that you can export to meet LGPD's evidence-generation requirement.
Because hoop.dev sits in the data path, teams cannot bypass any of these controls by changing client code or by altering the model container. The gateway remains the single source of truth for who accessed which embedding and when.
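The masking and approval steps listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not hoop.dev's configuration or API: the two regular expressions, the rule that any CPF number counts as a high-risk pattern, and the function names `mask_text` and `handle_request` are all hypothetical.

```python
import re

# Hypothetical detectors: an email address and a formatted Brazilian CPF
# number (000.000.000-00). A real deployment would need broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CPF = re.compile(r"\b\d{3}\.\d{3}\.\d{3}-\d{2}\b")


def mask_text(text: str):
    """Replace detected identifiers with placeholders and count the hits."""
    counts = {}
    for label, pattern in (("EMAIL", EMAIL), ("CPF", CPF)):
        text, hits = pattern.subn(f"[{label}]", text)
        counts[label] = hits
    return text, counts


def handle_request(caller: str, text: str, approved: bool = False) -> dict:
    """Decide what, if anything, is forwarded to the embedding model."""
    masked, counts = mask_text(text)
    # Assumed policy: input containing a CPF needs explicit approval first.
    if counts["CPF"] > 0 and not approved:
        return {
            "caller": caller,
            "status": "pending_approval",
            "forwarded": None,
            "masked_counts": counts,
        }
    return {
        "caller": caller,
        "status": "forwarded",
        "forwarded": masked,  # the model only ever receives masked text
        "masked_counts": counts,
    }
```

For example, `handle_request("analyst", "CPF 123.456.789-09")` returns a decision with status `pending_approval` and forwards nothing, while the same call with `approved=True` forwards `"CPF [CPF]"`. The returned dictionary is the kind of per-request decision record an audit log would capture.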
