When a Retrieval‑Augmented Generation (RAG) pipeline leaks personal data, the organization faces heavy fines, loss of customer trust, and costly remediation. Under Brazil’s General Data Protection Law (LGPD), processing of personal data must rest on a legal basis, be recorded, and be protected against unauthorized or accidental exposure.
Most teams build RAG solutions by stitching together a large language model, a vector store, and a backend database. Engineers often use a single service account that has read‑write rights on the entire data lake. The application connects directly to the database, and the logs show only generic error messages. No one sees which user triggered a particular query, no field‑level masking is applied, and there is no workflow to approve a request that touches sensitive records. In practice, the system provides no audit trail, no real‑time data protection, and no way to demonstrate compliance if an auditor asks for evidence.
LGPD requires that personal data be accessed only by authorized identities, that the purpose of each access be documented, and that any disclosure be limited to the minimum necessary. Adding an identity provider or tightening IAM policies is a necessary first step, but it leaves the request path unchanged: the application still talks straight to the database, bypassing any inline checks, masking, or logging. Without a control point in the data path, the organization cannot prove who read which record, cannot mask identifiers on the fly, and cannot require a human approval before a high‑risk query runs.
How LGPD requirements map to RAG pipelines
To satisfy LGPD, a RAG deployment must provide three core capabilities:
- Just‑in‑time (JIT) access that grants the minimum privilege for the duration of a query.
- Inline masking of personal identifiers in query results, ensuring that downstream LLM prompts never contain raw PII.
- Immutable audit evidence that records who accessed what data, when, and under which approval.
These capabilities can only be guaranteed when the enforcement point sits between the identity layer and the target infrastructure. That is where a Layer 7 gateway becomes essential.
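To make the first and third capabilities concrete, here is a minimal Python sketch of a time-boxed grant check paired with a hash-chained audit log. It is an illustration under stated assumptions, not hoop.dev's implementation: the names `Grant`, `AuditLog`, and `authorize` are hypothetical, and a hash chain only makes tampering detectable. Real immutability would also need write-once storage.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field


@dataclass
class Grant:
    """Hypothetical time-boxed permission for one user on one table."""
    user: str
    table: str
    approved_by: str
    expires_at: float  # Unix timestamp after which the grant is void


def _digest(body: dict) -> str:
    # Canonical JSON so the same body always produces the same hash.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class AuditLog:
    """Append-only log; each entry's hash covers the previous entry's hash."""
    entries: list = field(default_factory=list)

    def append(self, user, table, approved_by, allowed, now):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"user": user, "table": table, "approved_by": approved_by,
                "allowed": allowed, "ts": now, "prev": prev_hash}
        self.entries.append({**body, "hash": _digest(body)})

    def verify(self) -> bool:
        """Return False if any entry was altered or reordered."""
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True


def authorize(grant, user, table, log, now=None):
    """Allow the query only while the grant is live; log every decision."""
    now = time.time() if now is None else now
    allowed = (grant.user == user and grant.table == table
               and now < grant.expires_at)
    log.append(user, table, grant.approved_by, allowed, now)
    return allowed
```

In this sketch denied attempts are logged alongside allowed ones, so the log shows who tried to read a table as well as who succeeded. Editing any stored entry makes `verify()` return `False`.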
Why hoop.dev is the only place enforcement can happen
hoop.dev is a Layer 7, identity‑aware proxy that sits in the data path for every RAG request. It receives the user’s OIDC token, validates the identity, and then proxies the connection to the underlying database, vector store, or HTTP service. Because the request travels through hoop.dev, the gateway applies the LGPD controls directly on the wire.
When a query reaches the gateway, hoop.dev can:
- Block the request until a designated approver signs off, providing a JIT approval workflow.
- Mask fields such as CPF, email, or phone number in the response before the data reaches the LLM, ensuring that the model never sees raw identifiers.
- Record the full session, including the original query, the masked response, and the identity of the requester, and store the log for replay during an audit.
- Enforce role‑based policies that limit which collections or tables a user may query, reducing the blast radius of a compromised credential.
All of these enforcement outcomes exist only because hoop.dev sits in the data path. If the gateway were removed, the same IAM setup would still allow the application to talk directly to the database, bypassing masking, approvals, and session recording.
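The masking step can be pictured with a short Python sketch that redacts CPF numbers, email addresses, and Brazilian phone numbers from result rows before they are placed in an LLM prompt. This is only an illustration: the function names and regular expressions are assumptions, not hoop.dev's detection logic, and the patterns are deliberately simplified, so they will miss some formats and may flag other digit runs.

```python
import re

# Simplified, illustrative patterns (assumptions, not a complete PII detector).
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
# CPF: 11 digits, optionally written as 000.000.000-00.
CPF_RE = re.compile(r"\b\d{3}\.?\d{3}\.?\d{3}-?\d{2}\b")
# Brazilian phone: optional +55, two-digit area code, 8 or 9 digit number.
PHONE_RE = re.compile(r"(?<!\d)(?:\+55\s?)?\(?\d{2}\)?\s?9?\d{4}-?\d{4}\b")


def mask_text(text: str) -> str:
    """Replace identifiers with placeholders.

    CPF runs before phone because an unformatted 11-digit CPF
    would otherwise also match the phone pattern.
    """
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = CPF_RE.sub("[CPF]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


def mask_rows(rows):
    """Mask every string value in a list of result rows.

    Non-string values pass through unchanged, so identifiers stored
    as integers would need to be converted to strings first.
    """
    return [
        {key: mask_text(value) if isinstance(value, str) else value
         for key, value in row.items()}
        for row in rows
    ]


def build_prompt(question: str, rows) -> str:
    """Hypothetical prompt builder that only ever sees masked rows."""
    context = "\n".join(str(row) for row in mask_rows(rows))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

The point of the sketch is placement rather than pattern quality: because masking runs before the prompt is assembled, the model receives placeholders such as `[CPF]` instead of the raw values.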
