An offboarded contractor’s nightly data‑processing job still runs, pulling large chunks of customer records from a warehouse. The script uses a static credential stored in a CI secret and writes the raw rows to a log file that no one monitors. When a regulator asks for proof that the organization respects the Brazilian General Data Protection Law (LGPD), the team can’t point to any record of who accessed which data, how it was filtered, or whether the extraction was approved.
LGPD requires that any personal data processing be documented, that data minimisation be enforced, and that individuals’ rights to access, correction and deletion be demonstrable. For chunking workloads, which are batch jobs that retrieve slices of a database or data lake, this translates into three concrete obligations:
- Every request that extracts personal data must be tied to a verified identity.
- Sensitive fields must be masked or redacted before they leave the controlled environment.
- A tamper‑evident audit trail must capture who asked for the chunk, when, what size, and whether an approval workflow was satisfied.
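One common way to make an audit trail tamper‑evident is to chain each record to the hash of the one before it, so that editing any earlier entry invalidates every later hash. The sketch below illustrates that idea only; the record fields and the names ChunkAuditRecord, append_record and verify_log are assumptions made for this example and do not describe any particular product's log format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List, Optional, Tuple

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


@dataclass(frozen=True)
class ChunkAuditRecord:
    """Hypothetical audit entry: who asked for the chunk, when, and how big."""
    requester: str              # identity taken from the verified token
    requested_at: str           # ISO 8601 timestamp, UTC
    table: str
    row_count: int
    approved_by: Optional[str]  # None when no approval was required


def record_hash(record: ChunkAuditRecord, prev_hash: str) -> str:
    """Hash the record together with the previous entry's hash."""
    payload = json.dumps(asdict(record), sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(log: List[Tuple[ChunkAuditRecord, str]],
                  record: ChunkAuditRecord) -> None:
    """Append a record, chaining it to the last entry in the log."""
    prev_hash = log[-1][1] if log else GENESIS_HASH
    log.append((record, record_hash(record, prev_hash)))


def verify_log(log: List[Tuple[ChunkAuditRecord, str]]) -> bool:
    """Recompute the chain; any edited record breaks verification."""
    prev_hash = GENESIS_HASH
    for record, stored_hash in log:
        if record_hash(record, prev_hash) != stored_hash:
            return False
        prev_hash = stored_hash
    return True
```

A hash chain only detects tampering. In practice the log would also need to be stored somewhere the job's own credentials cannot rewrite, otherwise the whole chain could simply be regenerated.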
Most teams build their pipelines with a handful of service accounts, grant those accounts broad read privileges, and let the job run unattended. The connection goes straight from the compute node to the database, bypassing any central policy engine. The result is a blind spot: the setup decides who can start the job, but it provides no enforcement on the data path, and there is no reliable evidence that LGPD‑required controls were applied.
The missing piece is a gateway that sits between the identity layer and the chunking target. The gateway must be the only place where request validation, masking, approval and logging occur. Without that data‑path enforcement, any audit the organization produces would be incomplete, and a regulator could easily deem the practice non‑compliant.
hoop.dev is a layer‑7 gateway that proxies connections to databases, storage services and other chunkable resources. It sits in the data path, intercepts each request, and applies policy before the traffic reaches the target. The system integrates with standard OIDC or SAML identity providers, so the same tokens that grant access to the CI system also authenticate to the gateway.
How LGPD defines evidence for chunking
LGPD treats personal data as any information relating to an identified or identifiable natural person. When a chunking job requests a subset of rows, the law expects the organization to prove:
- Identity of the requester – captured from the verified token.
- Purpose and scope of the extraction – recorded in an approval record.
- Data minimisation – enforced by masking or column‑level filters before data leaves the controlled zone.
- Retention of the audit log – immutable evidence that can be presented to auditors.
hoop.dev provides each of these elements directly in the data path. It reads the caller’s identity, checks the request against a policy that defines allowed tables, columns and row limits, and, if the request exceeds a pre‑defined threshold, routes it to a human approver. Once approved, hoop.dev masks any fields marked as personal data, then forwards the sanitized chunk to the downstream job.
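The decision flow described above (check tables, columns and row limits, escalate large requests to an approver, then mask personal fields) can be sketched in a few lines. This is an illustrative model only, not hoop.dev's configuration format or API; ChunkPolicy, evaluate, mask_rows and the "***" placeholder are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Set


@dataclass
class ChunkPolicy:
    """Hypothetical policy: what may be read and when a human must approve."""
    allowed_columns: Dict[str, Set[str]]  # table name -> readable columns
    max_rows: int                         # hard row limit per request
    approval_threshold: int               # above this, require approval
    masked_columns: Set[str]              # fields treated as personal data


def evaluate(policy: ChunkPolicy, table: str,
             columns: List[str], row_count: int) -> str:
    """Return 'deny', 'needs_approval' or 'allow' for a chunk request."""
    allowed = policy.allowed_columns.get(table)
    if allowed is None or not set(columns) <= allowed:
        return "deny"  # unknown table or a column outside the policy
    if row_count > policy.max_rows:
        return "deny"  # exceeds the hard row limit
    if row_count > policy.approval_threshold:
        return "needs_approval"  # large extraction: route to an approver
    return "allow"


def mask_rows(rows: List[Dict[str, Any]],
              masked_columns: Set[str]) -> List[Dict[str, Any]]:
    """Replace values of masked columns before the chunk leaves the gateway."""
    return [
        {col: ("***" if col in masked_columns else value)
         for col, value in row.items()}
        for row in rows
    ]
```

Keeping the decision and the masking as separate steps means an approved request still passes through masking, so approval widens how much data can be read without exposing raw personal fields.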
