LGPD for Headless Browsers

An offboarded contractor left a CI pipeline that spins up a headless Chrome instance to scrape user profiles for a marketing dashboard. The pipeline still runs nightly, pulling names, emails, and phone numbers while the team lacks visibility into who triggered the job or what data was extracted. When a data‑subject request arrives, the team cannot prove whether anyone accessed the information, let alone delete it from logs.

Headless browsers are powerful automation tools, but they behave like any other client that can read and write personal data. Under Brazil's General Data Protection Law (Lei Geral de Proteção de Dados, LGPD), every processing activity must be documented, purpose‑limited, and protected against unauthorized exposure. In practice, meeting those obligations means maintaining detailed audit trails, applying technical safeguards such as masking, and enforcing strict access controls that grant permission only for a defined purpose and time window.

In a typical setup, a developer writes a script that launches Chrome in headless mode, authenticates to a web application, and extracts fields that contain personal identifiers. The developer runs the script with a static service‑account token that has broad read permissions. Because the script contacts the target site directly, no gate intercepts the traffic. The team does not record the execution, does not mask the data, and any accidental over‑collection goes unnoticed until an audit or regulator raises a question.

What LGPD expects from automated data collection

LGPD sets out several technical and organizational measures that organizations must apply when they process personal data:

  • Accountability: organizations must demonstrate how data is accessed, by whom, and for what purpose.
  • Purpose limitation and data minimization: organizations may collect only the data necessary for the declared purpose.
  • Security of processing: organizations must protect personal data with encryption, masking, or other techniques that prevent unauthorized disclosure.
  • Auditability: organizations must capture who performed each operation, the exact request parameters, and the response payloads that contain personal information.
  • Right to erasure and rectification: when a data‑subject request arrives, organizations must locate and delete or correct the relevant records.

A headless browser alone cannot meet these requirements, because the browser itself provides no built‑in governance. Teams need a control plane that sits between the automation script and the target service and enforces policies at the protocol level.

How hoop.dev helps meet LGPD requirements

hoop.dev acts as a layer‑7 gateway that intercepts every request from a headless browser before it reaches the target web application. The gateway becomes the only place where enforcement can happen, turning the data path into a policy enforcement point.

Setup determines who may request access. Identity providers such as Okta or Azure AD issue OIDC tokens that identify the automation service account. The token tells hoop.dev which principal is requesting a session, but the token itself does not grant any permissions.

The data path is the gateway itself. Teams route all HTTP traffic from the headless browser through hoop.dev. Because the gateway parses each request and response, it can apply LGPD‑specific controls in real time.
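As a minimal sketch of that routing step, the snippet below builds a headless Chrome command line that sends its traffic through a proxy. The `--headless=new`, `--proxy-server`, and `--dump-dom` flags are standard Chrome switches; the gateway address, the binary name, and the assumption that the gateway is reachable as an HTTP proxy at that address are placeholders for illustration, not documented hoop.dev values.

```python
import shlex

# Hypothetical gateway address: replace with the host and port of your own deployment.
GATEWAY_PROXY = "http://gateway.internal.example:8080"


def chrome_args(target_url: str, proxy: str = GATEWAY_PROXY) -> list[str]:
    """Build a headless Chrome command line that routes traffic via the proxy."""
    return [
        "google-chrome",            # binary name varies by platform
        "--headless=new",           # run without a visible window
        f"--proxy-server={proxy}",  # send HTTP(S) traffic through the gateway
        "--dump-dom",               # print the rendered DOM to stdout
        target_url,
    ]


def chrome_command(target_url: str, proxy: str = GATEWAY_PROXY) -> str:
    """Return the same command as a shell-quoted string, e.g. for a CI step."""
    return shlex.join(chrome_args(target_url, proxy))
```

Passing the argument list to `subprocess.run` would launch the browser. Keeping the proxy setting in one helper makes it harder for an individual job to skip the gateway by accident.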

These enforcement outcomes are possible only because hoop.dev sits in the data path. The gateway can:

  • Record every request and response, including timestamps, URLs, and payloads that contain personal identifiers. Teams use these logs as the audit evidence required by LGPD.
  • Mask sensitive fields such as email, CPF, or phone number in responses before they reach the automation script, ensuring data minimization.
  • Require just‑in‑time approval for requests that match high‑risk patterns, for example when the script requests a URL under /users/* or any other endpoint that returns full user profiles.
  • Block commands or API calls that attempt to export bulk personal data, preventing accidental over‑collection.
  • Replay recorded sessions so engineers can verify that the automation behaved as intended and locate the exact moment personal data was accessed.
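The masking step in the list above can be illustrated with a small, self-contained sketch. It is not hoop.dev's implementation: the regular expressions are simplified assumptions (the phone pattern only covers Brazilian numbers written with a +55 prefix, and the CPF pattern does not validate check digits), and the placeholder labels are arbitrary.

```python
import re

# Simplified, illustrative patterns; production rules would need to be stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+55\s?\(?\d{2}\)?\s?\d{4,5}-?\d{4}")
CPF_RE = re.compile(r"\b\d{3}\.?\d{3}\.?\d{3}-?\d{2}\b")


def mask_personal_data(text: str) -> str:
    """Replace emails, +55 phone numbers, and CPF-shaped numbers with labels."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    # Phones are masked before CPFs so a long digit run is not split between rules.
    text = PHONE_RE.sub("[PHONE]", text)
    text = CPF_RE.sub("[CPF]", text)
    return text
```

For example, a response body such as `{"email": "ana@example.com", "cpf": "123.456.789-09"}` would reach the automation script as `{"email": "[EMAIL]", "cpf": "[CPF]"}`, so the script never holds the raw identifiers.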

Because hoop.dev never exposes the underlying credentials to the headless browser, the automation script cannot bypass the controls. The gateway also isolates the service account from direct network access, reducing the blast radius of a compromised script.

Implementing LGPD‑compliant headless browsing with hoop.dev

Deploy the gateway using the official getting‑started guide. The deployment runs a network‑resident agent close to the target web service, ensuring low latency for HTTP traffic.

Then, register the headless browser as a connection in hoop.dev. Define a policy that masks fields matching common personal data patterns (email, CPF, phone) and that flags any request to URLs under /users/* for manual approval. Teams write the policy in the configuration file; the learn section explains the syntax.
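To make the approval rule concrete, here is a rough Python sketch of how a URL could be matched against the /users/* pattern. It does not reproduce hoop.dev's policy syntax, which lives in the product's own configuration file; the `APPROVAL_PATTERNS` list and the `requires_approval` helper are hypothetical names used only for illustration.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical rule set: paths that should pause for a human approver.
APPROVAL_PATTERNS = ["/users/*"]


def requires_approval(url: str, patterns=APPROVAL_PATTERNS) -> bool:
    """Return True when the URL's path matches any approval pattern."""
    path = urlsplit(url).path  # ignore scheme, host, and query string
    return any(fnmatchcase(path, pattern) for pattern in patterns)
```

Note that with shell-style matching `*` also spans slashes, so `/users/42/orders` matches the rule, while the bare collection path `/users` does not. Whether that is the desired behavior is a policy decision worth making explicitly.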

When the automation job starts, it obtains an OIDC token from the corporate IdP and presents the token to hoop.dev. The gateway validates the token, checks the policy, and either allows the request, masks the response, or pauses for an approver. The gateway records every interaction, creating an audit trail that teams can export for LGPD audits.
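The flow in the paragraph above can be sketched as a small decision function that also appends an audit record for every request. This is a simplified model under stated assumptions, not the gateway's actual logic: the token's signature is assumed to have been verified already (only the standard `sub` and `exp` claims are read), and the `/profiles/*` masking rule, the `Decision` values, and the record fields are hypothetical.

```python
import json
import time
from dataclasses import asdict, dataclass
from enum import Enum
from fnmatch import fnmatchcase


class Decision(str, Enum):
    ALLOW = "allow"
    MASK = "mask"
    PENDING_APPROVAL = "pending_approval"
    DENY = "deny"


@dataclass
class AuditRecord:
    timestamp: float
    principal: str
    path: str
    decision: str


APPROVAL_PATHS = ("/users/*",)   # pause for an approver
MASKED_PATHS = ("/profiles/*",)  # hypothetical: responses get masked


def decide(claims: dict, path: str, now: float) -> Decision:
    """Pick an outcome for one request; claims come from a verified token."""
    if claims.get("exp", 0) <= now:
        return Decision.DENY  # expired or missing expiry
    if any(fnmatchcase(path, p) for p in APPROVAL_PATHS):
        return Decision.PENDING_APPROVAL
    if any(fnmatchcase(path, p) for p in MASKED_PATHS):
        return Decision.MASK
    return Decision.ALLOW


def handle(claims: dict, path: str, audit_log: list, now=None) -> Decision:
    """Decide, then record the interaction as one JSON line in the audit log."""
    now = time.time() if now is None else now
    decision = decide(claims, path, now)
    record = AuditRecord(now, claims.get("sub", "unknown"), path, decision.value)
    audit_log.append(json.dumps(asdict(record)))
    return decision
```

Recording happens for every outcome, including denials, so the trail shows attempted access as well as successful access.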

If a data‑subject request arrives, the compliance team searches the recorded sessions for the specific user identifier, retrieves the masked or original payload, and confirms whether the data was processed. When the team discovers original data, it issues a deletion command through the gateway, which purges the record from the audit store according to the retention policy.
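The search step might look like the sketch below, assuming the recorded sessions have been exported as JSON lines. The export format and the `session_id`, `url`, and `response_body` field names are assumptions made for this example, not a documented hoop.dev schema.

```python
import json


def find_sessions(records, identifier: str) -> list:
    """Return exported audit records whose URL or payload mentions the identifier.

    `records` is an iterable of JSON strings, one per recorded interaction.
    """
    needle = identifier.lower()
    hits = []
    for line in records:
        record = json.loads(line)
        # Search both the URL and the stored payload, case-insensitively.
        haystack = " ".join(
            str(record.get(key, "")) for key in ("url", "response_body")
        ).lower()
        if needle in haystack:
            hits.append(record)
    return hits
```

If payloads were masked before being stored, an email address or CPF will not appear in `response_body`, so searching by a stable identifier that appears in the URL, such as an internal user ID, is likely to be more reliable.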

Frequently asked questions

Does hoop.dev replace the need for encryption at rest?

No. hoop.dev focuses on in‑flight governance and audit. Teams must still encrypt stored logs and any persisted data using their chosen storage solution.

Can I use hoop.dev with any headless browser?

Yes. The gateway works at the HTTP protocol level, so teams can configure any browser that supports a proxy to route traffic through hoop.dev.

How does hoop.dev help with the right‑to‑erasure requirement?

Because the gateway records every session, teams can locate the exact request that extracted personal data and issue a targeted delete operation. The audit log also proves to regulators that the deletion occurred.

Get started today

Explore the open‑source repository on GitHub to see how the gateway is built and to contribute improvements: https://github.com/hoophq/hoop.
