How can you prove that service accounts and automated jobs accessing BigQuery respect GDPR’s data‑subject rights and accountability requirements?
Most organizations treat non‑human identities like any other credential: a long‑lived key is stored in a vault, a CI pipeline injects it into a container, and the job runs with unrestricted read access to the data lake. The key is never rotated, the permissions are granted at the project level, and no one reviews the queries once the job is running. When a data‑subject request arrives, the team can point to a list of keys, but there is no record of which job queried which rows, whether any personal data was filtered, or who approved the access. The result is a compliance blind spot that auditors are quick to flag.
Why traditional setups fail to meet GDPR expectations
GDPR's accountability principle (Article 5(2)) requires the controller to demonstrate compliance, which in practice means being able to trace each personal‑data operation to a specific identity, purpose, and legal basis. With static service accounts, the identity recorded for a query is the service account itself, not the engineer or automation that triggered it. The request travels directly to BigQuery, bypassing any enforcement point that could log the exact SQL statement, mask columns containing personal data, or require a data‑privacy officer’s sign‑off before execution. In other words, the precondition for access is met (the service account exists and is allowed to connect), but the controls that make the access accountable (audit, masking, and just‑in‑time approval) are missing.
hoop.dev as the data‑path gateway for machine access
hoop.dev sits on the network path between the caller and BigQuery, and validates the identity issued by your identity provider. Because all traffic flows through the gateway before it reaches BigQuery, the gateway is the one place where enforcement can happen.
Setup: identity and least‑privilege provisioning
Non‑human identities are defined in your IdP (Okta, Azure AD, Google Workspace, etc.) and issued short‑lived OIDC tokens. hoop.dev validates those tokens, extracts group membership, and maps the request to a narrowly scoped service‑account credential that can only query the specific dataset required for the job. This step decides who the request is and whether it may start, but it does not enforce any GDPR‑specific rule on its own.
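The mapping step described above can be sketched in a few lines of Python. This is an illustration only, not hoop.dev's implementation: the group names, the dataset names, and the `resolve_access` helper are hypothetical, and the claims are assumed to come from a token whose signature has already been verified.

```python
import time
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from an IdP group to the single dataset a job may query.
GROUP_TO_DATASET = {
    "etl-billing": "billing_reports",
    "ml-training": "features_anonymized",
}


@dataclass(frozen=True)
class ScopedAccess:
    subject: str   # the OIDC "sub" claim of the non-human identity
    dataset: str   # the only dataset this request may touch


def resolve_access(claims: dict, now: Optional[float] = None) -> ScopedAccess:
    """Map already-verified OIDC claims to a narrowly scoped dataset."""
    now = time.time() if now is None else now
    # Short-lived tokens: reject anything past its expiry.
    if claims.get("exp", 0) <= now:
        raise PermissionError("token expired")
    # Least privilege: exactly one mapped group, otherwise refuse to start.
    matches = [GROUP_TO_DATASET[g] for g in claims.get("groups", [])
               if g in GROUP_TO_DATASET]
    if len(matches) != 1:
        raise PermissionError("expected exactly one mapped group")
    return ScopedAccess(subject=claims["sub"], dataset=matches[0])
```

Refusing a token that maps to zero or to several datasets keeps the scope unambiguous, which also makes the later audit record easier to interpret.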
Data path: the gateway enforces policy
hoop.dev receives the validated request, inspects the BigQuery wire protocol, and applies the policy engine before the query reaches the database. Because the gateway is the only point where the request can be examined, it can enforce masking of personal columns, block disallowed functions, and route high‑risk queries to a human approver.
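To make the three outcomes (mask, block, route to an approver) concrete, here is a minimal Python sketch of a policy check. It is a toy under stated assumptions: the column and function lists are hypothetical, and it scans identifiers with a regular expression, whereas a real gateway would need to parse the SQL properly.

```python
import re
from dataclasses import dataclass, field

# Hypothetical policy values, for illustration only.
PERSONAL_COLUMNS = {"email", "phone", "date_of_birth"}
BLOCKED_FUNCTIONS = {"EXTERNAL_QUERY"}


@dataclass
class Decision:
    action: str                                   # "allow", "block", "needs_approval"
    reason: str = ""
    masked_columns: set = field(default_factory=set)


def evaluate(sql: str) -> Decision:
    """Decide what to do with a query before it is forwarded."""
    identifiers = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", sql))
    blocked = BLOCKED_FUNCTIONS & {i.upper() for i in identifiers}
    if blocked:
        return Decision("block", f"disallowed function: {sorted(blocked)[0]}")
    # SELECT * may return personal columns without naming them: escalate.
    if re.search(r"select\s+\*", sql, re.IGNORECASE):
        return Decision("needs_approval", "unbounded column selection",
                        set(PERSONAL_COLUMNS))
    touched = PERSONAL_COLUMNS & {i.lower() for i in identifiers}
    return Decision("allow", masked_columns=touched)


def mask_row(row: dict, masked_columns: set) -> dict:
    """Replace the values of masked columns in one result row."""
    return {k: ("***" if k.lower() in masked_columns else v)
            for k, v in row.items()}
```

For example, `evaluate("SELECT email, total FROM billing_reports.invoices")` allows the query but marks `email` for masking, and `mask_row` then replaces that value in each returned row.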
Enforcement outcomes generated by hoop.dev
- hoop.dev records each session, capturing the full SQL statement, the identity that initiated it, and the exact response rows.
- hoop.dev masks sensitive fields in real time, ensuring that downstream logs never contain raw personal data.
- hoop.dev can be configured to request just‑in‑time approval for high‑risk queries, and it records the approval decision.
- hoop.dev records the audit trail and makes it available for export to regulators.
- hoop.dev never exposes the underlying service‑account credential to the caller, so the agent cannot leak the key.
These outcomes exist only because hoop.dev sits in the data path; removing it would revert the system to the insecure baseline described earlier.
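As a rough picture of what one such record might contain, the sketch below builds an audit entry and serializes it as a JSON line. The field names and the `make_record` helper are assumptions for illustration; hoop.dev's actual record format may differ.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditRecord:
    timestamp: str              # ISO 8601, UTC
    subject: str                # identity that initiated the session
    statement: str              # the SQL statement that was submitted
    decision: str               # "allow", "block", or "needs_approval"
    approver: Optional[str] = None   # set only when a human approved


def make_record(subject: str, statement: str, decision: str,
                approver: Optional[str] = None,
                now: Optional[datetime] = None) -> AuditRecord:
    """Build one audit entry; `now` can be injected for testing."""
    now = now or datetime.now(timezone.utc)
    return AuditRecord(now.isoformat(), subject, statement, decision, approver)


def to_json_line(record: AuditRecord) -> str:
    """Serialize with sorted keys so exported lines are stable and diffable."""
    return json.dumps(asdict(record), sort_keys=True)
```

Note that the record holds the caller's identity and the decision but never the underlying service‑account credential.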
The audit package you hand to regulators
When an audit request arrives, you can provide a concise evidence bundle generated by hoop.dev:
- A chronological log of every machine‑initiated query, including the OIDC token subject, timestamp, and purpose tag.
- Redacted query results that show personal data was masked according to your data‑classification policy.
- Approval records for any query that required a data‑privacy officer’s sign‑off, complete with reviewer identity and decision timestamp.
- Replay files that allow auditors to reconstruct the exact session and verify that no unauthorized data was extracted.
Because the gateway enforces policy at runtime, the evidence reflects what actually happened: it shows that each request was evaluated against your configured controls before any data left BigQuery.
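If you post‑process exported records yourself, assembling the bundle for an audit window is a small filtering job. The sketch below assumes each record is a dictionary with an ISO 8601 `timestamp` (with UTC offset), a `decision`, and an optional `approver`; these field names are hypothetical and not a documented export schema.

```python
from datetime import datetime


def build_bundle(records: list, start: str, end: str) -> dict:
    """Select records in [start, end) and group them for an auditor."""
    lo, hi = datetime.fromisoformat(start), datetime.fromisoformat(end)
    in_window = [r for r in records
                 if lo <= datetime.fromisoformat(r["timestamp"]) < hi]
    # Chronological order, as the evidence list above calls for.
    in_window.sort(key=lambda r: datetime.fromisoformat(r["timestamp"]))
    return {
        "queries": in_window,
        "approvals": [r for r in in_window if r.get("approver")],
        "blocked": [r for r in in_window if r.get("decision") == "block"],
    }
```

Using a half‑open window avoids counting a record twice when consecutive audit periods share a boundary.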
Getting started
Follow the getting‑started guide to deploy the gateway, register your BigQuery target, and configure OIDC authentication for your service accounts. The learn section explains how to define masking rules, set up just‑in‑time approvals, and export audit logs for compliance reporting.
Frequently asked questions
Do I need to change my existing CI pipelines?
No. The pipelines continue to invoke the standard bq client; the only change is that the client now points at the hoop.dev endpoint instead of BigQuery directly. All policy enforcement happens transparently in the gateway.
Can I retain raw query logs for internal debugging?
Yes. hoop.dev stores the full, unmasked query in a secure log that you can query with appropriate internal permissions. When you export data for GDPR evidence, the system automatically applies the same masking rules used at runtime.
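One way to picture export‑time masking is a pattern‑based pass over the stored statement. The sketch below is an assumption about how such a rule could work, not hoop.dev's mechanism: it redacts anything that looks like an email address, and the `export_record` helper and `statement` field are hypothetical.

```python
import re

# Assumed pattern-based rule: redact anything that looks like an email address.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_statement(sql: str, placeholder: str = "[REDACTED]") -> str:
    """Replace email-like literals in a stored SQL statement."""
    return EMAIL_PATTERN.sub(placeholder, sql)


def export_record(raw_record: dict) -> dict:
    """Return a masked copy for export; the raw record is left untouched."""
    exported = dict(raw_record)
    exported["statement"] = redact_statement(raw_record["statement"])
    return exported
```

Working on a copy keeps the internal debugging log intact while ensuring the exported evidence carries no raw identifiers.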
What happens if a query is blocked?
hoop.dev returns a clear error to the caller and records the block event, including the rule that triggered it. This record becomes part of the audit trail you provide to regulators.
Find the open‑source repository on GitHub to explore the code, contribute improvements, or customize the gateway for your environment.