PCI DSS for non-human identities: governing machine access end to end (on BigQuery)

How can you prove to a PCI DSS auditor that every machine‑to‑machine request against your data warehouse is both authorized and traceable?

Most organizations rely on long‑lived service accounts or static API keys for jobs, ETL pipelines, and automated analytics. Those credentials are often shared across teams, never rotated, and used directly by the client library that talks to BigQuery. The result is a black box: the request reaches the data service, but the organization has no single point that can verify who initiated the query, whether it complied with policy, or whether sensitive cardholder data was exposed.

PCI DSS explicitly requires unique identification (requirement 8), controlled access (requirement 7), and detailed logging of all access to cardholder data environments (requirement 10). When a machine identity is the source, the auditor expects to see per‑entity logs, evidence of least‑privilege assignment, and proof that any data leakage was prevented or detected.

Why PCI DSS demands full visibility into machine access

Requirement 10.2 asks for a record of each individual’s access to cardholder data, including date, time, and the action performed. In PCI DSS v4.0, requirement 10.5 adds that audit log history must be retained for at least 12 months, and requirement 10.3 requires that logs be protected from alteration.

Even if you federate machine identities through an OIDC provider and assign them minimal roles, the request still travels straight to BigQuery. The data service sees only the service account’s token; it cannot enforce additional guardrails, mask PAN fields, or require a human approval before a bulk export. Without a gateway in the data path, the organization cannot generate the granular audit artifacts the auditor will request.

What a gateway can enforce that identity alone cannot

The missing piece is a layer that sits between the authenticated identity and the target service. That layer can:

  • Record every query together with the caller’s identity, timestamp, and source IP.
  • Apply inline masking to any response that contains primary account numbers, ensuring that downstream tools never see raw PANs.
  • Require just‑in‑time approval for high‑risk commands such as large data exports or schema changes.
  • Block disallowed statements before they reach BigQuery, preventing accidental or malicious data exfiltration.

These enforcement outcomes exist only because the gateway sits in the data path; the identity system alone cannot provide them.
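
To make the blocking and approval outcomes concrete, here is a minimal Python sketch of a pre-execution policy check. The rule patterns and the `evaluate_statement` name are hypothetical and are not hoop.dev's actual policy engine. A production gateway would most likely parse the SQL rather than match leading keywords.

```python
import re
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    BLOCK = "block"


# Hypothetical rule sets; a real policy would be loaded from configuration.
BLOCKED = (r"^\s*DROP\s+", r"^\s*TRUNCATE\s+")
NEEDS_APPROVAL = (r"^\s*EXPORT\s+DATA\b", r"^\s*ALTER\s+", r"^\s*CREATE\s+")


def evaluate_statement(sql: str) -> Decision:
    """Classify one SQL statement before it is forwarded to the warehouse."""
    for pattern in BLOCKED:
        if re.search(pattern, sql, re.IGNORECASE):
            return Decision.BLOCK
    for pattern in NEEDS_APPROVAL:
        if re.search(pattern, sql, re.IGNORECASE):
            return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW
```

Blocked rules are checked first so that a statement matching both lists is never routed to an approver.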

How hoop.dev creates audit‑ready evidence for BigQuery

hoop.dev is an open‑source Layer 7 gateway that proxies connections to infrastructure services, including BigQuery. The flow works like this:

  1. Users and automated agents obtain an OIDC token from the corporate IdP. hoop.dev validates the token and extracts the group membership that determines the level of access.
  2. The request is handed to the hoop.dev gateway, which then opens a connection to BigQuery using a credential that only the gateway knows.
  3. Before the query is sent, hoop.dev checks the statement against policy rules. If the query exceeds a defined risk threshold, a human approver must authorize it.
  4. After execution, hoop.dev inspects the result set and masks any fields that match a PAN pattern. The masked result is returned to the client.
  5. Every session, including identity, query text (masked where required), approval record, and timestamps, is stored as an audit record.
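
The five steps above can be sketched as a single orchestration function. This is an illustration under stated assumptions, not hoop.dev's implementation: the claims are assumed to be already signature-verified, and `is_high_risk`, `request_approval`, `run_query`, and `mask_row` are hypothetical callables standing in for the policy engine, the approval workflow, the gateway-held BigQuery credential, and the masking layer.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class AuditRecord:
    subject: str                 # OIDC "sub" claim of the caller
    query: str
    outcome: str                 # "executed" or "denied"
    approved_by: Optional[str]   # approver identity, if approval was needed
    timestamp: str               # UTC, ISO 8601


def handle_request(
    claims: dict,
    sql: str,
    is_high_risk: Callable[[str], bool],
    request_approval: Callable[[str, str], Optional[str]],
    run_query: Callable[[str], list],
    mask_row: Callable[[dict], dict],
    audit_log: list,
) -> list:
    """Policy check, optional approval, execution, masking, audit record."""
    subject = claims["sub"]
    approver = None
    outcome = "executed"
    rows = []

    # Step 3: high-risk statements need a human approver.
    if is_high_risk(sql):
        approver = request_approval(subject, sql)
        if approver is None:
            outcome = "denied"

    # Steps 2 and 4: run with the gateway's credential, then mask the rows.
    if outcome == "executed":
        rows = [mask_row(row) for row in run_query(sql)]

    # Step 5: every request is recorded, including denied ones.
    audit_log.append(
        AuditRecord(subject, sql, outcome, approver,
                    datetime.now(timezone.utc).isoformat())
    )
    return rows
```

Writing the audit record on the denied path as well as the executed path is a deliberate choice here, since an auditor will usually want to see rejected attempts too.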

The resulting artifacts satisfy the PCI DSS evidence checklist:

  • Unique identifier for each non‑human caller.
  • Timestamped log of every query, including success or failure.
  • Masking logs that demonstrate PANs never left the gateway unprotected.
  • Approval workflow records for high‑risk operations.
  • Replay capability that lets auditors reproduce a session exactly as it occurred.

Key artifacts you hand to an auditor

When the audit window opens, you can export the following from hoop.dev:

  • A CSV or JSON dump of session logs, each row containing the service account name, OIDC subject, query text (with masked fields), and a hash identifier for integrity verification.
  • Approval audit trails that show who approved a bulk export, when, and under what justification.
  • Masking rule definitions that map column names or regex patterns to PCI‑required redaction.
  • Retention configuration confirming that logs are kept for the required 12‑month period.

These files provide the concrete proof the PCI DSS assessor looks for: “Who accessed cardholder data, what did they do, and was the access controlled and monitored?”
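
An auditor can only rely on a per-row hash if it lets them detect tampering. The post does not say how hoop.dev computes its hash identifier, so the following is one common technique offered as an assumption: a SHA-256 hash chain, in which each entry's hash covers the previous entry's hash. All function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_record(log: list, record: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "hash": entry_hash(prev, record)})


def verify_log(log: list) -> bool:
    """Recompute the chain; any edited or removed entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash depends on the one before it, changing a single exported row invalidates that row and every row after it.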

Getting started

Deploy the hoop.dev gateway in the same network segment as your BigQuery proxy endpoint. The quick‑start guide walks you through a Docker Compose deployment, OIDC configuration, and registration of a BigQuery connection. Once the gateway is running, define masking rules for the PAN column and enable just‑in‑time approvals for queries that exceed a row‑count threshold. Detailed steps are available in the getting‑started documentation and the broader learn section.
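
As a rough picture of what a row‑count approval rule decides, here is a small Python sketch. The threshold value, the `Approval` fields, and the `may_run` function are all hypothetical; hoop.dev's real configuration format is described in its documentation, not here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

ROW_THRESHOLD = 10_000  # hypothetical limit; tune to your own risk appetite


@dataclass
class Approval:
    approver: str
    justification: str
    expires_at: datetime  # just-in-time grants should be short-lived


def may_run(estimated_rows: int,
            approval: Optional[Approval],
            now: Optional[datetime] = None) -> bool:
    """Allow small queries outright; large ones need an unexpired approval."""
    now = now or datetime.now(timezone.utc)
    if estimated_rows <= ROW_THRESHOLD:
        return True
    return approval is not None and now < approval.expires_at
```

Giving each approval an expiry keeps a one-time sign-off from turning into standing access.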

FAQ

Do I need to replace existing service accounts?
No. hoop.dev authenticates callers via OIDC, so the applications that generate the tokens can keep using their existing service accounts. The gateway proxies each request to BigQuery using its own credential.

How does inline masking protect PANs?
The gateway inspects the response at the protocol layer, applies a pattern‑based redaction to any field that matches a PAN format, and then forwards the sanitized data. The original values never leave the gateway process.
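
For intuition, pattern‑based redaction can be sketched in a few lines of Python. This is an illustration under assumptions, not hoop.dev's masking code: it treats any run of 13 to 19 digits (optionally separated by spaces or hyphens) as a candidate, confirms it with the Luhn checksum to cut false positives, and keeps only the last four digits.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
PAN_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_pans(text: str) -> str:
    """Replace Luhn-valid card numbers with asterisks plus the last four digits."""
    def redact(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        if not luhn_valid(digits):
            return match.group()  # leave non-card digit runs untouched
        return "*" * (len(digits) - 4) + digits[-4:]

    return PAN_CANDIDATE.sub(redact, text)
```

For example, `mask_pans("card 4111 1111 1111 1111")` returns `"card ************1111"`, while a 16-digit order number that fails the Luhn check passes through unchanged.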

Can the audit logs be retained for the PCI‑required 12 months?
Yes. hoop.dev can be configured to store logs in a durable backend for the duration required by PCI DSS, and the retention policy is part of the exported evidence.

Ready to see the audit trail in action? Explore the open‑source code and start a deployment at https://github.com/hoophq/hoop.
