Agent impersonation: what it means for your blast radius (on BigQuery)

When blast radius is tightly bounded, a compromised service account can only affect the single dataset it was meant to query, and an audit system makes any suspicious query immediately visible in logs. Engineers can grant temporary, purpose‑limited access to a data analyst without exposing the entire data lake, and compliance reviewers can trace exactly who ran which query and when.

Current reality: unrestricted agent impersonation

In many organizations, service accounts are given the ability to impersonate any other identity in the cloud project, for example through a project‑wide grant of the Service Account Token Creator role. The impersonation token is then handed to scripts, CI pipelines, or third‑party agents that connect to BigQuery. Because the token carries the full set of permissions of the target identity, a single compromised script can issue arbitrary queries across every dataset that identity can reach. The result is a massive blast radius: a breach in one component instantly expands to the entire data warehouse.

This model also lacks visibility. The impersonation request is performed directly against the Google IAM endpoint, and the subsequent query travels straight to BigQuery. No intermediate system records the exact SQL statement, masks sensitive fields in the result set, or requires human approval before a potentially destructive operation runs. The only evidence is the standard Cloud Audit Logs entry, which may not capture the full context of the request.

Why limiting impersonation alone does not close the gap

Applying a policy that restricts which identities can be impersonated is a necessary first step. It reduces the number of tokens that can be minted, but the request still reaches BigQuery directly. Without a control point on the data path, the following gaps remain:

There is no real‑time inspection of the query text, so dangerous commands such as DROP TABLE or massive export operations can execute unchecked.
Sensitive columns such as personally identifiable information are returned to the caller in clear text.
There is no workflow to require a manager or data‑owner to approve high‑impact queries before they run.
Session data is not recorded in a tamper‑evident store, making post‑incident forensics difficult.

In short, the setup defines who may start a request, but it does not enforce any of the protections that keep the blast radius small.
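
The first step, restricting who may impersonate, can at least be checked mechanically. The sketch below scans a project IAM policy, in the JSON shape that `gcloud projects get-iam-policy` returns, for unconditional project‑level grants of the Service Account Token Creator role. The sample policy and member names are made up for illustration.

```python
from __future__ import annotations

# Role that lets a principal mint tokens for (impersonate) service accounts.
TOKEN_CREATOR = "roles/iam.serviceAccountTokenCreator"


def broad_impersonation_grants(policy: dict) -> list[str]:
    """Return members holding the token-creator role with no IAM condition."""
    findings = set()
    for binding in policy.get("bindings", []):
        if binding.get("role") == TOKEN_CREATOR and "condition" not in binding:
            findings.update(binding.get("members", []))
    return sorted(findings)


# Hypothetical policy, shaped like `gcloud projects get-iam-policy --format=json`.
SAMPLE_POLICY = {
    "bindings": [
        {
            "role": "roles/bigquery.dataViewer",
            "members": ["user:analyst@example.com"],
        },
        {
            "role": TOKEN_CREATOR,
            "members": [
                "serviceAccount:ci-runner@example-project.iam.gserviceaccount.com",
                "group:data-eng@example.com",
            ],
        },
    ]
}

print(broad_impersonation_grants(SAMPLE_POLICY))
```

Every member it reports can mint tokens for any service account in the project. Moving the grant onto individual service accounts narrows that, but as the section above argues, it still leaves the data path uncontrolled.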

Placing a gateway in the data path

To truly bound blast radius, the request must pass through a Layer 7 gateway that can inspect, control, and record every interaction before it reaches BigQuery. hoop.dev provides exactly that: an identity‑aware proxy that sits between agents and the BigQuery endpoint. Because the gateway is the only point where traffic is allowed to flow, it can enforce policies that the upstream identity system cannot.

How hoop.dev reduces blast radius

hoop.dev records each query session. Every SQL statement is logged with the originating identity, a timestamp, and the size of the result set. The resulting audit trail is immutable, so auditors can query it to answer “who ran what and when”.
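
To make the idea of a tamper‑evident record concrete, here is a minimal sketch of a hash‑chained session log. It is not hoop.dev's storage format, which the article does not describe; the field names and the `append_record` and `verify_chain` helpers are assumptions chosen for illustration.

```python
from __future__ import annotations

import hashlib
import json
from datetime import datetime, timezone
from typing import Optional

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body: dict) -> str:
    """Hash a record body using a stable key order."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(log: list, identity: str, sql: str, result_rows: int,
                  timestamp: Optional[str] = None) -> dict:
    """Append a record whose hash covers its content and the previous hash."""
    body = {
        "identity": identity,
        "sql": sql,
        "result_rows": result_rows,
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    record = {**body, "hash": _digest(body)}
    log.append(record)
    return record


def verify_chain(log: list) -> bool:
    """Return False if any record was altered, removed, or reordered."""
    prev_hash = GENESIS
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev_hash or record["hash"] != _digest(body):
            return False
        prev_hash = record["hash"]
    return True
```

Because each hash covers the previous one, editing an old entry invalidates the chain from that point on, which is the property that makes post‑incident forensics trustworthy.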

hoop.dev masks sensitive fields in real time. When a query returns columns marked as PII, the gateway replaces those values with tokenized placeholders before they reach the caller, preventing accidental data leakage.
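
A rough sketch of what result‑set masking can look like follows. The column names, the `tok_` prefix, and the demo key are all hypothetical, and keyed hashing stands in for whatever tokenization scheme the gateway actually uses.

```python
from __future__ import annotations

import hashlib
import hmac

# Hypothetical policy: columns whose values must never reach the caller.
PII_COLUMNS = frozenset({"email", "ssn"})
MASKING_KEY = b"demo-key-change-me"  # assumption: a real key would come from a secret store


def tokenize(value: object, key: bytes = MASKING_KEY) -> str:
    """Replace a value with a deterministic, non-reversible placeholder."""
    digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).hexdigest()
    return f"tok_{digest[:12]}"


def mask_rows(rows: list, pii_columns: frozenset = PII_COLUMNS) -> list:
    """Return copies of the rows with PII columns tokenized; NULLs stay NULL."""
    return [
        {
            col: (tokenize(val) if col in pii_columns and val is not None else val)
            for col, val in row.items()
        }
        for row in rows
    ]


rows = [{"id": 1, "email": "ada@example.com", "plan": "pro"}]
print(mask_rows(rows))
```

Deterministic tokens keep joins and group‑bys on the masked column meaningful, at the cost of revealing when two rows share a value.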

hoop.dev blocks dangerous commands. The gateway can be configured to reject statements that match a pattern, for example any DROP or EXPORT DATA statement, and to return an error before the command reaches BigQuery.
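
Pattern‑based blocking can be sketched in a few lines. The patterns and the `is_blocked` helper below are illustrative assumptions, not hoop.dev configuration syntax, and the comment stripping and statement splitting are deliberately naive; a production gateway would more likely rely on a real SQL parser.

```python
from __future__ import annotations

import re

# Hypothetical deny-list: statements that start with DROP or EXPORT DATA.
BLOCKED_PATTERNS = [
    re.compile(r"^\s*DROP\b", re.IGNORECASE),
    re.compile(r"^\s*EXPORT\s+DATA\b", re.IGNORECASE),
]


def strip_comments(sql: str) -> str:
    """Remove /* ... */ blocks and -- or # line comments (naive: ignores string literals)."""
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)
    return re.sub(r"(--|#)[^\n]*", " ", sql)


def is_blocked(sql: str) -> bool:
    """Return True if any statement in the script matches a blocked pattern."""
    for statement in strip_comments(sql).split(";"):  # naive split on ';'
        if any(p.search(statement) for p in BLOCKED_PATTERNS):
            return True
    return False


print(is_blocked("SELECT name FROM sales.customers"))          # False
print(is_blocked("/* cleanup */ drop table sales.customers"))  # True
```

Anchoring each pattern to the start of a statement avoids false positives on queries that merely mention the word, such as `SELECT 'drop table' AS note`.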

hoop.dev routes high‑impact queries to an approval workflow. If a query scans more than a defined number of rows or accesses a restricted dataset, the request is paused and a designated approver must explicitly allow it to continue.
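
The routing decision itself is simple to express. The sketch below measures scan size in bytes, which BigQuery can estimate with a dry run before a query executes; a row‑count threshold would work the same way. The `QueryRequest` type, the dataset names, and the ten‑gibibyte default are assumptions for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

TEN_GIB = 10 * 1024**3  # hypothetical default threshold
RESTRICTED_DATASETS = frozenset({"finance", "hr"})  # hypothetical dataset names


@dataclass(frozen=True)
class QueryRequest:
    identity: str
    datasets: frozenset      # datasets the query references
    estimated_bytes: int     # e.g. taken from a BigQuery dry run


def decide(request: QueryRequest,
           max_bytes: int = TEN_GIB,
           restricted: frozenset = RESTRICTED_DATASETS) -> tuple:
    """Return ("allow", []) or ("needs_approval", [reasons])."""
    reasons = []
    touched = sorted(request.datasets & restricted)
    if touched:
        reasons.append(f"restricted datasets: {', '.join(touched)}")
    if request.estimated_bytes > max_bytes:
        reasons.append(f"estimated scan {request.estimated_bytes} bytes exceeds {max_bytes}")
    return ("needs_approval" if reasons else "allow", reasons)


small = QueryRequest("analyst@example.com", frozenset({"marketing"}), 5 * 1024**2)
print(decide(small))
```

Returning the reasons alongside the decision gives the approver the context needed to judge the request rather than a bare yes or no prompt.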

All of these outcomes exist only because the gateway sits in the data path. The upstream identity provider still decides who may request a token, but the enforcement happens downstream, where the actual data movement occurs.

Practical steps to adopt a gateway for BigQuery

1. Deploy the gateway close to your BigQuery network. The getting started guide walks through a Docker Compose deployment that includes OIDC authentication and default masking policies.

2. Register your BigQuery project as a connection in the gateway configuration. The gateway stores the service account key, so users never see credentials.

3. Define masking rules for columns that contain sensitive data. The rule set lives in the gateway and can be updated without redeploying the BigQuery client.

4. Enable command‑level blocking and approval thresholds that align with your organization’s risk appetite. For example, require approval for any query that reads more than ten gigabytes of data.

5. Monitor the session logs produced by the gateway. They provide the evidence needed to demonstrate that blast radius is controlled and that every query is accountable.

FAQ

Does this approach eliminate the need for IAM policies?

No. IAM still determines which identities can obtain a token. The gateway adds a second, enforceable layer that controls what those identities can actually do once they reach BigQuery.

Can existing BigQuery clients be used unchanged?

Yes. Clients connect to the gateway using the same connection string they would use for BigQuery. The gateway transparently forwards traffic after applying its policies.

What happens if the gateway itself is compromised?

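The gateway runs with its own service account and can be hardened with standard host‑level security controls. Because it does not store raw data, a breach would not expose the underlying query results. It would expose policy configurations and, because the gateway holds the BigQuery service account key, that credential as well, so the key should be narrowly scoped and rotated regularly.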

By moving the enforcement point to the data path, you shrink the blast radius of any compromised impersonation token and gain full visibility into every query that touches your data warehouse.

Explore the open‑source code and contribute improvements on GitHub.