Non-human identity for autonomous agents on BigQuery

When autonomous agents query BigQuery, each request should be tied to a non-human identity and logged, and sensitive fields should be redacted when necessary before results leave the data warehouse. In many pipelines a single Google service‑account key is baked into the container image and shared across dozens of bots. That key typically grants broad read and write rights, so a bug or a compromised container can launch expensive scans or exfiltrate data without any trace of who, or what, issued the query.

Auditors expect every data access to be attributable to a distinct principal. Security teams also need the ability to block or approve risky queries before they hit the warehouse. Replacing the shared static credential with per‑agent OIDC tokens solves attribution at the token level, but the request still travels straight to BigQuery. In other words, non-human identity alone does not provide enforcement; the control surface where a query can be inspected, masked, or held for approval is still missing.

To close that gap, the architecture must place a control point on the data path itself. The control point receives the agent’s identity, evaluates policy, can rewrite result sets, and records the full request‑response exchange. Only by sitting in the path between the autonomous process and BigQuery can the system enforce just‑in‑time access, inline masking, command‑level audit, and human approval workflows.

hoop.dev provides exactly that Layer 7 gateway. It runs a network‑resident agent inside the same environment as BigQuery, holds the credential needed to talk to the warehouse, and never exposes the secret to the calling process. When an autonomous agent initiates a query, it first authenticates to hoop.dev using its OIDC token. hoop.dev validates the token, extracts the non-human identity claims, and maps those claims to a policy that defines what the agent may do.
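The claim‑to‑policy mapping described above might look roughly like the following sketch. It assumes the token has already been validated and its claims decoded into a dictionary. The `POLICIES` table, the `AgentPolicy` fields, and the example subject names are hypothetical illustrations, not hoop.dev's actual configuration format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical policy shape; not hoop.dev's real schema."""
    allowed_datasets: frozenset
    masked_columns: frozenset
    read_only: bool = True


# Hypothetical mapping from the OIDC `sub` claim to a policy.
POLICIES = {
    "agent:report-builder": AgentPolicy(
        allowed_datasets=frozenset({"analytics"}),
        masked_columns=frozenset({"email", "ssn"}),
    ),
    "agent:etl-loader": AgentPolicy(
        allowed_datasets=frozenset({"staging"}),
        masked_columns=frozenset(),
        read_only=False,
    ),
}


def policy_for(claims: dict) -> Optional[AgentPolicy]:
    """Return the policy for already-validated claims, or None to deny."""
    return POLICIES.get(claims.get("sub"))


def may_query(claims: dict, dataset: str) -> bool:
    """Deny by default: unknown principals and unlisted datasets are refused."""
    policy = policy_for(claims)
    return policy is not None and dataset in policy.allowed_datasets
```

The deny‑by‑default shape matters more than the details: a principal with no matching policy gets nothing, rather than falling back to a shared credential's permissions.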

  • Just‑in‑time access: hoop.dev issues a short‑lived session token to the agent only for the duration of the request, eliminating long‑lived service‑account keys.
  • Inline data masking: hoop.dev can redact or replace sensitive fields in query results before they leave the gateway, ensuring downstream systems never see raw PII.
  • Command‑level audit: every SQL statement, together with the originating non-human identity, is recorded by hoop.dev. The logs are stored outside the agent process, giving auditors a reliable record of activity.
  • Human approval workflow: for queries that match a risky pattern, such as full‑table scans or exports, hoop.dev can pause execution and route the request to an approver. The approver’s decision is logged alongside the query.
  • Session replay: because hoop.dev records the full request‑response stream, incidents can be replayed step‑by‑step for forensic analysis.
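To make the approval step in the list above concrete, here is a minimal sketch of how a gateway could decide whether to hold a query. The two rules (any `EXPORT DATA` statement, and any `SELECT` without a `WHERE` clause) are hypothetical heuristics chosen for illustration; they are not hoop.dev's rule syntax, and a regex check is a crude stand‑in for real SQL parsing.

```python
import re

# Hypothetical heuristics; real rules would live in gateway policy.
_EXPORT = re.compile(r"^\s*EXPORT\s+DATA\b", re.IGNORECASE)
_SELECT = re.compile(r"^\s*SELECT\b", re.IGNORECASE)
_WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def needs_approval(sql: str) -> bool:
    """Flag exports and SELECT statements that contain no WHERE clause."""
    if _EXPORT.search(sql):
        return True
    if _SELECT.search(sql) and not _WHERE.search(sql):
        return True
    return False


def route(sql: str) -> str:
    """Return 'hold' for queries that should wait for an approver, else 'run'."""
    return "hold" if needs_approval(sql) else "run"
```

For example, `route("SELECT * FROM analytics.users")` returns `"hold"`, while the same query with a `WHERE` clause returns `"run"`. A production rule would more likely rely on a parsed query or on BigQuery's dry‑run byte estimate than on pattern matching.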

All of these enforcement outcomes happen because hoop.dev sits in the data path. The identity setup (OIDC federation, per‑agent tokens) provides the principal, but hoop.dev is the only component that can actually intervene on the query, apply masking, and generate the audit trail.

Implementation overview

Deploying hoop.dev follows the standard quick‑start flow. A Docker Compose file launches the gateway and an accompanying network‑resident agent. The gateway is configured to trust your identity provider, whether that is Google, Okta, Azure AD, or another OIDC or SAML source. Once the gateway is running, you register a BigQuery connection. During registration you specify that the gateway should use the GCP IAM federation flow, so each autonomous agent presents its own OIDC token and receives a scoped OAuth token for BigQuery.

After registration, the agent’s runtime simply points its BigQuery client at the hoop.dev endpoint. From that point forward, every query passes through the gateway, where hoop.dev enforces the policies described above. The actual credential used to talk to BigQuery lives only inside the gateway, so the agent never sees a static service‑account key.

For detailed steps on deployment, token configuration, and policy authoring, see the getting‑started guide and the broader feature documentation at hoop.dev/learn. Those pages walk you through creating the Docker Compose deployment, linking your OIDC provider, and defining masking rules for specific columns.

FAQ

Do I still need a Google service‑account key?

No. hoop.dev holds the credential internally. Your autonomous agents authenticate with OIDC tokens, and hoop.dev exchanges those for short‑lived OAuth tokens on the fly.
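The property that makes short‑lived tokens safer than a static key is simply that they expire. The sketch below illustrates that property only; it is not how hoop.dev or Google's token exchange works internally, and `SessionToken`, `issue_token`, and `is_valid` are hypothetical names.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionToken:
    value: str          # opaque random string handed to the agent
    principal: str      # non-human identity the token was issued for
    expires_at: float   # seconds since the epoch


def issue_token(principal: str, ttl_seconds: float = 60.0,
                now: Optional[float] = None) -> SessionToken:
    """Mint a random token that stops being valid after ttl_seconds."""
    issued = time.time() if now is None else now
    return SessionToken(secrets.token_urlsafe(32), principal, issued + ttl_seconds)


def is_valid(token: SessionToken, now: Optional[float] = None) -> bool:
    """A token is valid strictly before its expiry time."""
    current = time.time() if now is None else now
    return current < token.expires_at
```

A leaked token of this kind is useful to an attacker only until `expires_at`, whereas a leaked service‑account key stays usable until someone rotates it.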

Can I audit queries from a specific bot?

Yes. Because each bot presents a distinct non-human identity, hoop.dev logs include the principal name, the exact SQL statement, and the outcome. Those logs are searchable and can be exported for compliance reporting.
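As an illustration of what such a record could contain, the sketch below builds one JSON line per query and filters by principal. The field names and the `outcome` values are assumptions made for this example, not hoop.dev's actual log schema.

```python
import json
from datetime import datetime, timezone
from typing import Iterable, List, Optional


def audit_record(principal: str, sql: str, outcome: str,
                 when: Optional[datetime] = None) -> str:
    """Serialize one audit entry as a JSON line (hypothetical field names)."""
    ts = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": ts.isoformat(),
            "principal": principal,
            "sql": sql,
            "outcome": outcome,
        },
        sort_keys=True,
    )


def by_principal(lines: Iterable[str], principal: str) -> List[dict]:
    """Return the parsed entries issued by a single non-human identity."""
    return [r for r in map(json.loads, lines) if r["principal"] == principal]
```

Because every entry carries the principal, answering "what did this bot run?" becomes a filter over the log rather than a forensic reconstruction.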

What happens if a query triggers a masking rule?

hoop.dev rewrites the result set on the fly, replacing or redacting the configured fields before they reach the calling process. The original values never leave the gateway.
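Conceptually, that rewrite is a pass over each result row before it is returned. The following sketch shows the idea under simple assumptions: rows arrive as dictionaries, masking is by exact column name, and the `REDACTED` placeholder is an arbitrary choice. None of this reflects hoop.dev's actual masking engine.

```python
from typing import Iterable, Iterator, Set

REDACTED = "[REDACTED]"  # hypothetical placeholder value


def mask_rows(rows: Iterable[dict], masked_columns: Set[str]) -> Iterator[dict]:
    """Yield copies of each row with configured columns replaced.

    NULLs (None) are left as-is so callers can still tell missing data
    from masked data; the input rows are never mutated.
    """
    for row in rows:
        yield {
            key: (REDACTED if key in masked_columns and value is not None else value)
            for key, value in row.items()
        }
```

Yielding new dictionaries instead of editing rows in place keeps the unmasked values confined to the gateway side of the function boundary, which mirrors the guarantee described above.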

Next steps

Start by cloning the repository and launching the quick‑start stack. Then configure your OIDC provider to issue tokens for your autonomous agents and register a BigQuery connection that uses GCP IAM federation. From there you can define masking policies, approval workflows, and audit retention settings that match your organization’s risk profile.

Explore the source code and contribute on GitHub.
