Configuring AI coding agents’ access to BigQuery with non-human identity

Why AI coding agents need non‑human identity for BigQuery

AI coding agents that access BigQuery need a non-human identity of their own. When an agent runs queries using a shared service-account key, a single credential can silently read every dataset that account can reach, and any accidental or malicious query can exfiltrate gigabytes of data without a trace that leads back to the agent. The cost of that exposure is not just data loss; it can also trigger regulatory fines, damage brand trust, and force costly incident response cycles. Relying on a shared static key also means the AI workload has no identity of its own, so its actions cannot be attributed to it.

Non‑human identity alone isn’t enough

Organizations often solve the credential problem by assigning a dedicated service account to the agent. That step gives the agent a non-human identity, which is a prerequisite for any policy that distinguishes between human engineers and automated workloads. However, the connection still goes straight from the agent to BigQuery. The request bypasses any central enforcement point, so there is no real-time approval workflow, no command-level audit, and no ability to mask sensitive fields in query results. In practice this means the agent can run unrestricted SELECT or INSERT statements, and the organization has little evidence beyond the service-account name of who ran what, when, or why.

Beyond data loss, the use of a static service‑account key makes key rotation a manual, error‑prone task. If the key is ever compromised, every system that relies on it must be updated, and the window of exposure can stretch for days. Moreover, audit logs generated by BigQuery itself only capture the service‑account name, not the intent or the context of the AI‑driven request, leaving security teams blind to the origin of potentially risky queries.

Adopting a non‑human identity solves the credential‑sprawl problem, but it also introduces a new blind spot: the AI workload now acts as an autonomous principal with the same privileges as any human who could use the same service account. Without a gatekeeper, there is no way to enforce least‑privilege at the column level, no runtime check that a query complies with data‑handling policies, and no record that an automated job accessed a regulated dataset at a particular time.

hoop.dev as the gateway for secure AI‑driven BigQuery access

hoop.dev provides a Layer 7 gateway that sits between the AI agent and BigQuery. The gateway validates the agent’s OIDC token, extracts the service‑account identity, and then enforces policies before the request reaches the database. Because the enforcement happens in the data path, hoop.dev can apply just‑in‑time approvals, inline data masking, and immutable session recording for every query.

hoop.dev’s policy engine reads the claims in the OIDC token, such as the service‑account name, group memberships, and optional attributes, and matches them against a set of rules defined by the security team. Rules can require approval for any query that touches a PII column, enforce read‑only access for analytics workloads, or deny commands that attempt to modify schema. Because the engine runs inside the gateway, the enforcement point is immutable from the perspective of the agent, guaranteeing that no rogue configuration inside the AI container can bypass the controls.
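The rule matching described above can be illustrated with a minimal Python sketch. It is not hoop.dev's rule syntax or implementation: the `Request` type, the rule predicates, the `analytics` group name, and the PII column names are all hypothetical, and the sketch assumes the OIDC claims have already been verified and the touched columns already extracted from the query.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Mapping, Tuple

ALLOW, DENY, REQUIRE_APPROVAL = "allow", "deny", "require_approval"

# Hypothetical values chosen for illustration only.
PII_COLUMNS = frozenset({"ssn", "email"})
DDL_KEYWORDS = ("CREATE", "ALTER", "DROP")
WRITE_KEYWORDS = ("INSERT", "UPDATE", "DELETE", "MERGE")


@dataclass(frozen=True)
class Request:
    claims: Mapping[str, object]  # already-verified OIDC claims (assumption)
    statement: str                # SQL text sent by the agent
    columns: FrozenSet[str]       # columns the query touches (assumed pre-parsed)


def first_keyword(statement: str) -> str:
    """Return the leading SQL keyword in upper case, or '' if empty."""
    parts = statement.strip().split(None, 1)
    return parts[0].upper() if parts else ""


def modifies_schema(req: Request) -> bool:
    return first_keyword(req.statement) in DDL_KEYWORDS


def analytics_write(req: Request) -> bool:
    # Analytics workloads are read-only in this sketch.
    groups = req.claims.get("groups", [])
    return "analytics" in groups and first_keyword(req.statement) in WRITE_KEYWORDS


def touches_pii(req: Request) -> bool:
    return bool(req.columns & PII_COLUMNS)


# Rules are checked in order; the first matching rule decides.
RULES: List[Tuple[Callable[[Request], bool], str]] = [
    (modifies_schema, DENY),
    (analytics_write, DENY),
    (touches_pii, REQUIRE_APPROVAL),
]


def evaluate(req: Request) -> str:
    for predicate, decision in RULES:
        if predicate(req):
            return decision
    return ALLOW
```

Ordering the rules from most to least restrictive keeps the outcome predictable when a request matches more than one rule. A real engine would parse the SQL properly rather than inspect the first keyword.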

Inline masking is configurable per dataset or per column. For example, a rule can replace Social Security Numbers with the pattern “XXX‑XX‑XXXX” while leaving other fields untouched. The masking occurs before the data leaves the gateway, ensuring that downstream services or logs never see the raw value. Each masked response is tagged in the audit trail, providing evidence that the data was handled according to policy.
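As a rough illustration of column-level masking, the sketch below replaces Social Security Numbers in selected columns of a result set. It is an assumption-laden example rather than hoop.dev's masking engine: the `mask_rows` function, the row format (a list of dictionaries), and the regular expression are all hypothetical.

```python
import re
from typing import Dict, Iterable, List, Set

# Matches US SSNs written as 123-45-6789 (illustrative pattern only).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
SSN_MASK = "XXX-XX-XXXX"


def mask_rows(rows: Iterable[Dict[str, object]],
              masked_columns: Set[str]) -> List[Dict[str, object]]:
    """Return copies of the rows with SSNs masked in the given columns."""
    masked = []
    for row in rows:
        out = dict(row)  # copy so the caller's rows are not mutated
        for col in masked_columns:
            value = out.get(col)
            if isinstance(value, str):
                out[col] = SSN_PATTERN.sub(SSN_MASK, value)
        masked.append(out)
    return masked
```

Columns that are not listed in `masked_columns` pass through untouched, which mirrors the per-column configuration described above.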

Recorded sessions are kept in a store that is isolated from the production environment. Analysts can replay a session to see the exact query text, the parameters supplied by the AI agent, and the masked result set. This capability is essential for root‑cause investigations after a data breach, as it reconstructs the chain of events without needing to grant the analyst direct access to the production database.

Deploying hoop.dev in a Kubernetes cluster or as a Docker Compose service allows you to scale the gateway horizontally. Each instance shares the same policy definitions, and the load balancer distributes incoming AI traffic evenly. Health checks and metrics are exposed for integration with existing observability stacks, so you can alert on unusual approval patterns or spikes in masked queries.

Common pitfalls include granting the service account broader permissions than necessary and neglecting to rotate the underlying key. hoop.dev mitigates the latter by keeping the credential confined to the gateway, but you still need a process to rotate it regularly and update the gateway’s secret store. Additionally, overly broad approval rules can flood reviewers with benign requests, so start with a narrow set of high‑risk actions and expand as confidence grows.

Just‑in‑time approvals

When the agent attempts a high‑risk operation, such as creating a new table, exporting data, or running a query that touches a protected column, hoop.dev pauses the request and routes it to a designated approver. The approver can grant or deny the operation from a web UI, and the decision is logged together with the originating identity.
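The pause-and-decide flow can be modelled with a small in-memory queue. This is only a sketch under stated assumptions: `ApprovalQueue`, `ApprovalRequest`, and the audit-entry fields are hypothetical names, and a real gateway would persist requests and notify approvers rather than hold them in process memory.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ApprovalRequest:
    request_id: int
    identity: str          # originating non-human identity
    statement: str         # the paused operation
    status: str = "pending"


class ApprovalQueue:
    """Holds paused requests until an approver grants or denies them."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._requests: Dict[int, ApprovalRequest] = {}
        self.audit_log: List[dict] = []

    def submit(self, identity: str, statement: str) -> ApprovalRequest:
        req = ApprovalRequest(next(self._ids), identity, statement)
        self._requests[req.request_id] = req
        return req

    def decide(self, request_id: int, approver: str, approved: bool) -> ApprovalRequest:
        req = self._requests[request_id]
        if req.status != "pending":
            raise ValueError("request already decided")
        req.status = "approved" if approved else "denied"
        # Log the decision together with the originating identity.
        self.audit_log.append({
            "request_id": req.request_id,
            "identity": req.identity,
            "approver": approver,
            "decision": req.status,
        })
        return req
```

Rejecting a second decision on the same request keeps the audit log unambiguous: each paused operation has exactly one recorded outcome.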

Inline data masking

For queries that return personally identifiable information or other regulated fields, hoop.dev can rewrite the response on the fly, replacing the sensitive values with masked placeholders. The raw values never leave the gateway unmasked, and the audit log records the masking rule that was applied.

Session recording and replay

Every query, its parameters, and the resulting data set are captured by hoop.dev. The recordings are stored outside the agent’s runtime, enabling post‑mortem analysis, compliance reporting, and forensic replay without exposing the raw credentials to the AI workload.
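To make the shape of such a recording concrete, the following sketch appends one JSON line per query to a text store and reads the lines back for replay. The field names and the `record_query` and `replay` helpers are hypothetical, and an in-memory `io.StringIO` stands in for whatever isolated storage a real deployment would use.

```python
import io
import json
from datetime import datetime, timezone
from typing import List, Optional, TextIO


def new_memory_store() -> TextIO:
    """In-memory stand-in for an isolated recording store (assumption)."""
    return io.StringIO()


def record_query(store: TextIO, identity: str, statement: str,
                 parameters: dict, masked_rows: list,
                 when: Optional[datetime] = None) -> dict:
    """Append one session entry as a JSON line and return it."""
    entry = {
        "recorded_at": (when or datetime.now(timezone.utc)).isoformat(),
        "identity": identity,
        "statement": statement,
        "parameters": parameters,
        "masked_rows": masked_rows,  # store the masked result, not raw values
    }
    store.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


def replay(store: TextIO) -> List[dict]:
    """Read every recorded entry back in the order it was written."""
    store.seek(0)
    return [json.loads(line) for line in store if line.strip()]
```

Because each entry carries the identity, statement, parameters, and masked rows, an analyst can reconstruct a session from the store alone, without access to the agent or to BigQuery.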

To wire an AI coding agent into this flow, you register a BigQuery connection in hoop.dev, point the agent’s client to the gateway address, and let hoop.dev handle the underlying service‑account credential. The agent never sees the secret key; it only presents its OIDC token. Detailed steps are covered in the getting‑started guide and the broader learn section.

FAQ

Can I keep using my existing service‑account key?

Yes. hoop.dev can store the key on the gateway and use it to authenticate to BigQuery on behalf of the agent. The key is never exposed to the agent process, preserving the non‑human identity model while adding audit and masking.

What happens if the approval step fails?

hoop.dev rejects the query and returns an error to the agent. The rejection event, together with the identity and the policy that triggered it, is recorded for later review.

Is the session data retained indefinitely?

Retention is configurable. You can set a policy that keeps recordings for the period required by your compliance framework, after which hoop.dev can purge or archive them according to your organization’s data‑retention rules.
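A retention policy of this kind reduces to comparing each recording's timestamp with a cutoff. The sketch below assumes recordings are dictionaries with a timezone-aware `recorded_at` value; the `split_by_retention` helper and that field name are hypothetical and do not reflect hoop.dev's configuration format.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def split_by_retention(recordings: List[dict], retention_days: int,
                       now: datetime) -> Tuple[List[dict], List[dict]]:
    """Split recordings into (keep, expired) using a retention window in days."""
    cutoff = now - timedelta(days=retention_days)
    keep = [r for r in recordings if r["recorded_at"] >= cutoff]
    expired = [r for r in recordings if r["recorded_at"] < cutoff]
    return keep, expired
```

The expired list can then be handed to whichever purge or archive step your data-retention rules require.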

Explore the open‑source repository on GitHub to see the implementation details and contribute: https://github.com/hoophq/hoop.

Hoop is MIT-licensed infrastructure for controlling how AI agents reach production data. Star hoophq/hoop so you can inspect it, deploy it, or share it when your team starts governing agent access.