Why AI coding agents need non‑human identity for BigQuery
AI coding agents that access BigQuery need a non‑human identity of their own. When an agent runs queries with a shared service‑account key, that single credential can read every dataset the account has been granted, and an accidental or malicious query can exfiltrate gigabytes of data with nothing tying it to a specific agent or task. The cost of that exposure goes beyond data loss: it can trigger regulatory fines, damage brand trust, and force costly incident response cycles. A shared static key also means the AI workload has no distinct identity of its own, so its actions cannot be attributed and accountability is impossible.
Non‑human identity alone isn’t enough
Organizations often solve the credential problem by assigning a dedicated service account to the agent. That step gives the agent a non‑human identity, which is a prerequisite for any policy that distinguishes between human engineers and automated workloads. However, the connection still goes straight from the agent to BigQuery. The request bypasses any central enforcement point, so there is no real‑time approval workflow, no command‑level audit, and no ability to mask sensitive fields in query results. In practice the agent can run unrestricted SELECT or INSERT statements, and the organization has little evidence of what ran, in what context, or why.
Beyond data loss, the use of a static service‑account key makes key rotation a manual, error‑prone task. If the key is ever compromised, every system that relies on it must be updated, and the window of exposure can stretch for days. Moreover, audit logs generated by BigQuery itself only capture the service‑account name, not the intent or the context of the AI‑driven request, leaving security teams blind to the origin of potentially risky queries.
Adopting a non‑human identity solves the credential‑sprawl problem, but it also introduces a new blind spot: the AI workload now acts as an autonomous principal with the same privileges as any human who could use the same service account. Without a gatekeeper, there is no way to enforce least‑privilege at the column level, no runtime check that a query complies with data‑handling policies, and no record that an automated job accessed a regulated dataset at a particular time.
hoop.dev as the gateway for secure AI‑driven BigQuery access
hoop.dev provides a Layer 7 gateway that sits between the AI agent and BigQuery. The gateway validates the agent’s OIDC token, extracts the service‑account identity, and then enforces policies before the request reaches BigQuery. Because the enforcement happens in the data path, hoop.dev can apply just‑in‑time approvals, inline data masking, and immutable session recording to every query.
hoop.dev’s policy engine reads the claims in the OIDC token, such as the service‑account name, group memberships, and optional attributes, and matches them against a set of rules defined by the security team. Rules can require approval for any query that touches a PII column, enforce read‑only access for analytics workloads, or deny commands that attempt to modify schema. Because the engine runs inside the gateway, the agent cannot modify the enforcement point, so a rogue configuration inside the AI container cannot bypass the controls.
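The rule matching described above can be illustrated with a small sketch. This is not hoop.dev's rule syntax or API. The Rule class, the evaluate function, the claim key "groups", the column names, and the three example rules are all hypothetical, chosen only to show first-match evaluation with a deny-by-default fallback.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule shape; an empty set means "matches anything"."""
    name: str
    action: str  # "deny", "require_approval", or "allow"
    statement_types: frozenset = frozenset()
    columns: frozenset = frozenset()
    groups: frozenset = frozenset()


def evaluate(claims, statement_type, columns, rules):
    """Return the action of the first matching rule, or "deny" if none match."""
    caller_groups = set(claims.get("groups", []))
    touched = set(columns)
    for rule in rules:
        if rule.groups and not (rule.groups & caller_groups):
            continue
        if rule.statement_types and statement_type not in rule.statement_types:
            continue
        if rule.columns and not (rule.columns & touched):
            continue
        return rule.action
    return "deny"  # nothing matched: fail closed


# Assumed example rules, ordered from most to least restrictive.
RULES = [
    Rule("no-schema-changes", "deny",
         statement_types=frozenset({"CREATE", "ALTER", "DROP"})),
    Rule("pii-needs-approval", "require_approval",
         columns=frozenset({"ssn", "email"})),
    Rule("analytics-read-only", "allow",
         statement_types=frozenset({"SELECT"}),
         groups=frozenset({"analytics"})),
]

claims = {"sub": "agent@example.iam.gserviceaccount.com", "groups": ["analytics"]}
print(evaluate(claims, "SELECT", ["order_id"], RULES))
print(evaluate(claims, "SELECT", ["ssn"], RULES))
print(evaluate(claims, "DROP", [], RULES))
```

Ordering the rules from most to least restrictive means a query that touches a PII column hits the approval rule before the read-only allow rule, and anything no rule names, such as an INSERT, falls through to the default deny.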
Inline masking is configurable per dataset or per column. For example, a rule can replace Social Security Numbers with the pattern “XXX‑XX‑XXXX” while leaving other fields untouched. The masking occurs before the data leaves the gateway, ensuring that downstream services or logs never see the raw value. Each masked response is tagged in the audit trail, providing evidence that the data was handled according to policy.
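As a rough illustration of column-level masking, the sketch below rewrites Social Security Numbers in selected columns before a result set is returned. It is an assumption-laden example, not hoop.dev's implementation: the mask_rows function, the regular expression, and the row format (a list of dictionaries) are all hypothetical.

```python
import re

# Matches SSNs written as 123-45-6789 (assumed format for this sketch).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
MASK = "XXX-XX-XXXX"


def mask_rows(rows, masked_columns):
    """Return masked copies of the rows and the number of values replaced.

    The count can be written to an audit trail as evidence that masking ran.
    """
    masked_rows = []
    masked_count = 0
    for row in rows:
        clean = dict(row)  # copy so the caller's rows are left untouched
        for column in masked_columns:
            value = clean.get(column)
            if isinstance(value, str):
                clean[column], replaced = SSN_PATTERN.subn(MASK, value)
                masked_count += replaced
        masked_rows.append(clean)
    return masked_rows, masked_count


rows = [
    {"name": "Ada", "ssn": "123-45-6789"},
    {"name": "Bob", "ssn": None},
]
masked, count = mask_rows(rows, ["ssn"])
print(masked, count)
```

Only the listed columns are inspected, so other fields pass through unchanged, and the returned count gives the audit trail a concrete value to record for each masked response.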
Recorded sessions are kept in a store that is isolated from the production environment. Analysts can replay a session to see the exact query text, the parameters supplied by the AI agent, and the masked result set. This capability is essential for root‑cause investigations after a data breach, because it reconstructs the chain of events without granting the analyst direct access to the production database.
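One way to picture a tamper-evident session record is a hash-chained log, sketched below. The text does not say how hoop.dev stores recordings, so the hash chain is a swapped-in technique used purely for illustration, and the append_record and verify functions and the record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first record in the chain


def _digest(body):
    """Hash a record body deterministically."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_record(log, identity, query, parameters, masked_rows):
    """Append a session record that commits to the previous record's hash."""
    body = {
        "identity": identity,
        "query": query,
        "parameters": parameters,
        "masked_rows": masked_rows,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    record = dict(body, hash=_digest(body))
    log.append(record)
    return record


def verify(log):
    """Return True if no record was altered, removed, or reordered."""
    prev_hash = GENESIS
    for record in log:
        body = {key: value for key, value in record.items() if key != "hash"}
        if body["prev_hash"] != prev_hash or _digest(body) != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True


log = []
append_record(log, "agent@example.iam.gserviceaccount.com",
              "SELECT name, ssn FROM customers WHERE id = @id",
              {"id": 42}, [{"name": "Ada", "ssn": "XXX-XX-XXXX"}])
print(verify(log))
```

Because each record stores only the masked result set, an analyst replaying the log sees what the agent saw without touching production data, and editing any earlier record invalidates every hash after it.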
