Putting access controls around ChatGPT: audit trails for AI coding agents (on BigQuery)

With comprehensive audit trails, every query a ChatGPT coding agent sends to BigQuery is recorded, routed for approval when policy requires it, and returned with its results masked as needed.

Why audit trails matter for AI coding agents

ChatGPT can be integrated into development pipelines to generate SQL, suggest schema changes, or even run exploratory queries on production data. When the model is granted a service‑account credential that talks directly to BigQuery, the organization loses visibility: security teams cannot see the queries in context, there is no record of which person or prompt triggered them, and sensitive columns can be returned unchecked. In practice, a single badly worded prompt can cause data leakage, cost spikes, or compliance violations, with little evidence left to investigate.

The unsanitized starting state

Most teams provision a static credential for the AI agent, store it in a secret manager, and let the agent use the standard BigQuery client libraries. The credential is long‑lived and can read, write, and export tables. Access is granted broadly, often at the project level, and the agent connects straight to the Google Cloud endpoint. There is no gate that can enforce per‑query approvals, no mechanism to mask personally identifiable information in query results, and no replayable log of the session. The only audit record is the generic Cloud audit log, which shows that a query ran under the service account but does not capture the prompt that generated it.

What the identity layer can do – and what it cannot

Replacing the static credential with a service account that authenticates through OIDC is a necessary first step. The service account can be scoped to a specific dataset, and the token can be short‑lived, reducing the blast radius of a stolen secret. However, the request still travels directly from the agent to BigQuery. The identity system establishes *who* is making the request, but it does not inspect the actual SQL, cannot require a human to approve a risky operation, and cannot guarantee that a record of the interaction is kept outside the agent’s process.

hoop.dev as the data‑path enforcement point

hoop.dev is a Layer 7 gateway that sits between the AI agent and BigQuery. The agent authenticates to hoop.dev with its OIDC token; hoop.dev validates the token, extracts group membership, and then proxies the request to BigQuery using its own credential. Because every request passes through hoop.dev, the gateway can enforce the missing controls:

  • Audit trails: hoop.dev records each query, the identity that issued it, and the full response. The log is stored outside the agent, providing a reliable audit record for investigations.
  • Just‑in‑time approval: if a query matches a risky pattern, such as a SELECT * on a PII‑rich table, hoop.dev can pause the request and route it to an approver before execution.
  • Inline masking: response rows can have sensitive columns redacted or tokenised in real time, ensuring that downstream logs never contain raw personal data.
  • Session recording and replay: the entire interaction, including prompts and results, is captured for later replay, which is essential for post‑mortem analysis.

All of these outcomes exist only because hoop.dev occupies the data path. The identity layer alone cannot provide them.
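To make the approval and masking controls above concrete, here is a minimal Python sketch of the kind of check a data-path gateway could run. It is not hoop.dev's implementation or policy syntax: the table name, the column names, and the functions `needs_approval` and `mask_rows` are all hypothetical, chosen only to illustrate the idea.

```python
import re

# Hypothetical policy data; table and column names are illustrative only.
PII_TABLES = {"analytics.customers"}
MASKED_COLUMNS = {"email", "ssn"}

# Matches "SELECT *" but not "SELECT COUNT(*)".
SELECT_STAR = re.compile(r"\bselect\s+\*", re.IGNORECASE)


def needs_approval(sql: str) -> bool:
    """Return True when the query is a SELECT * that names a PII table."""
    lowered = sql.lower()
    touches_pii = any(table in lowered for table in PII_TABLES)
    return touches_pii and bool(SELECT_STAR.search(sql))


def mask_rows(rows):
    """Return copies of the rows with sensitive column values redacted."""
    return [
        {col: ("***" if col in MASKED_COLUMNS else val) for col, val in row.items()}
        for row in rows
    ]
```

A real gateway would parse the SQL rather than pattern-match it, since a regular expression is easy to evade with aliases or subqueries. The point of the sketch is only that these decisions need access to the query text and the result rows, which is exactly what sitting in the data path provides.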

Architectural sketch

1. Deploy hoop.dev’s gateway in the same VPC or network segment where BigQuery is reachable. The quick‑start guide walks through a Docker‑Compose deployment that includes OIDC configuration.

2. Register BigQuery as a connection in hoop.dev, supplying the service‑account credential that the gateway will use. The agent never sees this secret.

3. Define policies that tag datasets containing sensitive data. Policies can require approval for any query that accesses those tags, and can specify which fields to mask.

4. When the ChatGPT‑driven application issues a query, it connects to hoop.dev using the standard BigQuery client libraries. hoop.dev validates the token, checks the policy, forwards the request to BigQuery, applies any masking to the results, and records the session.
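The request flow in step 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not hoop.dev's code: the class `GatewaySketch`, its callables, and the status strings are hypothetical, and the BigQuery call is injected as a plain function so the example runs without any cloud dependency.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class GatewaySketch:
    """Hypothetical gateway: validate, check policy, forward, mask, record."""

    validate_token: Callable[[str], Optional[str]]  # returns identity or None
    requires_approval: Callable[[str], bool]        # policy check on the SQL text
    run_query: Callable[[str], List[Dict]]          # stands in for the BigQuery call
    masked_columns: Set[str] = field(default_factory=set)
    audit_log: List[str] = field(default_factory=list)

    def handle(self, token: str, sql: str, approved: bool = False) -> Dict:
        identity = self.validate_token(token)
        if identity is None:
            return self._record(None, sql, "rejected", [])
        if self.requires_approval(sql) and not approved:
            # The query is held; nothing is sent to the warehouse yet.
            return self._record(identity, sql, "pending_approval", [])
        rows = self.run_query(sql)
        masked = [
            {c: ("***" if c in self.masked_columns else v) for c, v in row.items()}
            for row in rows
        ]
        return self._record(identity, sql, "executed", masked)

    def _record(self, identity, sql, status, rows) -> Dict:
        # Only masked rows reach the log, so it never holds raw values.
        entry = {"ts": time.time(), "identity": identity, "sql": sql,
                 "status": status, "rows": rows}
        self.audit_log.append(json.dumps(entry))
        return {"status": status, "rows": rows}
```

Every branch, including rejections and held queries, writes an audit entry before returning, so the log reflects attempts as well as executions. In this sketch the log is an in-memory list; a real deployment would write to storage that the agent cannot reach.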

Getting started

For a step‑by‑step walkthrough, see the getting‑started guide. The documentation also covers policy definition, masking rules, and approval workflow configuration in the learn section. The full source code and deployment manifests are available on GitHub.

FAQ

  • Do I need to change my existing BigQuery client code? No. The application keeps using the standard BigQuery client libraries; it connects through hoop.dev, which proxies requests to BigQuery, so query code stays the same.
  • Can I still use existing service‑account keys? Yes, but they should be stored only in hoop.dev’s configuration. The AI agent never accesses them directly.
  • How are audit logs retained? hoop.dev writes logs to a configurable backend that is independent of the agent’s runtime, ensuring that logs survive even if the agent is compromised.

Implementing audit trails for AI coding agents does not require a patch to the model itself; it requires a control point where every request can be inspected. hoop.dev provides that control point, turning opaque AI‑driven queries into fully auditable, approvable, and masked operations.

Explore the open‑source repository on GitHub to start building your own secure AI‑to‑BigQuery pipeline.

Open source

Save the open-source gateway for agent data access

Hoop is MIT-licensed infrastructure for controlling how AI agents reach production data. Star hoophq/hoop so you can inspect it, deploy it, or share it when your team starts governing agent access.
