Putting access controls around ChatGPT: guardrails for AI coding agents (on BigQuery)

When an AI coding assistant can run unrestricted queries against your data warehouse, a single malformed prompt can expose sensitive customer records, trigger costly scans, or even violate compliance. Without guardrails, the risk of accidental data leakage or expensive workloads grows unchecked.

Today many organizations let ChatGPT‑style agents talk directly to BigQuery by embedding a service‑account key in the prompt or by granting the model a broad IAM role. The agent receives the credential, opens a connection, and executes whatever SQL it generates. There is no central point that can verify the intent of each query, no way to hide columns that contain personally identifiable information (PII), and no immutable log of who asked what. The result is a blind spot: engineers and auditors cannot tell whether a data‑driven insight came from a human analyst or an autonomous LLM, and any accidental data dump is difficult to trace.

What guardrails need to cover for AI coding agents

Effective guardrails must address three layers of risk:

  • Identity and least‑privilege. The agent should act under a non‑human identity that only has the permissions required for the specific workload.
  • Intent verification. Before a query reaches BigQuery, the system should be able to approve or reject it based on policy – for example, blocking DDL statements or queries that touch PII columns.
  • Audit and replay. Every request and response should be recorded in a secure log that can be reviewed to reconstruct the interaction.

These controls are impossible to enforce when the LLM talks directly to the data service. The connection path bypasses any policy engine, and the credential lives in the agent’s memory, making it easy to leak.

Introducing hoop.dev as the data‑path enforcement point

hoop.dev provides a Layer 7 gateway that sits between the AI agent and BigQuery. The gateway becomes the only place where traffic can be inspected, altered, or blocked. Because hoop.dev proxies the wire‑protocol, it can apply guardrails without requiring any changes to the client libraries that the LLM uses.

Setup. You configure an OIDC or SAML identity provider (Okta, Azure AD, Google Workspace, etc.) so that each request carries a token representing a non‑human service account. hoop.dev validates the token, extracts group membership, and maps it to a least‑privilege role that only permits SELECT on the specific dataset required for the task.
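
As a rough illustration of the group-to-role mapping described above, the Python sketch below resolves a role from token claims that are assumed to have been verified already. The `groups` claim name, the `ai-agents-analytics` group, the `Role` type, and `resolve_role` are hypothetical names chosen for this example; they are not hoop.dev's configuration format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """A least-privilege role: one dataset, a fixed set of statement types."""
    name: str
    dataset: str
    allowed_statements: frozenset


# Hypothetical mapping from IdP group names to roles (illustration only).
GROUP_ROLES = {
    "ai-agents-analytics": Role(
        name="analytics-readonly",
        dataset="analytics",
        allowed_statements=frozenset({"SELECT"}),
    ),
}


def resolve_role(claims: dict) -> Role:
    """Return the first role mapped to one of the token's groups.

    `claims` is assumed to come from a token whose signature, issuer,
    audience, and expiry were already checked; this function does not
    perform any of that validation.
    """
    for group in claims.get("groups", []):
        role = GROUP_ROLES.get(group)
        if role is not None:
            return role
    # Deny by default: an identity with no mapped group gets no access.
    raise PermissionError(f"no role mapped for subject {claims.get('sub')!r}")
```

The design choice worth noting is the deny-by-default branch: a token that carries no recognized group raises an error instead of falling back to a broader role.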

The data path. The AI agent connects to hoop.dev using the standard BigQuery client. hoop.dev then opens a connection to the real BigQuery endpoint on behalf of the agent. All SQL statements flow through hoop.dev first, giving it a single choke point for policy enforcement.

Enforcement outcomes. hoop.dev evaluates each statement against the configured guardrails. It can:

  • Block dangerous commands such as DROP TABLE or TRUNCATE TABLE.
  • Require human approval for queries that reference columns tagged as sensitive.
  • Mask PII fields in query results before they are returned to the LLM.
  • Record the full request and response, along with the identity that originated it, for later replay and audit.
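
The block and approval decisions above can be sketched in a few lines of Python. This is a deliberately crude, token-based check under stated assumptions, not hoop.dev's policy engine: the keyword sets, the `SENSITIVE_COLUMNS` names, and the `evaluate` function are all hypothetical, and a production gateway would need a real SQL parser.

```python
import re

# Hypothetical policy data for illustration.
READ_ONLY_STARTS = {"SELECT", "WITH"}
WRITE_KEYWORDS = {
    "DROP", "CREATE", "ALTER", "TRUNCATE",
    "INSERT", "UPDATE", "DELETE", "MERGE",
}
SENSITIVE_COLUMNS = {"email", "ssn", "phone"}


def evaluate(sql: str) -> str:
    """Classify a statement as 'block', 'needs_approval', or 'allow'.

    The check looks only at identifier-like tokens, so it errs on the
    side of blocking (for example, a string literal containing the word
    "delete" is rejected too).
    """
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", sql)
    # Only statements that start like a read query may proceed.
    if not tokens or tokens[0].upper() not in READ_ONLY_STARTS:
        return "block"
    # Reject anything that mentions a schema- or data-modifying keyword.
    if any(token.upper() in WRITE_KEYWORDS for token in tokens):
        return "block"
    # Route queries that name a sensitive column to a human approver.
    if any(token.lower() in SENSITIVE_COLUMNS for token in tokens):
        return "needs_approval"
    return "allow"
```

One limitation of this sketch is that `SELECT *` names no columns, so it would pass the sensitive-column check; expanding wildcards requires schema knowledge that a token scan does not have.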

Because the gateway is the only place that sees the credential, the AI agent never handles the raw service‑account key. The credential stays inside the hoop.dev agent that runs inside your network, reducing the blast radius of a compromise.

How to roll out guardrails for ChatGPT on BigQuery

1. Deploy the hoop.dev gateway. Use the provided Docker Compose quick‑start or the Kubernetes manifest to run the gateway inside your network, where it can reach the BigQuery endpoint. The deployment includes built‑in OIDC verification and the masking engine.

2. Register the BigQuery connection. In the hoop.dev UI, add a new connection pointing at your BigQuery project. Supply a service‑account credential that has only the minimal SELECT permissions needed for the AI workload.

3. Define guardrail policies. Create rules that:

  • Reject any DDL or DML that modifies schema or data.
  • Require approval for queries that reference columns marked as PII.
  • Mask those columns in the response stream.

4. Bind the AI agent to a non‑human identity. Issue an OIDC client for the LLM service, assign it to a group that maps to the least‑privilege role you created, and configure the LLM to obtain a token from your IdP before calling the gateway.

5. Monitor and audit. Use the hoop.dev console or export the session logs to your SIEM. The logs contain the full SQL text, the masked results, the identity that originated each request, and any approval events.
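
To make the masking and audit steps concrete, here is a minimal Python sketch of what a masked result and a SIEM-ready log line could look like. The field names, the `***` mask value, and the `mask_rows` and `audit_record` helpers are assumptions made for this example; the actual hoop.dev log schema may differ.

```python
import json
from datetime import datetime, timezone

MASK = "***"  # Hypothetical placeholder written over sensitive values.


def mask_rows(rows, sensitive_columns):
    """Return copies of the rows with sensitive column values replaced."""
    return [
        {key: (MASK if key in sensitive_columns else value)
         for key, value in row.items()}
        for row in rows
    ]


def audit_record(identity, sql, masked_rows, approved_by=None):
    """Serialize one request/response pair as a single JSON log line."""
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "identity": identity,          # who originated the query
            "sql": sql,                    # full statement text
            "masked_results": masked_rows, # only masked data is logged
            "approved_by": approved_by,    # None when no approval was needed
        },
        sort_keys=True,
    )


rows = [{"order_id": 7, "email": "ana@example.com"}]
masked = mask_rows(rows, {"email"})
line = audit_record("svc-llm-agent", "SELECT order_id, email FROM sales.orders", masked)
```

Masking before logging means the raw PII value never reaches the log line, so exporting the record to a SIEM does not widen the set of systems that hold sensitive data.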

With these steps, every query generated by ChatGPT passes through a controlled, observable path before it ever touches BigQuery. The guardrails enforce least‑privilege, prevent accidental data exposure, and give you a complete audit trail.

FAQ

  • Do I need to change my existing BigQuery client code? No, apart from the endpoint setting. The client points at the hoop.dev endpoint instead of the native BigQuery URL, and the protocol remains identical.
  • Can I apply different guardrails per dataset? Yes. hoop.dev lets you scope policies to specific connections or even individual tables within a dataset.
  • What happens if an LLM tries to bypass the gateway? Because the gateway holds the credential, the agent has nothing to authenticate with if it calls BigQuery directly. The network policy you place around the gateway rejects any connection attempt that lacks a valid token.

Ready to start protecting AI‑driven analytics? Follow the getting‑started guide for a step‑by‑step deployment, then explore the full feature set on the learn page. The source code and contribution guidelines are available on GitHub.

Open source

Save the open-source gateway for agent data access

Hoop is MIT-licensed infrastructure for controlling how AI agents reach production data. Star hoophq/hoop so you can inspect it, deploy it, or share it when your team starts governing agent access.