Putting access controls around ChatGPT: session recording for AI coding agents (on BigQuery)

Can you prove that every prompt sent to ChatGPT, and every query it generates against BigQuery, is captured for audit by reliable session recording?

When engineering teams let large language models write SQL on the fly, the convenience is undeniable. A data analyst can describe a metric in plain English, the model translates it to a query, and the result appears in seconds. The downside is that the exchange lives only in the chat session and the client’s console. If a query leaks PII, violates compliance, or simply runs an expensive scan, there is no independent record of who asked for it, what the model produced, or how the database responded.

Most organizations expose BigQuery to an AI‑coding agent through a static service account or a shared credential. The agent authenticates once, then sends queries to the data warehouse and reads the results back directly. That setup satisfies the immediate need to run queries, but it leaves three gaps:

  • Without a gateway, there is no immutable log of the AI‑driven session, so post‑mortem investigations are guesswork.
  • Sensitive columns returned by a query flow into the model’s context and output without any redaction.
  • Any misuse, whether accidental or malicious, cannot be blocked in real time because the request bypasses a policy enforcement point.

In other words, the setup (identity provider, service account, and token) decides which AI agent may start a connection, but it does not enforce any guardrails. The request still reaches BigQuery directly, with no audit trail, no inline masking, and no chance to intervene.

Why session recording matters for AI coding agents

Session recording is more than a log file. It captures the full request‑response exchange at the protocol level, preserving timestamps, user identity, and the exact bytes transmitted. For AI‑generated queries this provides:

  • Evidence for auditors that every data‑access request originated from an authorized model run.
  • A replayable artifact that can be re‑executed in a sandbox to verify that the query behaved as expected.
  • A forensic trail that shows whether a model inadvertently exposed confidential fields.

Because the model’s output is not static code checked into version control, the only reliable source of truth is the recorded session itself.
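To make the idea of a recorded session more concrete, the following is a minimal Python sketch of an append-only log. Each entry keeps the timestamp, user identity, and exact bytes, and is chained to the previous entry with a SHA-256 hash so that later tampering can be detected. The names `SessionLog`, `append`, and `verify` are hypothetical and chosen for illustration; this is not how any particular gateway stores its recordings.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field


def _digest(body: dict) -> str:
    # Hash a canonical JSON form so the same entry always hashes the same way.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class SessionLog:
    """Append-only, hash-chained log (illustrative only, not a product API)."""
    entries: list = field(default_factory=list)

    def append(self, user: str, direction: str, payload: bytes) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        entry = {
            "ts": time.time(),             # when the message was seen
            "user": user,                  # identity taken from the session
            "direction": direction,        # "request" or "response"
            "payload_hex": payload.hex(),  # the exact bytes transmitted
            "prev_hash": prev_hash,        # link to the previous entry
        }
        entry["hash"] = _digest(entry)     # computed before "hash" is added
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        # Recompute every hash and check each link in the chain.
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

A hash chain only makes edits detectable; keeping the log on write-once storage is a separate concern that this sketch does not address.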

Architectural pattern for recording AI‑driven BigQuery traffic

The recommended pattern inserts a Layer 7 gateway between the AI agent and BigQuery. The gateway terminates the client’s connection, inspects each wire‑protocol message, and then forwards it to the database using its own credential. This placement satisfies three requirements:

  1. Just‑in‑time credential use. The gateway holds the service‑account key; the AI agent never sees it.
  2. Policy enforcement at the data path. Masking, command‑level approval, and blocking happen where the traffic is inspected.
  3. Immutable session capture. The gateway records every request before it reaches BigQuery and every response before it returns to the client.
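The first requirement can be illustrated with a small Python sketch. It assumes a hypothetical `Gateway` class that keeps the service-account key private and a hypothetical `upstream` callable standing in for the real database client. The agent's request carries only an identity token and the SQL text; token validation is deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class AgentRequest:
    """What the AI agent sends: proof of identity plus SQL, no database secret."""
    identity_token: str
    sql: str


class Gateway:
    """Illustrative gateway that owns the database credential."""

    def __init__(self, service_account_key: str,
                 upstream: Callable[[str, str], List[dict]]):
        self._key = service_account_key  # stays inside the gateway
        self._upstream = upstream        # hypothetical: (credential, sql) -> rows

    def handle(self, request: AgentRequest) -> List[dict]:
        # The gateway attaches its own credential when forwarding;
        # only the rows travel back to the caller.
        return self._upstream(self._key, request.sql)
```

The point of the design is that rotating or revoking the key touches only the gateway, never the agent's configuration.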

In practice the AI‑coding workflow looks like this:

  1. The analyst writes a natural‑language request to ChatGPT.
  2. The model returns a SQL string.
  3. The application sends the SQL to the gateway instead of directly to BigQuery.
  4. The gateway checks for required approvals, streams the query to BigQuery, and applies any configured masking rules to the result.
  5. Both the inbound request and the outbound result are stored as a replayable session.

This approach isolates the database from the raw model output while preserving the convenience of AI‑assisted analytics.
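Steps 3 to 5 of the workflow can be sketched as a single Python function. Everything here is an assumption made for illustration: the function name `handle_query`, the callables `execute`, `mask`, and `is_approved`, and the two regular expressions that classify statements by their first keyword. Real SQL classification needs a parser, since a prefix match is easily defeated by comments, CTEs, or multi-statement scripts.

```python
import re
from typing import Callable, List

# Naive first-keyword classification, for illustration only.
BLOCKED = re.compile(r"^\s*(DROP|TRUNCATE)\b", re.IGNORECASE)
NEEDS_APPROVAL = re.compile(r"^\s*(DELETE|UPDATE|INSERT|MERGE)\b", re.IGNORECASE)


def handle_query(user: str, sql: str,
                 execute: Callable[[str], List[dict]],
                 mask: Callable[[dict], dict],
                 is_approved: Callable[[str, str], bool],
                 recording: list) -> List[dict]:
    # Record the inbound request before anything else happens.
    recording.append({"user": user, "event": "request", "sql": sql})

    if BLOCKED.match(sql):
        recording.append({"user": user, "event": "blocked"})
        raise PermissionError("statement type is not allowed")

    if NEEDS_APPROVAL.match(sql) and not is_approved(user, sql):
        recording.append({"user": user, "event": "pending_approval"})
        raise PermissionError("statement requires approval")

    # Forward to the database, then mask each row before it leaves the gateway.
    rows = [mask(row) for row in execute(sql)]
    recording.append({"user": user, "event": "response", "rows": rows})
    return rows
```

This sketch stores the masked rows in the recording; whether a real deployment records raw or masked results is a policy decision that the text above does not settle.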

How hoop.dev provides the data‑path enforcement

hoop.dev is an open‑source Layer 7 gateway that fits exactly into the pattern described above. It runs a network‑resident agent near BigQuery and proxies the connection. When an AI‑coding agent connects, hoop.dev validates the OIDC token, extracts the user’s group membership, and then decides whether the request may proceed.
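The group-based decision can be sketched in Python as follows. This is not hoop.dev's implementation. It assumes the token's signature, issuer, and audience have already been verified elsewhere, that the identity provider puts group names in a `groups` claim (a common convention, not part of the OIDC core claims), and that a hypothetical `CONNECTION_GROUPS` table maps each connection to the groups allowed to use it.

```python
import time
from typing import Optional

# Hypothetical policy table: connection name -> groups allowed to use it.
CONNECTION_GROUPS = {
    "bigquery-analytics": {"data-analysts", "sre"},
}


def may_proceed(claims: dict, connection: str,
                now: Optional[float] = None) -> bool:
    """Decide from already-verified token claims whether a request may proceed."""
    now = time.time() if now is None else now

    # Reject expired tokens; "exp" is seconds since the Unix epoch.
    if claims.get("exp", 0) <= now:
        return False

    # Allow only if the user belongs to at least one permitted group.
    allowed = CONNECTION_GROUPS.get(connection, set())
    return bool(allowed & set(claims.get("groups", [])))
```

Unknown connections map to an empty group set, so the function denies by default.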

From there, hoop.dev becomes the sole enforcement point. It:

  • Records each session. Every prompt, generated SQL, and database response is captured and stored for later replay.
  • Applies inline masking. Configurable field‑level rules can redact PII before the result is returned to the model.
  • Supports just‑in‑time approval. High‑risk queries can be routed to a human reviewer before they reach BigQuery.
  • Blocks disallowed commands. Dangerous statements such as DROP TABLE can be halted automatically.

Because these controls run in the data path, none of them can be achieved by the setup alone. If hoop.dev were removed, the AI agent would again have unfettered access to BigQuery and no session would be recorded.
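As an illustration of field-level masking, the Python sketch below applies one rule per column name to each result row before it is returned. The rule table `MASK_RULES`, the column names, and the placeholder strings are hypothetical; they do not reflect hoop.dev's actual rule syntax, which is described in its feature documentation.

```python
import re

# Simple pattern for illustration; it is not a full email validator.
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

# Hypothetical rules: column name -> function that redacts a string value.
MASK_RULES = {
    "email": lambda value: EMAIL.sub("[REDACTED_EMAIL]", value),
    "ssn": lambda value: "***-**-" + value[-4:],
}


def mask_row(row: dict) -> dict:
    """Return a copy of the row with configured columns redacted."""
    masked = {}
    for column, value in row.items():
        rule = MASK_RULES.get(column)
        # Only string values are masked here; other types pass through unchanged.
        masked[column] = rule(value) if rule and isinstance(value, str) else value
    return masked
```

Masking by column name misses PII that appears under an alias or inside free-text fields, which is one reason content-based detection is often layered on top.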

Getting started is straightforward. The Getting started guide walks you through deploying the gateway with Docker Compose, configuring OIDC, and registering a BigQuery connection. The feature documentation explains how to define masking rules and approval workflows specific to AI‑generated queries.

Next steps

1. Deploy hoop.dev in a subnet that can reach your BigQuery endpoint.

2. Register a BigQuery connection and enable session recording in the gateway’s policy settings.

3. Update your AI‑coding integration to point at the gateway’s endpoint instead of directly at BigQuery.

4. Review the recorded sessions in the UI or export them for audit purposes.
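For the export in step 4, a small script can filter recorded sessions and write them as JSON Lines for an auditor. The Python sketch below assumes each session is already available as a dictionary with `user` and `started_at` (ISO 8601 with a UTC offset) keys. That record shape and the function name `export_sessions` are assumptions for illustration, not the gateway's actual export format.

```python
import json
from datetime import datetime
from typing import IO, Iterable, Optional


def export_sessions(sessions: Iterable[dict], out: IO[str],
                    user: Optional[str] = None,
                    since: Optional[datetime] = None) -> int:
    """Write matching sessions as JSON Lines and return how many were written."""
    count = 0
    for session in sessions:
        if user is not None and session["user"] != user:
            continue
        # started_at must carry an offset so it compares with an aware datetime.
        if since is not None and datetime.fromisoformat(session["started_at"]) < since:
            continue
        out.write(json.dumps(session, sort_keys=True) + "\n")
        count += 1
    return count
```

One JSON object per line keeps the export easy to stream into other tooling without loading the whole file.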

With this architecture you retain the productivity boost of ChatGPT while gaining the visibility and control required for compliance and security.

FAQ

Is session recording optional?
Yes, you can enable or disable it per connection, but a connection with recording turned off no longer produces an audit trail.

Does hoop.dev store raw credentials?
Not anywhere the agent can reach. The gateway holds the service‑account key internally; the AI agent never receives it.

Can I use hoop.dev with other LLMs?
Yes. The gateway works on the SQL traffic sent to the database, not on the model that produced it, so any AI service that produces SQL can be routed through it.
