SOC 2 for AI agents: controlling access for audit-ready operations (on BigQuery)

When an AI agent runs queries against a data warehouse without clear oversight, a single mistyped statement can expose millions of rows, trigger regulatory fines, and erode customer trust. The financial impact of a data‑leakage incident often dwarfs the cost of implementing proper controls, especially for organizations that must pass SOC 2 audits.

In many teams, the quickest way to get an agent working is to create a service‑account key that has broad bigquery.* permissions, store the JSON file in a secret manager, and hand the key to the model at runtime. The agent then connects directly to BigQuery, runs any query it deems useful, and writes results to storage. There is no per‑query audit, no way to mask personally identifiable information, and no approval step before a potentially destructive operation runs.

This approach satisfies the immediate need for speed, but it leaves three critical gaps: the request reaches BigQuery directly with nothing in between to enforce policy, there is no immutable record of what the agent asked for, and there is no mechanism to hide or redact sensitive fields before they leave the warehouse. For SOC 2, those gaps translate into missing evidence for the Security and Confidentiality criteria.

SOC 2 evidence requirements for data‑warehouse access

SOC 2 auditors look for continuous, verifiable proof that only authorized identities performed privileged actions, that those actions were logged in an immutable fashion, and that any exposure of sensitive data was mitigated. In practice, this means collecting:

Authentication and authorization metadata for each request.
Fine‑grained audit records that capture the exact query text, timestamps, and the identity that initiated it.
Evidence of data‑masking or redaction applied to protected columns.
Approval trails for queries that exceed predefined risk thresholds.
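
The four evidence items above can be pictured as fields of a single per‑query record. The following is a minimal sketch in Python, not hoop.dev's actual log format; the class name, field names, and helper functions are all hypothetical and chosen only to mirror the list.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AuditRecord:
    """One evidence entry per query. All field names are illustrative."""
    identity: str                       # who initiated the request (e.g. an OIDC subject)
    roles: List[str]                    # authorization metadata at request time
    query_text: str                     # exact SQL as submitted
    timestamp: str                      # UTC, ISO 8601
    masked_columns: List[str] = field(default_factory=list)
    approved_by: Optional[str] = None   # set only when an approval was required


def new_record(identity, roles, query_text, masked_columns=(), approved_by=None):
    """Build a record stamped with the current UTC time."""
    return AuditRecord(
        identity=identity,
        roles=list(roles),
        query_text=query_text,
        timestamp=datetime.now(timezone.utc).isoformat(),
        masked_columns=list(masked_columns),
        approved_by=approved_by,
    )


def to_json_line(record: AuditRecord) -> str:
    """Serialize with sorted keys so entries are stable and easy to diff."""
    return json.dumps(asdict(record), sort_keys=True)
```

Keeping all four kinds of evidence in one record means an auditor can answer "who ran what, when, with which protections" from a single entry instead of joining logs from several systems.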

When the control surface is spread across multiple tools (identity providers, secret stores, and the database itself), building a single, auditable chain becomes fragile. A missing log entry or a misconfigured mask can cause an audit failure.

Why a gateway is the only reliable enforcement point

The setup phase determines who the AI agent is: an OIDC‑issued service principal that belongs to a specific team, with a role that limits its scope to a single BigQuery project. This identity information is essential, but on its own it cannot enforce runtime policies. The enforcement must happen where the traffic flows, not in a peripheral system.

hoop.dev acts as a Layer 7 gateway that sits between the AI agent and BigQuery. All query traffic is proxied through the gateway, which inspects the wire protocol, applies policy, and then forwards the request to the warehouse. Because the gateway is the only path the data takes, it is the sole point where masking, approval, and recording can be guaranteed.

Enforcement outcomes delivered by the gateway

hoop.dev records each query session, preserving the full request and response for replay during an audit.
hoop.dev masks configured sensitive fields in real time, ensuring that downstream systems never see raw PII.
hoop.dev routes high‑risk queries to a human approver before they are executed, providing a documented approval trail.
hoop.dev blocks commands that match a deny list, preventing accidental data deletion.

Each of these outcomes exists only because hoop.dev sits in the data path; removing the gateway would instantly eliminate the audit log, the masking, and the approval requirement.
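
To make the block and approve outcomes concrete, here is a minimal Python sketch of the kind of decision an in‑path gateway has to make for each statement. It is not hoop.dev's rule engine: the `Decision` enum, both pattern lists, and the `evaluate` function are hypothetical, and matching SQL with regular expressions is a simplification that a real gateway would replace with proper parsing.

```python
import re
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    BLOCK = "block"


# Hypothetical deny list: statements the agent may never run.
DENY_PATTERNS = [
    re.compile(r"^\s*DROP\s+(TABLE|SCHEMA|VIEW)\b", re.IGNORECASE),
    re.compile(r"^\s*TRUNCATE\s+TABLE\b", re.IGNORECASE),
]

# Hypothetical high-risk list: data-modifying statements need a human approver.
APPROVAL_PATTERNS = [
    re.compile(r"^\s*(DELETE|UPDATE|MERGE|INSERT)\b", re.IGNORECASE),
]


def evaluate(query: str) -> Decision:
    """Classify a statement before it is forwarded to the warehouse."""
    # Deny rules are checked first so a blocked statement can never be approved.
    if any(p.search(query) for p in DENY_PATTERNS):
        return Decision.BLOCK
    if any(p.search(query) for p in APPROVAL_PATTERNS):
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW
```

The ordering is the design point: the deny list is evaluated before the approval list, so approval can only ever widen access within what policy already permits.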

How the continuous evidence stream satisfies SOC 2

Because hoop.dev captures every request and response, organizations receive a ready‑made log that maps directly to the SOC 2 Security criteria. Each log entry contains the identity of the AI agent, the exact query text, and a timestamp, which provides the traceability auditors ask for. Masking records indicate which columns were redacted, giving auditors proof of confidentiality controls. Approval entries show that high‑risk queries were reviewed by a human before execution, which documents that access is controlled according to risk.
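
A masking record is only useful as evidence if it names the columns that were actually redacted. The sketch below shows one way to produce both the masked result and that list in a single pass. It assumes rows arrive as plain dictionaries; the `mask_rows` function and the `MASK` placeholder are hypothetical and do not describe hoop.dev's masking implementation.

```python
MASK = "***"


def mask_rows(rows, protected):
    """Return (masked_rows, redacted_columns) without mutating the input.

    rows:      iterable of dicts, one per result row
    protected: set of column names configured as sensitive
    """
    masked_rows = []
    redacted = set()
    for row in rows:
        out = {}
        for col, val in row.items():
            if col in protected and val is not None:
                out[col] = MASK
                redacted.add(col)   # remember the column for the audit record
            else:
                out[col] = val      # non-sensitive values and NULLs pass through
        masked_rows.append(out)
    return masked_rows, sorted(redacted)
```

The sorted column list is what would be written to the audit record, so the evidence reflects what was redacted in this response rather than what was merely configured.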

All of this evidence is stored outside BigQuery and managed by the gateway. Because of that separation, a compromise of the warehouse does not by itself alter the audit trail, a key consideration for the Availability and Processing Integrity criteria of SOC 2.
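
Storing logs separately protects them from a warehouse compromise, and auditors may also ask how tampering inside the log store itself would be noticed. One generic technique is a hash chain, sketched below in Python. This is an illustration of the general idea only; nothing here describes how hoop.dev stores its logs, and the function names and the `GENESIS` constant are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, entry: dict) -> str:
    """Hash the entry together with the hash of the entry before it."""
    payload = prev_hash + json.dumps(entry, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(chain: list, entry: dict) -> None:
    """Append an entry whose hash depends on everything logged so far."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"entry": entry, "hash": _digest(prev_hash, entry)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited or removed entry breaks the chain."""
    prev_hash = GENESIS
    for link in chain:
        if link["hash"] != _digest(prev_hash, link["entry"]):
            return False
        prev_hash = link["hash"]
    return True
```

Because each hash covers the previous one, changing an old entry invalidates that link and every link after it, which lets `verify` detect the change.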

Getting started with hoop.dev for AI‑driven BigQuery access

Begin by defining a service principal in your identity provider and granting it the minimal BigQuery role required for the agent’s workload. Deploy the hoop.dev gateway using the Docker Compose quick‑start, which automatically configures OIDC verification and enables masking and session recording out of the box. Register the BigQuery connection in the gateway, specify the columns that must be masked, and set risk thresholds that trigger approval workflows.

For detailed step‑by‑step guidance, see the getting‑started documentation and the broader feature guide at hoop.dev Learn. The open‑source repository contains the full codebase and example configurations.

FAQ

Does hoop.dev replace the need for IAM policies on BigQuery?

No. IAM policies still define the baseline permissions for the service principal. hoop.dev adds runtime enforcement (recording, masking, and approval) on top of those static permissions.

Can I use hoop.dev with other AI models besides the built‑in MCP server?

Yes. Any AI workload that can speak the BigQuery protocol through a standard client can route its traffic through the gateway, regardless of the model or framework.

How long are the session logs retained?

Retention is configurable in the gateway settings. Choose a period that aligns with your organization’s data‑retention policy and SOC 2 audit schedule.

Explore the open‑source implementation and contribute to the project on GitHub.