An engineering team recently added an AI coding assistant to their CI pipeline. The assistant writes Snowflake queries on behalf of developers, pulling column names from schema introspection and inserting sample data for test runs. Because the pipeline runs under a shared service account, the assistant inherits full read‑write privileges on the data warehouse. When a developer pushes a change, the AI automatically executes the generated query and stores the result in a temporary table. The next job in the pipeline extracts that table and ships it as a CSV to an external storage bucket that is not covered by the same retention policy. Weeks later, the organization discovers that a large portion of its customer PII has been copied to an uncontrolled location. This scenario illustrates a classic data exfiltration risk: automated code silently moving sensitive records out of Snowflake.
Data exfiltration via AI coding agents
AI coding agents excel at generating code quickly, but they also inherit whatever permissions the underlying credential provides. When those credentials are overly permissive, the agent can read entire tables, export them, or even drop data. Because the agent operates programmatically, the activity blends in with normal batch jobs, making it hard for a human reviewer to spot the unusual data movement. The risk is amplified when the agent is granted access to a Snowflake account that stores regulated or personally identifiable information.
Why traditional controls fall short
Most teams rely on three layers of protection: identity federation, role‑based access control inside Snowflake, and audit logging. Identity federation ensures that only authenticated identities can request a token. Snowflake roles limit what each identity can do. Audit logs record who ran which query. However, these controls assume that each request comes from the identity holder it was issued to, and none of them examines the content of a request in flight. In practice the AI agent uses a static service account token, so no real‑time approval step ever runs. The request travels straight to Snowflake, and there is no point where it can be inspected, masked, or blocked based on its content. The audit log captures the fact that a query ran, but it does not prevent the query from running, nor does it hide sensitive columns in the response.
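The limits of audit logging can be made concrete with a minimal sketch. It assumes audit records have already been exported into Python objects; the `AuditRecord` type, its field names, and the row threshold are hypothetical and only loosely modeled on the user, query text, and row count that query logs commonly expose. Everything it reports has already happened.

```python
from dataclasses import dataclass


@dataclass
class AuditRecord:
    """Hypothetical shape of one exported audit log entry."""
    user: str
    query_text: str
    rows_produced: int


def flag_large_reads(records, row_threshold):
    """Return the records whose row count exceeds the threshold.

    This is detection after the fact: by the time a record exists,
    the query has already run and the rows have already been returned.
    """
    return [r for r in records if r.rows_produced > row_threshold]


# Illustrative data only; not taken from a real system.
records = [
    AuditRecord("ci_service_account", "SELECT id FROM orders LIMIT 10", 10),
    AuditRecord("ci_service_account", "SELECT * FROM customers", 250_000),
]

flagged = flag_large_reads(records, row_threshold=10_000)
for r in flagged:
    print(f"review needed: {r.user} read {r.rows_produced} rows")
```

A report like this can shorten the time to discovery, but it cannot stop the export or mask a column, which is the gap the next section addresses.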
Placing a gateway in the data path
To close the gap, the enforcement point must sit in the data path, between the identity holder and Snowflake. hoop.dev provides a Layer 7 gateway that proxies every Snowflake connection. By routing traffic through hoop.dev, the organization gains a single place where policy can be applied to the actual query and its result set. hoop.dev verifies the OIDC token, maps group membership to Snowflake roles, and then inspects each SQL statement before it reaches the database. If a statement attempts to export more rows than allowed, hoop.dev can pause the request for a human approver. If a result set contains columns marked as sensitive, hoop.dev masks those fields in real time. Every session is recorded for replay, giving investigators a complete picture of what the AI agent did.
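The two content-based decisions described above, pausing oversized exports for approval and masking sensitive columns, can be sketched in a few lines. This is an illustration of the idea only, not hoop.dev's API or configuration format: the `Policy` type, the `decide` and `mask_rows` functions, the placeholder string, and the assumption that an estimated row count is available before execution are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical policy: a row ceiling plus a set of sensitive column names."""
    max_rows: int
    sensitive_columns: set = field(default_factory=set)  # lowercase names


def decide(estimated_rows, policy):
    """Return "allow" if the request fits the row ceiling, else "needs_approval".

    Assumes the enforcement point can obtain a row estimate before the
    statement runs; how that estimate is produced is out of scope here.
    """
    return "allow" if estimated_rows <= policy.max_rows else "needs_approval"


def mask_rows(rows, policy, placeholder="****"):
    """Return copies of the rows with sensitive column values replaced.

    Column names are compared case-insensitively; the input is not mutated.
    """
    return [
        {
            col: (placeholder if col.lower() in policy.sensitive_columns else val)
            for col, val in row.items()
        }
        for row in rows
    ]


policy = Policy(max_rows=1_000, sensitive_columns={"email", "ssn"})

small = decide(50, policy)        # within the ceiling
large = decide(250_000, policy)   # held for a human approver

rows = [{"id": 1, "EMAIL": "a@example.com", "plan": "pro"}]
masked = mask_rows(rows, policy)
print(small, large, masked)
```

The point of the sketch is placement: both functions act on the query and its result set before anything reaches the caller, which is only possible when the enforcement point sits in the data path.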
