Putting access controls around Devin: data masking for AI coding agents (on Snowflake)

Devin, an AI‑driven coding assistant, has been granted a service‑account credential that lets it run arbitrary SQL against a Snowflake warehouse. Because the assistant can retrieve any column, the organization must rely on data masking to prevent raw PII from reaching the model. When a developer asks Devin to generate a query that joins customer orders with payment information, the assistant returns raw rows that include credit‑card numbers and Social Security numbers. Those values are then cached in logs, fed to downstream pipelines, and potentially exposed to anyone who can read the assistant’s output.

In many organizations the quickest way to give an AI agent access is to create a static Snowflake user, assign it a broad read role, and embed the password in the agent’s configuration. The approach is attractive because it avoids the friction of per‑request approvals, but it also means the agent can see every column it is technically allowed to read. There is no guarantee that sensitive fields are filtered, no audit of which rows were fetched, and no way to stop the agent from exporting the data.

Regulatory frameworks and internal privacy policies often require that personally identifiable information (PII) be hidden from non‑human processes unless an explicit business need is documented. Data masking reduces the blast radius of a leak, limits the exposure of regulated data, and helps keep downstream systems from unintentionally storing raw identifiers.

Teams sometimes try to mitigate the risk by creating a read‑only role or by limiting the agent to a single schema. Those steps are useful, but they do not address the core problem: the AI agent still receives raw column values and there is no record of what was returned. Without a point in the traffic flow where policies can be enforced, masking remains a manual, error‑prone process.

The missing piece is an identity‑aware proxy that sits in the data path between the AI agent and Snowflake. The proxy authenticates the agent using a non‑human identity, holds the Snowflake credential internally, and applies policy rules to every response before the data reaches the agent. This design satisfies two prerequisites: a least‑privilege identity for the request, and a gateway that can inspect and transform protocol traffic.

hoop.dev provides exactly that gateway. It runs as a Layer 7 proxy that terminates the Snowflake wire protocol, validates the OIDC token presented by Devin, and then forwards the request to Snowflake using its own stored credential. Because the proxy is the only point where traffic passes, hoop.dev can mask configured columns in real time, record the full query and result set for replay, and require a human approver for queries that match a high‑risk pattern. The masking, session recording, and approval workflow exist only because hoop.dev sits in the data path; without it the Snowflake connection would remain unfiltered.

To use hoop.dev with Snowflake, an administrator registers Snowflake as a connection in the gateway’s catalog, supplies the service‑account key that the gateway will use, and defines masking rules that specify which columns (for example, credit_card_number or ssn) should be redacted. When Devin issues a query, hoop.dev intercepts the response, replaces the protected fields with a placeholder value, and logs the transaction. The agent never sees the raw data, and the organization gains a searchable audit trail that shows who asked for which data and when.
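The masking step described above can be pictured with a short sketch. This is not hoop.dev's implementation: the function name `mask_rows`, the `PLACEHOLDER` value, and the row format (a list of tuples plus a list of column names) are assumptions made for illustration.

```python
from typing import Any, Iterable, List, Sequence, Tuple

# Hypothetical placeholder; a real gateway may use a different redaction format.
PLACEHOLDER = "***MASKED***"


def mask_rows(
    columns: Sequence[str],
    rows: Iterable[Sequence[Any]],
    masked_columns: Iterable[str],
    placeholder: str = PLACEHOLDER,
) -> List[Tuple[Any, ...]]:
    """Return a copy of `rows` with values in the protected columns replaced."""
    # Compare names case-insensitively: Snowflake resolves unquoted
    # identifiers to upper case, while rules are often written in lower case.
    protected = {name.lower() for name in masked_columns}
    protected_idx = {i for i, name in enumerate(columns) if name.lower() in protected}
    return [
        tuple(
            # NULLs are left as None in this sketch so that "no value" stays
            # distinguishable from "value hidden".
            placeholder if i in protected_idx and value is not None else value
            for i, value in enumerate(row)
        )
        for row in rows
    ]
```

Matching on the returned column name is the simplest possible rule. It would miss a protected column that the query aliases (for example `SELECT ssn AS x`), so a production gateway presumably needs something more robust than this sketch.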

This architecture delivers three concrete enforcement outcomes: first, hoop.dev masks sensitive fields before they reach the AI agent; second, it records every session for later replay or forensic analysis; third, it can pause a query that matches a policy and request a just‑in‑time approval from a designated reviewer. Because the gateway holds the Snowflake credential, engineers never need to embed secrets in the agent, so the credential cannot leak through the agent’s configuration, prompts, or output.

Getting started is straightforward. Follow the getting‑started guide to deploy the gateway, register Snowflake as a target, and configure masking policies through the web UI. The full source code and deployment manifests are available on GitHub. Detailed feature documentation can be explored on the learn page.

Why data masking matters for AI coding agents

AI assistants operate at speed and scale. A single query that returns thousands of rows can instantly populate a model’s context, and any exposed PII can then propagate across logs, caches, and downstream analytics pipelines. Masking ensures that only the data the task requires reaches the agent, which keeps regulated fields out of the model’s context (and out of any logs or datasets later derived from it) and reduces the chance of accidental exfiltration.

Architecting a gateway in the data path

The gateway must be the only component that can see the raw Snowflake traffic. It authenticates the request, enforces role‑based access, applies transformation rules, and forwards the sanitized payload. By placing the enforcement point at the protocol layer, the organization avoids reliance on client‑side filters, which can be bypassed or mis‑configured.
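The order of operations in that paragraph (authenticate, authorize, execute, transform, return) can be sketched as a single request handler. Everything here is hypothetical and only illustrates the control flow: `Identity`, `AccessDenied`, `handle_request`, and the `warehouse-reader` role name are invented for the example, and token verification, query execution, and masking are passed in as plain callables rather than modelled on any real gateway API.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional, Sequence, Tuple

Rows = List[Tuple]


@dataclass(frozen=True)
class Identity:
    subject: str            # who the verified token says is calling
    roles: FrozenSet[str]   # roles granted to that caller


class AccessDenied(Exception):
    """Raised when a request fails authentication or authorization."""


def handle_request(
    token: str,
    sql: str,
    verify_token: Callable[[str], Optional[Identity]],
    run_query: Callable[[str], Tuple[Sequence[str], Rows]],
    mask: Callable[[Sequence[str], Rows], Rows],
    required_role: str = "warehouse-reader",  # hypothetical role name
) -> Tuple[Sequence[str], Rows]:
    # 1. Authenticate: nothing else happens without a verified identity.
    identity = verify_token(token)
    if identity is None:
        raise AccessDenied("token could not be verified")
    # 2. Authorize: role-based check before the warehouse is touched.
    if required_role not in identity.roles:
        raise AccessDenied(f"{identity.subject} lacks role {required_role}")
    # 3. Execute: run_query is assumed to use the gateway-held credential,
    #    so the caller never sees it.
    columns, rows = run_query(sql)
    # 4. Transform: only the masked payload leaves the gateway.
    return columns, mask(columns, rows)
```

Because the raw rows exist only inside `handle_request`, a caller cannot skip the masking step the way it could skip a client-side filter.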

hoop.dev as the enforcement layer

hoop.dev is built to operate as that enforcement layer. It validates OIDC tokens, holds the Snowflake credential, and injects masking logic directly into the response stream. Because the gateway records every session, compliance teams can retrieve a replay of any query, complete with timestamps and user identity.
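To make the idea of a replayable session record concrete, here is a minimal sketch of what one audit entry might contain. The field names and the `build_session_record` helper are assumptions for illustration, not hoop.dev's log schema.

```python
import json
from datetime import datetime, timezone
from typing import Any, Optional, Sequence


def build_session_record(
    subject: str,
    sql: str,
    columns: Sequence[str],
    masked_rows: Sequence[Sequence[Any]],
    now: Optional[datetime] = None,
) -> str:
    """Serialize one query and its masked result as a single JSON line."""
    timestamp = now if now is not None else datetime.now(timezone.utc)
    record = {
        "timestamp": timestamp.isoformat(),
        "subject": subject,          # identity of the requester
        "query": sql,                # the query text as submitted
        "columns": list(columns),
        "rows": [list(row) for row in masked_rows],  # masked values only
        "row_count": len(masked_rows),
    }
    # default=str keeps the sketch tolerant of values such as Decimal or date.
    return json.dumps(record, default=str)
```

Writing one JSON object per line keeps the log append-only and easy to search. Note that the record is built from the masked rows, so the audit trail itself does not become a second copy of the raw PII.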

Putting it together for Snowflake

1. Register Snowflake as a connection in the hoop.dev UI.
2. Upload the service‑account key that the gateway will use to authenticate to Snowflake.
3. Define column‑level masking rules for any PII fields.
4. Enable session recording and, if desired, approval workflows for high‑risk queries.
5. Point Devin at the hoop.dev endpoint instead of the raw Snowflake host.

From that point forward, every query passes through hoop.dev, which masks, logs, and optionally pauses the request for approval. The AI agent never sees unmasked data, and the organization retains a complete audit trail.

Next steps

Review the getting‑started documentation to spin up a test deployment, then experiment with masking policies on a sandbox Snowflake warehouse. When you are ready, promote the configuration to production and enable just‑in‑time approvals for any query that touches regulated columns.

FAQ

Does hoop.dev store Snowflake credentials?
The gateway holds the credential in memory and never exposes it to the AI agent or end‑user. Access to the credential is limited to the gateway process.

Can I see which queries were run by Devin?
Yes. hoop.dev records each session, including the raw query, the masked result set, and the identity of the requester. The logs are searchable and can be exported for compliance reviews.

What happens if a query matches a high‑risk pattern?
hoop.dev can pause the request and route it to an approver defined in the policy. The approver can grant or deny the operation in real time, after which the gateway either forwards the query or returns an error.
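The pause-and-approve flow in the last answer can be sketched in a few lines. The patterns, the `gate` function, and the `ask_approver` callback are all hypothetical; how hoop.dev actually expresses high-risk policies is not covered in this post.

```python
import re
from typing import Callable, Iterable, Pattern

# Hypothetical high-risk patterns: queries that name regulated columns
# or select every column.
HIGH_RISK_PATTERNS = (
    re.compile(r"\bssn\b", re.IGNORECASE),
    re.compile(r"\bcredit_card_number\b", re.IGNORECASE),
    re.compile(r"\bselect\s+\*", re.IGNORECASE),
)


class QueryRejected(Exception):
    """Raised when the approver denies a high-risk query."""


def needs_approval(sql: str, patterns: Iterable[Pattern] = HIGH_RISK_PATTERNS) -> bool:
    return any(p.search(sql) for p in patterns)


def gate(
    sql: str,
    ask_approver: Callable[[str], bool],
    patterns: Iterable[Pattern] = HIGH_RISK_PATTERNS,
) -> str:
    """Return the query if it may be forwarded, otherwise raise QueryRejected."""
    # Low-risk queries pass straight through; the approver is only consulted
    # (and the request only pauses) when a pattern matches.
    if needs_approval(sql, patterns) and not ask_approver(sql):
        raise QueryRejected("query denied by approver")
    return sql
```

Regular expressions over the SQL text are a crude classifier and are used here only to keep the example short. In this sketch `ask_approver` blocks until the reviewer answers, which mirrors the "pause" behaviour described above.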

Open source

Save the open-source gateway for agent data access

Hoop is MIT-licensed infrastructure for controlling how AI agents reach production data. Star hoophq/hoop so you can inspect it, deploy it, or share it when your team starts governing agent access.