Putting access controls around Claude: data masking for AI coding agents (on Snowflake)

An engineering team recently added Claude, an LLM‑powered coding assistant, to their CI pipeline. The team gave the agent a Snowflake service account that could read every analytics schema. Within minutes the build logs started surfacing raw customer identifiers, credit‑card numbers, and health records that were never meant to leave the data warehouse. The root cause was the lack of data masking on the AI‑driven queries.

Most teams wire this up by handing the AI a static Snowflake credential with broad read permissions. They store the credential in a CI secret store, inject it into the job, and let the agent talk directly to Snowflake over the native wire protocol. No one inspects the traffic, no masking is applied, and the logs contain the exact rows Snowflake returned to the model. When the job fails, the raw data ends up in artifact storage, turning remediation into a forensic nightmare.

That raw exposure is why data masking matters for AI coding agents. Claude can generate code that prints query results, copies data into downstream services, or even embeds sensitive fields in generated documentation. Without a guardrail, the model becomes a conduit for accidental data leakage.

Why scoped credentials alone aren’t enough

Organizations can start by creating a dedicated non‑human identity for Claude and granting it only the schemas it needs. This limits the blast radius, but the request still travels straight to Snowflake. The agent receives the unfiltered response, and there is no audit trail of which rows were accessed or which columns were returned. In other words, the setup fixes credential over‑provisioning but leaves the data path unprotected.
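The least‑privilege identity described above might be sketched as a small Python helper that builds the Snowflake GRANT statements for a read‑only role. The role, database, and schema names are hypothetical placeholders, and the helper only generates SQL text; running the statements against a real account is left to whatever client you already use.

```python
import re
from typing import List

# Unquoted Snowflake identifiers: letter or underscore first, then letters,
# digits, underscores, or dollar signs. Anything else is rejected so that
# names cannot smuggle extra SQL into the generated statements.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def _check(name: str) -> str:
    if not _IDENT.match(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def least_privilege_grants(role: str, database: str, schemas: List[str]) -> List[str]:
    """Build statements that give `role` read-only access to the listed schemas only."""
    role, database = _check(role), _check(database)
    stmts = [
        f"CREATE ROLE IF NOT EXISTS {role};",
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
    ]
    for schema in schemas:
        fq = f"{database}.{_check(schema)}"
        stmts.append(f"GRANT USAGE ON SCHEMA {fq} TO ROLE {role};")
        stmts.append(f"GRANT SELECT ON ALL TABLES IN SCHEMA {fq} TO ROLE {role};")
    return stmts


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for stmt in least_privilege_grants("CLAUDE_CI_ROLE", "ANALYTICS", ["REPORTING"]):
        print(stmt)
```

Scoping the role this way narrows what the agent can query, but it does nothing about what happens to the rows once they are returned.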

The missing piece is a Layer 7 proxy that sits between the AI and the database, capable of inspecting and transforming the protocol payload. Only a component that controls the data path can reliably apply inline data masking, record the session, and enforce just‑in‑time approvals for risky queries.

hoop.dev as the data‑path enforcement point

hoop.dev provides exactly that. It acts as a Layer 7 gateway that proxies Snowflake connections. When Claude initiates a query, the request reaches hoop.dev instead of Snowflake directly. hoop.dev holds the Snowflake credential, so the AI never sees a secret. Inside the gateway you configure masking rules that specify which columns, such as ssn, credit_card_number, or medical_record_id, must be redacted in any response.

On every response, hoop.dev inspects the result set, replaces the protected fields with a placeholder, and forwards the sanitized payload back to Claude. Because hoop.dev sits on the data path, the masking happens in real time, before the model can ever render the data. At the same time, hoop.dev records the entire session, including the original query, the masked result, and the identity of the caller. Those logs become audit evidence for any downstream compliance review.
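To make the idea concrete, here is a minimal Python sketch of column‑based masking plus a session record. It is not hoop.dev's implementation or configuration format; the function names, the `SessionRecord` type, and the placeholder string are assumptions chosen for illustration. Only the column names come from the examples above.

```python
from dataclasses import dataclass
from typing import Any, List, Sequence, Set, Tuple

# Column names taken from the examples in this post; the placeholder is an assumption.
MASKED_COLUMNS: Set[str] = {"ssn", "credit_card_number", "medical_record_id"}
PLACEHOLDER = "***MASKED***"


def mask_rows(
    columns: Sequence[str],
    rows: Sequence[Tuple[Any, ...]],
    masked: Set[str] = MASKED_COLUMNS,
    placeholder: str = PLACEHOLDER,
) -> List[Tuple[Any, ...]]:
    """Replace every value in a protected column with the placeholder."""
    # Match column names case-insensitively, since Snowflake upper-cases
    # unquoted identifiers.
    protected = {i for i, name in enumerate(columns) if name.lower() in masked}
    return [
        tuple(placeholder if i in protected else value for i, value in enumerate(row))
        for row in rows
    ]


@dataclass(frozen=True)
class SessionRecord:
    """Hypothetical audit entry: who asked, what they asked, what they were shown."""
    caller: str
    query: str
    masked_rows: List[Tuple[Any, ...]]


def handle_response(caller: str, query: str, columns, rows) -> SessionRecord:
    """Mask first, then record, so the record never holds raw values."""
    return SessionRecord(caller=caller, query=query, masked_rows=mask_rows(columns, rows))


if __name__ == "__main__":
    record = handle_response(
        "claude-ci",
        "SELECT id, ssn, plan FROM customers",
        ["ID", "SSN", "PLAN"],
        [(1, "123-45-6789", "gold")],
    )
    print(record.masked_rows)
```

Masking before recording is a deliberate choice in this sketch: the audit trail stays useful for review without becoming a second copy of the sensitive data.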

Beyond masking, hoop.dev can require a human approval step for queries that match a risk pattern, such as a SELECT that scans an entire table of PII. The request pauses, routes to an approver, and executes only after the approval is granted. This just‑in‑time workflow prevents accidental bulk exfiltration.
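The approval flow can be sketched in Python as a simple gate. The risk pattern here is a deliberately crude heuristic (a SELECT against a table on a hypothetical PII list with no WHERE or LIMIT clause), and `needs_approval`, `run_with_gate`, and the table names are illustrative assumptions rather than hoop.dev's actual rule syntax.

```python
import re
from typing import Any, Callable, Set

# Hypothetical list of tables that hold PII.
PII_TABLES: Set[str] = {"customers", "patients"}

_FROM = re.compile(r"\bfrom\s+([\w.]+)", re.IGNORECASE)
_BOUNDED = re.compile(r"\b(where|limit)\b", re.IGNORECASE)


def needs_approval(sql: str, pii_tables: Set[str] = PII_TABLES) -> bool:
    """Flag SELECTs that read a PII table without any WHERE or LIMIT clause."""
    if not sql.lstrip().lower().startswith("select"):
        return False
    # Keep only the last part of schema-qualified names, e.g. analytics.customers.
    tables = {m.group(1).split(".")[-1].lower() for m in _FROM.finditer(sql)}
    return bool(tables & pii_tables) and not _BOUNDED.search(sql)


def run_with_gate(
    sql: str,
    execute: Callable[[str], Any],
    approve: Callable[[str], bool],
) -> Any:
    """Pause risky queries until `approve` says yes; run everything else directly."""
    if needs_approval(sql) and not approve(sql):
        raise PermissionError("query rejected by approver")
    return execute(sql)


if __name__ == "__main__":
    print(needs_approval("SELECT * FROM analytics.customers"))
    print(needs_approval("SELECT * FROM customers WHERE id = 1"))
```

A regex heuristic like this is easy to fool and would not be enough in production. The point is only the control flow: classify the query, block on a human decision when it matches, and execute otherwise.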

Getting started with hoop.dev and Claude

1. Deploy the hoop.dev gateway using the provided Docker Compose quick‑start. The compose file launches the gateway and a hoop.dev agent that runs in a network with access to Snowflake.

2. Register Snowflake as a connection in the gateway UI or via the API. Provide the Snowflake account identifier and the service‑account credentials that the gateway will use on Claude’s behalf. hoop.dev stores these credentials securely; Claude never sees them.

3. Define data‑masking policies in the gateway configuration. Choose the columns to mask and the placeholder value. hoop.dev evaluates the policies per‑query, so you can have fine‑grained rules for different schemas.

4. Point Claude’s Snowflake client configuration to the gateway endpoint instead of the raw Snowflake host. From Claude’s perspective the connection works exactly the same; the only difference is that every response is filtered through hoop.dev.

5. Enable session recording and, if desired, just‑in‑time approval for high‑risk queries. hoop.dev stores recorded sessions in a separate storage backend that the gateway writes to, allowing later review.

All of these steps are described in the official getting‑started guide and the feature‑by‑feature learning hub. Follow those resources for the exact commands and YAML snippets needed to spin up the gateway and configure masking.

FAQ

  • Does hoop.dev change the Snowflake query language? No. The gateway forwards the original SQL unchanged; it only inspects the response payload to apply masking.
  • Can I mask data conditionally based on the caller? Yes. Masking rules can be scoped to the identity that initiated the request, allowing you to apply stricter redaction for higher‑privilege AI agents.
  • What happens to audit logs if the gateway is compromised? hoop.dev writes the logs to a separate storage backend that the gateway can only write to. The recorded sessions remain available for later review.
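As an illustration of the identity‑scoped masking mentioned in the FAQ, the Python sketch below selects a set of masked columns based on the caller. The identity names, the rule table, and the fallback behaviour are all assumptions made for the example; consult the hoop.dev documentation for how rules are actually expressed.

```python
from typing import Any, Dict, FrozenSet

# Baseline applied to any caller without an explicit rule (assumed default).
DEFAULT_MASK: FrozenSet[str] = frozenset(
    {"ssn", "credit_card_number", "medical_record_id"}
)

# Hypothetical identities: the CI agent gets stricter redaction than a human analyst.
RULES_BY_IDENTITY: Dict[str, FrozenSet[str]] = {
    "claude-ci": DEFAULT_MASK | {"email"},
    "analyst-oncall": frozenset({"credit_card_number"}),
}


def columns_to_mask(identity: str) -> FrozenSet[str]:
    """Look up the caller's rule set, falling back to the baseline."""
    return RULES_BY_IDENTITY.get(identity, DEFAULT_MASK)


def mask_record(
    identity: str, record: Dict[str, Any], placeholder: str = "***MASKED***"
) -> Dict[str, Any]:
    """Return a copy of `record` with the caller's protected columns redacted."""
    masked = columns_to_mask(identity)
    return {
        key: (placeholder if key.lower() in masked else value)
        for key, value in record.items()
    }


if __name__ == "__main__":
    row = {"email": "a@example.com", "ssn": "123-45-6789", "plan": "gold"}
    print(mask_record("claude-ci", row))
    print(mask_record("analyst-oncall", row))
```

Falling back to the baseline for unknown identities means a caller that is missing from the table is still masked rather than accidentally given raw data.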

For a complete walkthrough, start with the getting‑started documentation and explore the learn portal for deeper details on masking policies and approval workflows.

Ready to see the code and contribute? Visit the hoop.dev GitHub repository to clone the project, review the source, and submit improvements.
