Putting access controls around Cursor: data masking for AI coding agents (on BigQuery)


The goal is simple: when Cursor’s code‑generation loops run, any query result that contains personal or proprietary fields is redacted before the model ever sees the raw values. The AI can still suggest useful code, but confidential data never enters its context.

In many teams today, the simplest way to give Cursor access to analytics data is to hand the agent a service‑account key that has read‑only permissions on a BigQuery dataset. The agent talks directly to BigQuery, pulls rows, and feeds them back into the LLM. That approach works, but it has no data masking and leaves two critical gaps. First, the response reaches the AI unmasked, so any personally identifiable information (PII) or trade secrets in the rows enter the model’s context and are subject to whatever logging or retention the model provider applies. Second, because every query runs under the shared service account with no audit layer in front of it, there is no reliable record of who triggered each query or whether the results were ever reviewed.

Data masking is the practice of stripping or transforming sensitive fields in a response before they reach an untrusted consumer. For AI coding agents, masking prevents the model from ingesting data it should not learn from, reduces the risk of accidental data leakage, and helps organizations meet regulatory expectations around data minimisation. Without a reliable masking point, teams rely on ad‑hoc client‑side filters that can be bypassed, or they accept the risk of exposing raw data.

The immediate fix is to place a guard at the point where the query result leaves BigQuery. The guard must be able to read the response, apply column‑level or pattern‑based redaction, and then forward the sanitized payload to the AI. Importantly, the query is still executed by BigQuery itself: the identity that authorises it (often an OIDC‑derived service account) does not change. The guard does not replace the authentication step; it sits in the data path only to enforce masking.
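As a rough illustration of what such a guard does to a response, here is a minimal Python sketch that combines a column‑level rule with a pattern‑based rule. The column names, the email pattern, and the `[REDACTED]` placeholder are assumptions made for this example; they are not hoop.dev configuration.

```python
import re

# Hypothetical rule set for illustration; not hoop.dev's configuration format.
SENSITIVE_COLUMNS = {"email", "ssn"}
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
REDACTED = "[REDACTED]"


def mask_value(column, value):
    """Return a masked copy of a single cell."""
    if column.lower() in SENSITIVE_COLUMNS:
        # Column-level rule: hide the whole value, whatever it contains.
        return REDACTED
    if isinstance(value, str):
        # Pattern-based rule: scrub email addresses embedded in free text.
        return EMAIL_PATTERN.sub(REDACTED, value)
    return value


def mask_rows(rows):
    """Apply masking to every cell of a result set given as a list of dicts."""
    return [{col: mask_value(col, val) for col, val in row.items()} for row in rows]


rows = [{"id": 1, "email": "ada@example.com", "note": "contact bob@example.org today"}]
masked = mask_rows(rows)
print(masked)
```

The column rule catches fields that are sensitive by name, while the pattern rule catches values that turn up in columns nobody thought to list.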

Why hoop.dev is the only place data masking can be guaranteed

hoop.dev is a Layer 7 gateway that proxies connections between identities and infrastructure. By deploying hoop.dev between Cursor and BigQuery, every query and its result flow through a single, controllable point. hoop.dev holds the BigQuery credentials, so the AI never sees a secret. While the request passes through the gateway, hoop.dev inspects the wire‑protocol payload, matches configured masking rules, and rewrites the response before it reaches Cursor. Masking is enforced only because hoop.dev sits in the data path; traffic that reaches BigQuery by another route is not masked.

How masking policies are enforced

Administrators define masking rules in hoop.dev’s policy language – for example, “redact any column named email” or “replace credit‑card numbers with a hash”. When a BigQuery response arrives, hoop.dev parses the result set, applies the rules, and returns a sanitized version. The AI receives only the allowed columns and masked values, ensuring that sensitive data never enters the model’s context.
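The two example rules above can be modelled as plain data plus a small evaluator. The sketch below is an assumption‑laden illustration in Python: the `RULES` structure, the `kind` names, and the `cc_` token prefix are hypothetical and do not reflect hoop.dev’s actual policy syntax.

```python
import hashlib
import re

# Hypothetical rule format for illustration only; hoop.dev's real syntax may differ.
RULES = [
    {"kind": "redact_column", "column": "email"},
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    {"kind": "hash_pattern", "pattern": re.compile(r"\b\d(?:[ -]?\d){12,15}\b")},
]


def hash_match(match):
    """Replace a matched card number with a short SHA-256 based token."""
    digits = re.sub(r"\D", "", match.group(0))
    return "cc_" + hashlib.sha256(digits.encode()).hexdigest()[:12]


def apply_rules(row, rules=RULES):
    """Return a copy of `row` with every rule applied to every column."""
    out = {}
    for column, value in row.items():
        for rule in rules:
            if rule["kind"] == "redact_column" and column == rule["column"]:
                value = "[REDACTED]"
            elif rule["kind"] == "hash_pattern" and isinstance(value, str):
                value = rule["pattern"].sub(hash_match, value)
        out[column] = value
    return out


row = {"email": "a@b.com", "memo": "paid with 4111 1111 1111 1111 today"}
result = apply_rules(row)
print(result)
```

Hashing keeps equal card numbers comparable without revealing them, but a plain digest of a low‑entropy value can be brute‑forced, so a keyed hash (HMAC) is usually the safer choice in practice.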

Additional guardrails that come for free

  • hoop.dev records every session, providing an audit trail of who queried what and when.
  • Just‑in‑time access can be required, so a human approves a query before it executes.
  • Command‑level blocking can prevent dangerous DDL statements from running.

These outcomes are all possible because hoop.dev is the gateway that sees the traffic; they do not depend on the identity provider or on additional configuration in BigQuery itself.
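To make the audit‑trail and command‑blocking ideas concrete, the following Python sketch checks the leading keyword of a statement and appends an audit record whether or not the statement is allowed. The blocked keyword list and the record fields are assumptions for illustration, and a first‑keyword check is deliberately naive (it does not handle multi‑statement scripts).

```python
import json
import re
from datetime import datetime, timezone

# Statement types treated as dangerous in this sketch (an assumption, not hoop.dev's list).
BLOCKED_KEYWORDS = {"CREATE", "ALTER", "DROP", "TRUNCATE"}


def first_keyword(sql):
    """Return the first SQL keyword, ignoring comments and leading whitespace."""
    stripped = re.sub(r"(--[^\n]*|/\*.*?\*/)", " ", sql, flags=re.DOTALL).strip()
    return stripped.split(None, 1)[0].upper() if stripped else ""


def review(user, sql, audit_log):
    """Decide whether a statement may run and record the decision either way."""
    allowed = first_keyword(sql) not in BLOCKED_KEYWORDS
    audit_log.append({
        "user": user,
        "query": sql,
        "allowed": allowed,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return allowed


audit_log = []
ok_select = review("dana@example.com", "SELECT id FROM orders", audit_log)
ok_drop = review("dana@example.com", "/* cleanup */ DROP TABLE orders", audit_log)
print(json.dumps(audit_log, indent=2))
```

Recording the decision before returning means blocked attempts show up in the trail as well as successful queries.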


High‑level steps to enable data masking for Cursor

1. Deploy the hoop.dev gateway where it has network access to the BigQuery API (BigQuery is a managed service, so there is no instance to co‑locate with). The quick‑start guide walks you through a Docker Compose deployment that includes OIDC authentication, masking, and session recording out of the box.

2. Register BigQuery as a connection in hoop.dev. During registration you supply the host, project, and the credential that hoop.dev will use to talk to BigQuery. The credential never leaves the gateway.

3. Define masking policies that target the columns or patterns you consider sensitive. Policies are stored in hoop.dev and can be scoped to groups, so only certain teams see unmasked data while others always receive redacted results.

4. Reconfigure Cursor (or any client library it uses) to point at the hoop.dev endpoint instead of the raw BigQuery address. From the client’s perspective the connection works exactly the same; the only difference is that the traffic now passes through the gateway.

5. Verify the end‑to‑end flow by running a test query that returns known PII. The response you see in Cursor should show the masked version, while the audit log in hoop.dev records the original query and the applied policy.
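Steps 3 and 5 can be rehearsed offline before touching a real gateway. The Python sketch below models group‑scoped masking and a check that known test PII does not survive in the masked view. The group names, column names, and helper functions are hypothetical stand‑ins, not hoop.dev APIs.

```python
# Hypothetical group names and policy shape, for illustration only.
UNMASKED_GROUPS = {"data-governance"}
MASKED_COLUMNS = {"email", "phone"}


def result_for(groups, rows):
    """Return the rows as a member of `groups` would see them."""
    if UNMASKED_GROUPS & set(groups):
        return [dict(row) for row in rows]  # privileged groups see raw values
    return [
        {col: ("[REDACTED]" if col in MASKED_COLUMNS else val) for col, val in row.items()}
        for row in rows
    ]


def leaked_values(known_pii, rows):
    """Return the known PII strings that still appear anywhere in the rows."""
    seen = {str(val) for row in rows for val in row.values()}
    return sorted(p for p in known_pii if any(p in s for s in seen))


test_rows = [{"id": 7, "email": "test.user@example.com", "phone": "+1-555-0100"}]
known_pii = ["test.user@example.com", "+1-555-0100"]

agent_view = result_for(["engineering"], test_rows)
auditor_view = result_for(["data-governance"], test_rows)

print("agent leaks:", leaked_values(known_pii, agent_view))
print("auditor sees:", leaked_values(known_pii, auditor_view))
```

Seeding the test table with synthetic PII that you can search for makes the verification in step 5 a mechanical check and not a visual inspection.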

Common mistakes to avoid

  • Masking on the client side. Applying filters inside the agent process, after the raw rows have already arrived, means the values have already been exposed to the AI.
  • Hard‑coding service‑account keys in the AI process. This defeats the purpose of a gateway that keeps credentials secret.
  • Using overly broad OIDC groups. If every engineer belongs to the same group, you lose the ability to apply different masking rules per role.
  • Skipping policy testing. A mis‑typed column name can leave data unmasked; always validate policies against real query results.

By keeping the gateway in the data path and letting hoop.dev enforce the rules, you eliminate these risks.
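The mis‑typed column name problem is easy to catch with a small check that compares the columns named in your rules against the columns a real query returns. This Python sketch is one hypothetical way to do it; the function names and sample columns are invented for the example.

```python
import difflib


def unmatched_rule_columns(rule_columns, result_columns):
    """Return rule columns that match no result column (case-insensitive)."""
    present = {name.lower() for name in result_columns}
    return sorted(col for col in rule_columns if col.lower() not in present)


def suggest_fixes(missing, result_columns):
    """Map each unmatched rule column to the closest real column name, if any."""
    candidates = [name.lower() for name in result_columns]
    return {col: difflib.get_close_matches(col.lower(), candidates, n=1) for col in missing}


rule_columns = ["email", "phone_numbr"]  # second name is deliberately mis-typed
result_columns = ["id", "Email", "phone_number"]

missing = unmatched_rule_columns(rule_columns, result_columns)
suggestions = suggest_fixes(missing, result_columns)
print(missing, suggestions)
```

Running a check like this against the schema of each dataset the agent can query turns a silent masking gap into a visible failure.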

Next steps

Start with the getting‑started guide to spin up a local gateway, then explore the feature documentation for detailed masking policy examples. When you’re ready to integrate with your production environment, clone the open‑source repository and follow the deployment instructions.

Explore the source code on GitHub to see how the gateway implements protocol‑level inspection and to contribute your own enhancements.

Open source

Save the open-source gateway for agent data access

Hoop is MIT-licensed infrastructure for controlling how AI agents reach production data. Star hoophq/hoop so you can inspect it, deploy it, or share it when your team starts governing agent access.
