
Session recording for autonomous agents on Snowflake

Session recording provides a complete, immutable record of every query an autonomous data pipeline runs against Snowflake. The organization expects that record to become the single source of truth for compliance auditors, a debugging aid when a downstream report looks wrong, and a deterrent against accidental data exfiltration. Ideally, the pipeline's output can be traced back to the exact statement, the identity that triggered it, and the time it ran, without adding custom logging inside the code.

In practice, many teams hand the pipeline a static Snowflake user or service account. The credentials are baked into CI jobs, stored in secret managers, or even written to configuration files that developers edit by hand. The pipeline connects directly to Snowflake, bypasses any central policy enforcement, and leaves no audit trail beyond Snowflake’s own query history, which often lacks context about the originating process. When a rogue query slips through, the damage is discovered only after the fact, and the forensic work required to reconstruct the event can be prohibitive.

What teams really need is a way to capture a full session record for every autonomous interaction while still letting the pipeline use its ordinary Snowflake client. The requirement is simple: each request must travel through a point where it can be observed and logged. The existing setup provides no such observation point. Without an intervening gateway, the pipeline's traffic remains invisible to the organization's governance layer, and the promised session recording never materialises.

hoop.dev supplies that missing piece. It sits in the data path as an identity-aware proxy that terminates the client connection, inspects the Snowflake wire protocol, and forwards the request to the target. Because all traffic passes through the gateway, hoop.dev can record every statement, the exact response payload, and the identity that initiated the call. The recorded session is stored outside the pipeline's runtime and can be replayed at any time.
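To make the idea concrete, a recorded session can be pictured as a list of statement records, each tying the SQL to an identity and a timestamp. The sketch below is illustrative only: `StatementRecord`, `SessionRecord`, and every field name are hypothetical and do not reflect hoop.dev's actual storage format.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class StatementRecord:
    """One statement observed by the gateway (hypothetical schema)."""
    identity: str     # who initiated the call, e.g. an OIDC subject
    sql: str          # the exact statement text
    started_at: str   # ISO-8601 timestamp in UTC
    row_count: int    # number of rows returned to the client


@dataclass
class SessionRecord:
    """All statements seen on one client connection."""
    session_id: str
    statements: list = field(default_factory=list)

    def add(self, identity, sql, row_count, now=None):
        # `now` is injectable so the example is deterministic in tests.
        now = now or datetime.now(timezone.utc)
        rec = StatementRecord(identity, sql, now.isoformat(), row_count)
        self.statements.append(rec)
        return rec

    def to_json(self):
        # asdict() recurses into the nested StatementRecord objects.
        return json.dumps(asdict(self), sort_keys=True)
```

Keeping the record as plain data means it can be serialized and stored outside the pipeline's runtime, which is the property the paragraph above relies on.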

Why session recording matters for autonomous agents

Autonomous agents operate without human oversight, often scaling to thousands of queries per hour. Without a reliable audit log, a single mis‑configured transformation can propagate bad data across the warehouse, trigger cost overruns, or violate data‑privacy regulations. Session recording provides three concrete benefits:

  • Compliance evidence: Regulators ask who accessed what data and when. A recorded Snowflake session shows the exact SQL, the user context, and timestamps, which helps satisfy audit requirements for standards such as SOC 2.
  • Root‑cause analysis: When a downstream dashboard shows unexpected numbers, engineers can replay the exact query that produced the data, see the parameters used, and identify logic errors.
  • Security forensics: If a compromised service account attempts to dump large tables, the session log captures the activity as it happens, so responders can scope the incident and contain it quickly.

Session recording architecture with hoop.dev

In the hoop.dev model, the gateway is deployed as a container or Kubernetes pod inside the same network segment as Snowflake's private endpoint. A hoop.dev agent (a separate component, not the autonomous job itself) runs alongside that endpoint or in a subnet that can reach it, and holds the Snowflake service credentials. When an autonomous job initiates a connection, it authenticates to hoop.dev via OIDC or SAML. hoop.dev validates the token, extracts group membership, and decides whether the request is allowed.
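The authorization step can be sketched as a lookup from group membership to allowed connections. This is a minimal illustration, not hoop.dev's implementation: the `POLICY` table, the `authorize` function, and the `groups` claim name are assumptions, and the token is assumed to have been cryptographically verified before this function is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    allowed: bool
    identity: str
    reason: str


# Hypothetical policy: connection name -> groups allowed to use it.
POLICY = {
    "snowflake-prod": {"data-pipelines", "platform-admins"},
}


def authorize(claims: dict, connection: str, policy: dict = POLICY) -> Decision:
    """Decide access from the claims of a token that was ALREADY verified."""
    identity = claims.get("sub", "<unknown>")
    groups = set(claims.get("groups", []))
    allowed_groups = policy.get(connection)
    if allowed_groups is None:
        return Decision(False, identity, f"unknown connection {connection!r}")
    matched = groups & allowed_groups
    if not matched:
        return Decision(False, identity, "no group grants access")
    return Decision(True, identity, f"granted via {sorted(matched)[0]}")
```

Returning the identity and a reason with every decision, including denials, means the same object can be written to the audit log.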

Once the request is authorized, hoop.dev opens a session to Snowflake using its own credential. Because the gateway is the sole conduit, it can mirror every request and response. The mirror is written to a secure log store that is independent of the Snowflake account, so the session record remains intact even if that account is compromised.
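One common way to make a log store tamper-evident is to chain entries by hash, so that altering an old entry invalidates everything after it. The source does not say how hoop.dev protects its records; the `ChainedLog` class below is a hypothetical sketch of the general technique, not a description of the product.

```python
import hashlib
import json


class ChainedLog:
    """Append-only log where each entry commits to the previous one."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(prev: str, payload: dict) -> str:
        body = json.dumps(payload, sort_keys=True)
        return hashlib.sha256((prev + body).encode("utf-8")).hexdigest()

    def append(self, payload: dict) -> dict:
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        entry = {"prev": prev, "payload": payload,
                 "hash": self._digest(prev, payload)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; False if any entry was altered or reordered."""
        prev = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev:
                return False
            if entry["hash"] != self._digest(prev, entry["payload"]):
                return False
            prev = entry["hash"]
        return True
```

A chain like this detects modification but does not prevent deletion of the whole log, which is why the independent storage described above still matters.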

How hoop.dev captures Snowflake traffic

Snowflake communicates over a proprietary wire protocol that carries SQL statements and result sets. hoop.dev parses this protocol at Layer 7, extracts the raw SQL string, and records metadata such as the user ID, client IP, and execution time. The response payload is streamed through the gateway, which lets hoop.dev capture the exact rows returned while still forwarding them to the client in real time. Recording adds little latency because it happens inline, on the same stream that carries the data to the client.
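The capture-while-forwarding behaviour can be illustrated with a generator that hands each row to the client as soon as it arrives and keeps a copy for the log. This is a simplified sketch under stated assumptions: `record_and_forward`, its parameters, and the record layout are hypothetical, rows are modelled as Python tuples rather than Snowflake's real wire format, and a production gateway would stream the copy to storage instead of holding it in memory.

```python
import time
from typing import Callable, Iterable, Iterator


def record_and_forward(
    sql: str,
    user_id: str,
    client_ip: str,
    rows: Iterable[tuple],
    sink: Callable[[dict], None],
) -> Iterator[tuple]:
    """Yield rows to the client unchanged while keeping a copy for the log."""
    started = time.monotonic()
    captured = []
    try:
        for row in rows:
            captured.append(row)  # copy kept for the recording
            yield row             # forwarded immediately to the client
    finally:
        # Runs when the stream ends or when the generator is closed early.
        sink({
            "sql": sql,
            "user_id": user_id,
            "client_ip": client_ip,
            "rows": captured,
            "elapsed_s": time.monotonic() - started,
        })
```

Because the record is written in the `finally` clause, a session that is cut off part-way still produces a log entry containing the rows delivered so far.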

Beyond raw logs: replay and forensic analysis

Because the gateway stores the full request‑response pair, engineers can later replay a session against a test Snowflake environment. Together with the recorded responses, which show exactly what the autonomous agent saw at the time, the replay makes it possible to verify bug fixes or to demonstrate compliance to auditors. The logs also support search by keyword, user, or time range, turning a massive stream of queries into a searchable audit trail.
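The three search dimensions mentioned above (keyword, user, time range) amount to simple filters over the recorded statements. The function below is a minimal in-memory sketch; the `search` name and the `sql`, `user`, and `time` keys are assumptions made for illustration and are not hoop.dev's query interface.

```python
from datetime import datetime


def search(records, keyword=None, user=None, start=None, end=None):
    """Filter records by SQL keyword, user, and a half-open time range.

    Each record is a dict with hypothetical keys: 'sql', 'user', and
    'time' (an ISO-8601 string that includes a UTC offset).
    """
    results = []
    for rec in records:
        if keyword is not None and keyword.lower() not in rec["sql"].lower():
            continue
        if user is not None and rec["user"] != user:
            continue
        ts = datetime.fromisoformat(rec["time"])
        if start is not None and ts < start:
            continue
        if end is not None and ts >= end:  # `end` is exclusive
            continue
        results.append(rec)
    return results
```

For example, passing `keyword="customers"` together with a one-hour window narrows an investigation to the statements that touched that table during the incident.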

Implementing session recording for Snowflake

To add session recording, start by deploying the hoop.dev gateway using the official Docker Compose quick‑start or the Helm chart for Kubernetes. The deployment includes OIDC configuration, which ties the gateway to your identity provider. Next, register Snowflake as a connection in the hoop.dev UI, supplying the Snowflake account identifier and letting hoop.dev manage the service credential. Finally, update your autonomous jobs to point their Snowflake client to the hoop.dev endpoint instead of the raw Snowflake host.
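The last step, repointing the client, might look like the following in a Python job. This is a hedged sketch: `host`, `port`, and `protocol` are connection parameters accepted by the Snowflake Python connector, but the `point_at_gateway` helper, the example hostname, and the port are hypothetical, and how a job presents its identity to hoop.dev depends on the deployment and is deliberately left out. Consult the getting-started guide for the supported procedure.

```python
def point_at_gateway(connect_kwargs: dict, gateway_host: str,
                     gateway_port: int = 443, protocol: str = "https") -> dict:
    """Return a copy of the connection settings aimed at the gateway.

    Only the network endpoint changes; account, warehouse, database and
    the rest of the job's settings are passed through untouched.
    """
    updated = dict(connect_kwargs)  # never mutate the caller's dict
    updated["host"] = gateway_host
    updated["port"] = gateway_port
    updated["protocol"] = protocol
    return updated


# Hypothetical usage (requires the snowflake-connector-python package):
#
#   import snowflake.connector
#   kwargs = point_at_gateway(existing_kwargs, "hoop-gateway.internal.example")
#   conn = snowflake.connector.connect(**kwargs)
```

Keeping the change to a single helper makes it easy to confirm in review that no job still holds the raw Snowflake host.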

The gateway will automatically begin recording every session. Detailed guidance on each step lives in the getting‑started guide and the broader learn section. For teams that prefer to inspect the source, the full repository is available on GitHub.

FAQ

Do I need to change my Snowflake queries?

No. hoop.dev is protocol‑aware and forwards traffic unchanged, so existing SQL statements continue to work without modification.

Will session recording add significant latency?

The recording happens inline with the data flow and is designed to add only minimal overhead, typically measured in milliseconds per query.

Can I disable recording for a specific job?

Recording is enforced at the gateway level. If a job must run without a record, it would need to bypass hoop.dev entirely, which defeats the purpose of centralized governance.

Ready to see the code in action? Explore the hoop.dev repository on GitHub and start protecting your autonomous Snowflake workloads today.

Open source

Save the open-source gateway for agent data access

Hoop is MIT-licensed infrastructure for controlling how AI agents reach production data. Star hoophq/hoop so you can inspect it, deploy it, or share it when your team starts governing agent access.
