Putting access controls around ChatGPT: session recording for AI coding agents

Here is the end-state to aim for: when a ChatGPT agent runs anything against your infrastructure, you can later replay exactly what it did, command by command, attributed to the agent, in a record the agent could never have altered. That is session recording done properly, and most agent setups fall well short of it.

One scope note. hoop.dev does not record what ChatGPT generates. It records the infrastructure commands the agent executes (the SQL, the kubectl call, the shell session), not the model prompt or output.

The end-state, defined

Every infrastructure command the agent runs is captured, not just connection events.
Each session attributes to a named agent identity.
The record lives outside the agent, where it cannot be edited.
Sessions can be exported to your monitoring and evidence stores.

Why the record has to be external

A log inside the agent's runtime is a log the agent can drop or rewrite. For a recording to serve as evidence, it has to accumulate at a point the audited process does not control. That is the requirement, and it dictates where the recording happens: at the gateway the agent connects through, not in the agent.

hoop.dev is an open-source Layer 7 gateway. The ChatGPT agent reaches infrastructure through it, and each session is recorded on the wire under the agent's identity. The recording sits at the protocol layer, so it does not depend on the agent cooperating, logging correctly, or even staying healthy. A crashed agent still leaves a complete record up to the moment it stopped, because the record was never the agent's to keep.

Commands, not just session boundaries

A recording that only marks when a session opened and closed tells you the agent was active but not what it did, which is the part that matters during an investigation. The end-state is command-level: each statement the agent ran, in order, with its result status and any approval that gated it. That granularity lets you answer a precise question precisely, rather than inferring activity from connection timestamps. When you wire the gateway's session events into your monitoring stack, you also get alerting on the patterns you care about (an unusual volume of reads, a command type that should never appear, an access outside the expected window), so the record is operational as well as forensic.
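As a rough illustration of what command-level granularity makes possible, here is a minimal Python sketch that scans a session's command records for the three patterns mentioned above. The record shape, the field names, and every threshold are assumptions made for this example, not hoop.dev's actual event schema or defaults.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass
class CommandRecord:
    """Hypothetical shape of one recorded command (not the gateway's real schema)."""
    identity: str
    connection: str
    timestamp: datetime
    statement: str
    status: str                       # e.g. "ok" or "error"
    approved_by: Optional[str] = None


# All three policies below are assumed values for illustration.
FORBIDDEN_VERBS = {"DROP", "TRUNCATE"}   # command types that should never appear
MAX_READS = 100                          # reads per session before it counts as unusual
WINDOW = (time(8, 0), time(20, 0))       # expected access window


def find_alerts(records: list[CommandRecord]) -> list[str]:
    """Return alert messages for one session's commands, in the order they occur."""
    alerts = []
    reads = 0
    for r in records:
        parts = r.statement.split()
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT":
            reads += 1
        if verb in FORBIDDEN_VERBS:
            alerts.append(f"forbidden command {verb} by {r.identity} on {r.connection}")
        if not (WINDOW[0] <= r.timestamp.time() <= WINDOW[1]):
            alerts.append(f"access outside window at {r.timestamp.isoformat()} by {r.identity}")
    if reads > MAX_READS:
        alerts.append(f"unusual read volume: {reads} reads")
    return alerts
```

None of these checks can be written against connection timestamps alone; each one needs the statement text, which is the practical argument for recording at command level.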

Steps to reach the end-state

Run the gateway and a hoop.dev agent near the resources you want recorded.
Register the connections the agent uses, each with a least-privilege credential.
Give the agent a named identity through your IdP.
Enable command-level recording on those connections.
Wire session events to your monitoring stack via webhooks for retention, then run a command and confirm the full record.

# a recorded session, replayable later:
# identity=chatgpt-agent connection=prod-db
#   12:40 SELECT ...
#   12:42 UPDATE ... (approved by bob)
#   12:45 disconnect
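
The last step, wiring session events to a store of your own, could look roughly like the receiver below. This is a sketch under stated assumptions: the payload fields (`identity`, `connection`), the sink path, and the port are hypothetical, and the webhook format hoop.dev actually emits should be taken from its documentation.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

SINK_PATH = "sessions.jsonl"  # hypothetical path; point it at a store the agent cannot reach


def store_event(body: bytes, path: str = SINK_PATH) -> dict:
    """Parse one webhook body and append it to a JSON Lines file."""
    event = json.loads(body)
    if not isinstance(event, dict):
        raise ValueError("event must be a JSON object")
    # Field names are assumptions for illustration, not the gateway's schema.
    for field in ("identity", "connection"):
        if field not in event:
            raise ValueError(f"missing field: {field}")
    with open(path, "a", encoding="utf-8") as sink:
        sink.write(json.dumps(event, sort_keys=True) + "\n")
    return event


class SessionWebhook(BaseHTTPRequestHandler):
    """Accepts POSTed session events and stores each one."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            store_event(self.rfile.read(length))
            self.send_response(204)
        except ValueError:
            # Covers malformed JSON and events missing the assumed fields.
            self.send_response(400)
        self.end_headers()


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the receiver until interrupted."""
    HTTPServer((host, port), SessionWebhook).serve_forever()
```

Calling `serve()` starts the receiver. To confirm the full record, run one command through the gateway and check that a new line for that session appears in the sink file.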

Pitfalls

Connection-level only. Knowing the agent connected is not knowing what it did. Capture commands.
No external sink. Forward sessions to a store the agent cannot reach so the record survives.
Shared identity. Recording is only as useful as the attribution behind it. Give each agent its own named identity.

Retention is the last piece teams forget. A recording that exists only on the gateway for a few days is fine for live debugging but useless for a compliance question that arrives months later. Decide up front how long sessions are kept and where, forward them to a durable store, and the end-state holds over the time horizons that audits and investigations actually run on.
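
To make the retention decision concrete, a small Python sketch can answer "where would this session still exist on a given date?" The store names and retention periods below are assumptions chosen for illustration; set them from your own audit requirements.

```python
from datetime import datetime, timedelta, timezone

# Assumed retention periods in days; neither value is a hoop.dev default.
RETENTION_DAYS = {"gateway": 7, "evidence-store": 400}


def is_retained(recorded_at: datetime, store: str, now: datetime) -> bool:
    """True if a session recorded at `recorded_at` is still within the store's retention."""
    return now - recorded_at <= timedelta(days=RETENTION_DAYS[store])


def stores_holding(recorded_at: datetime, now: datetime) -> list[str]:
    """Names of the stores that would still hold the session at time `now`."""
    return sorted(s for s in RETENTION_DAYS if is_retained(recorded_at, s, now))


# Example: a compliance question arrives 90 days after the session was recorded.
asked_at = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(stores_holding(asked_at - timedelta(days=90), asked_at))
```

With these assumed values, a question arriving 90 days later can only be answered from the durable store, which is why forwarding matters.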

FAQ

Does session recording capture ChatGPT prompts?

No. hoop.dev records the infrastructure commands the agent runs, not the model prompt, output, or reasoning.

Can the recordings feed an existing SIEM or monitoring tool?

Yes. Session events can be emitted via webhooks so each session becomes a record in your monitoring or evidence stack.

What is the difference between connection-level and command-level recording?

Connection-level recording tells you the agent opened and closed a session. Command-level recording tells you every statement it ran in between, in order, with results. Only the second answers the questions an incident or audit actually raises, which is why the end-state is command-level.

Reach the end-state with the open-source project on GitHub. The getting started guide covers the first connection, and hoop.dev learn explains what each record contains.