AI judging AI: real-time risk classification on every command

Hoop’s AI Session Analyzer runs every command and query through your chosen LLM before execution. Three risk levels. Three policy actions. The dangerous ones never reach the database.

Hoop is an open-source gateway that sits between engineers, AI agents, and infrastructure. The AI Session Analyzer is a runtime risk classifier that judges every command flowing through Hoop. Each input is sent to your configured LLM (OpenAI, Anthropic, Azure OpenAI, or any custom provider), classified as Low, Medium, or High risk, and matched against the policy you defined for that connection. The dangerous ones can be blocked before they ever reach production.

Why it matters. The same AI agents you’ve handed a database connection to can now write SQL, shell scripts, and infrastructure changes faster than any human reviewer. Static rules (regex, denylists, command parsers) miss intent. The Session Analyzer reads intent because it asks an LLM to.

What is the AI Session Analyzer?

A runtime classifier embedded in Hoop’s audit pipeline. When a user or agent runs a command through any Hoop connection, the input is intercepted on the SessionOpen packet, sent to your configured AI provider, and classified into one of three risk levels via tool calls. Your per-connection rule decides what happens at each level.

Three actions are available per risk level: allow, require approval, or block execution outright. The result (risk level, title, explanation, action taken) is persisted on the session record, fully auditable.

How does it classify risk?

The model is given a system prompt and forced to call exactly one of three tools:

  • LowRiskAISessionAnalyzer — non-destructive, scoped, low operational/security impact
  • MediumRiskAISessionAnalyzer — could cause performance issues, service disruption, sensitive exposure, or risky-but-not-clearly-destructive changes
  • HighRiskAISessionAnalyzer — destructive, irreversible, escalates privileges, exfiltrates data, disables defenses, or resembles exploit/persistence behavior

The tool call returns a short title and explanation. That output is what your audit log shows, what your reviewers see, and what your team learns from.

The “AI calls a tool” pattern is deliberate. Free-text classification is unreliable; tool selection is structured, testable, and auditable.
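
For concreteness, here is a minimal sketch of that pattern against Anthropic’s Messages API. The tool names mirror the three classifiers above; the system prompt, schemas, sample input, and model choice are illustrative assumptions, not Hoop’s actual configuration:

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 512,
    "system": "Classify the risk of the command. Call exactly one tool.",
    "tool_choice": {"type": "any"},
    "tools": [
      {"name": "LowRiskAISessionAnalyzer",
       "description": "Non-destructive, scoped, low operational or security impact.",
       "input_schema": {"type": "object",
         "properties": {"title": {"type": "string"}, "explanation": {"type": "string"}},
         "required": ["title", "explanation"]}},
      {"name": "MediumRiskAISessionAnalyzer",
       "description": "Disruptive, sensitive, or risky but not clearly destructive.",
       "input_schema": {"type": "object",
         "properties": {"title": {"type": "string"}, "explanation": {"type": "string"}},
         "required": ["title", "explanation"]}},
      {"name": "HighRiskAISessionAnalyzer",
       "description": "Destructive, irreversible, privilege escalation, or exfiltration.",
       "input_schema": {"type": "object",
         "properties": {"title": {"type": "string"}, "explanation": {"type": "string"}},
         "required": ["title", "explanation"]}}
    ],
    "messages": [{"role": "user", "content": "DELETE FROM orders WHERE archived = true"}]
  }'

Because the tool_choice forces a tool call, the model cannot answer in free text. The response’s tool_use block names the chosen classifier, and its arguments carry the title and explanation that land in the audit log.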

How is this different from regex-based command filtering?

Most access management platforms ship a denylist of dangerous commands or a regex filter against the query string. That approach has a known failure mode: any command not in the list passes. DROP TABLE users is blocked. WITH t AS (SELECT * FROM users) DELETE FROM t WHERE 1=1 is not.

The Session Analyzer reads intent. The model classifies what the command is trying to do, not what it literally says. A subquery wrapping a delete still gets flagged as High risk, because the model understands the operation. A SELECT against a sensitive table can be flagged Medium because the model sees the table name. Static rules cannot do either.
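
The failure mode is easy to demonstrate in one line of shell. The denylist pattern below is a toy of our own making, standing in for the static filters above:

echo "DROP TABLE users" | grep -qiE "drop table|truncate|delete from users" && echo BLOCKED
# prints BLOCKED

echo "WITH t AS (SELECT * FROM users) DELETE FROM t WHERE 1=1" | grep -qiE "drop table|truncate|delete from users" && echo BLOCKED
# prints nothing: the CTE aliases the table, and the destructive delete passes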

How does it integrate with approvals?

The Medium-risk action can be set to “require approval.” When that happens, the Session Analyzer’s risk title and explanation are attached to the approval request. Your reviewer doesn’t just see the command — they see why an LLM thinks it’s risky. Reviews are decided faster because the reasoning is already there.

This means the Session Analyzer is not just a gate. It is a triage layer that pre-explains the decision your humans need to make.
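
For illustration, here is roughly what a pending Medium-risk request might carry. The payload shape and field names are our sketch, not Hoop’s documented schema:

{
  "connection": "prod-postgres",
  "input": "UPDATE orders SET archived = true WHERE region = 'eu'",
  "status": "pending_review",
  "ai_analysis": {
    "risk_level": "medium",
    "title": "Bulk update on a production table",
    "explanation": "Rewrites every matching row in one statement; reversible in principle, disruptive if the predicate is wrong.",
    "action": "require_approval"
  }
}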

What ships in v1

  • Provider configuration at /api/ai/session-analyzer/providers. One provider per organization. Supports OpenAI, Anthropic, Azure OpenAI, and any OpenAI-compatible custom endpoint.
  • Per-connection rules at /api/ai/session-analyzer/rules. Each rule maps to one or more connections and specifies the action for each risk level (allow, require approval, block).
  • Runtime hook in the audit pipeline. Every exec command on a configured connection is analyzed before execution. Blocked sessions exit cleanly with a clear error.
  • Persistence on session records. Every session gets an ai_analysis field with risk level, title, explanation, and the action taken. Visible in the session list, the session detail view, and the audit export; a sketch of the shape follows this list.
  • Analytics events. hoop-session-ai-analysis-rule-created, hoop-session-ai-analysis-provider-updated, and per-session usage data so you can measure adoption.
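
As referenced above, here is a hedged sketch of the persisted field for a blocked session. The four values are the ones Hoop records; the exact key names are our assumption:

"ai_analysis": {
  "risk_level": "high",
  "title": "Unscoped delete wrapped in a CTE",
  "explanation": "The statement deletes every row matched by WHERE 1=1; the subquery only obscures the destructive operation.",
  "action": "block"
}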

How do you turn it on?

Three steps.

1. Configure a provider.

curl -X POST https://<your-hoop-host>/api/ai/session-analyzer/providers \
  -H "Authorization: Bearer <access-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "anthropic",
    "model": "claude-sonnet-4-5",
    "api_key": "<anthropic-api-key>"
  }'

2. Create a rule for a connection.

curl -X POST https://<your-hoop-host>/api/ai/session-analyzer/rules \
  -H "Authorization: Bearer <access-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "prod-postgres-policy",
    "connection_names": ["prod-postgres"],
    "risk_evaluation": {
      "low_risk_action": "allow",
      "medium_risk_action": "require_approval",
      "high_risk_action": "block"
    }
  }'

3. Use the connection. Every exec command through prod-postgres is now analyzed in real time. High-risk inputs are blocked before they execute. Medium-risk inputs go to your reviewers with the LLM’s reasoning attached. Low-risk inputs pass through.

Why now

Two reasons.

One: agentic systems are landing in production faster than approval workflows can keep up. Teams want a layer that scales with the volume — and humans alone do not. An LLM classifier reads every command in milliseconds.

Two: the cost of being wrong is asymmetric. Blocking a safe query is annoying. Letting through a DELETE against the wrong table is a postmortem. Tilting the system toward “ask the LLM, fall back to human approval” matches the asymmetry.

Operational value

You stop writing regex denylists. You stop maintaining command parsers per database engine. You configure one rule per connection and the model handles the long tail of inputs you would never have predicted.

For your reviewers, the approval queue gets smarter. Each pending request comes pre-explained: what the command is doing, why it might be risky, what tier the model thinks it falls into. Average review time goes down.

Strategic value

You get an auditable, model-driven reasoning layer in front of every production action. Your security team can review the LLM’s classifications across thousands of sessions and see patterns: which connections produce the most High-risk inputs, which users trigger Medium-risk classifications most often, which command types the model flags repeatedly.
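
If your audit export lands as JSON, those patterns are a one-liner away. A sketch, assuming the export is an array of session records carrying the ai_analysis field shown earlier; the file name and the connection key are illustrative:

# sessions by classified risk level
jq -r '.[].ai_analysis.risk_level' sessions.json | sort | uniq -c | sort -rn

# connections producing the most High-risk inputs
jq -r '.[] | select(.ai_analysis.risk_level == "high") | .connection' sessions.json | sort | uniq -c | sort -rn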

You also get model portability. Swap providers when a better one ships. The interface is stable. The classifications stay consistent because the system prompt and tool schema do.
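
In practice a swap is one call. This sketch assumes the providers endpoint from step 1 accepts a replacement POST, since each organization holds a single provider record:

curl -X POST https://<your-hoop-host>/api/ai/session-analyzer/providers \
  -H "Authorization: Bearer <access-token>" \
  -H "Content-Type: application/json" \
  -d '{"provider": "openai", "model": "gpt-4o", "api_key": "<openai-api-key>"}'

Rules, actions, and audit fields are untouched; only the judge changes.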

What’s next

This week’s release is the foundation. Two pieces are already in flight for May.

  • A native MCP admin server, so any AI agent can manage Hoop resources through the same auth and audit pipeline as a human.
  • User-facing MCP tools, so any developer’s agent can run queries and hit approval gates the same way humans do.

Both compose with the Session Analyzer: an agent’s commands also get classified, explained, and gated.

Try it

Hoop is open source under MIT. Free for small teams.

curl -sL https://hoop.dev/docker-compose.yml > docker-compose.yml && docker compose up

If you cannot explain why your AI agent’s last command was safe to run, the Session Analyzer can.
