Lateral movement across a network of self‑hosted AI models can let an attacker pivot from a compromised service to the core inference engine and exfiltrate proprietary weights.
Why the current approach leaves models exposed
Most teams spin up a model server on a VM or container, expose a REST endpoint, and give engineers a shared SSH key or static service account token to reach it. The credential is stored in a configuration file, copied between developers, and often checked into version control. Network policies are coarse – the model port is open to the entire internal subnet, and no one records which command or query triggered a response. When a breach occurs, the only evidence is a handful of system logs that do not show who queried the model, what payload was sent, or whether the response contained sensitive data. The result is a blind spot that enables lateral movement without detection.
What needs to change before we can claim safety
We must enforce identity‑aware access at the point where traffic reaches the model, yet in the setup described above the request still travels directly to the model process. In other words, we need a gate that can verify the caller, apply just‑in‑time approvals, mask sensitive fields in the model’s output, and record the entire session. The gate must sit in the data path; otherwise the model process is the only place enforcement could happen, and an attacker who compromises that process could simply bypass any local checks.
Setting up OIDC or SAML authentication, assigning least‑privilege service accounts, and configuring network segmentation are necessary steps. They decide who may start a connection, but they do not provide the runtime enforcement required to stop lateral movement. Without a dedicated data‑path component, the connection still reaches the model unfiltered, and no audit trail is generated.
How hoop.dev provides the missing control layer
hoop.dev is a layer‑7 gateway that sits between identities and the model server. It proxies the HTTP or gRPC traffic, inspects each request, and applies policy before the payload reaches the model. Because hoop.dev is the only point that can see the full request and response, it can enforce three critical outcomes:
- Session recording: hoop.dev captures every query and response, creating a replay that auditors can review to trace lateral movement attempts.
- Inline masking: Sensitive fields such as personally identifiable information or proprietary token strings are redacted in real time, preventing accidental leakage to downstream services.
- Just‑in‑time approval: High‑risk operations – for example, requests that trigger model fine‑tuning or export of weights – are routed to a human approver before execution.
All of these outcomes exist only because hoop.dev occupies the data path. If the gateway were removed, the model would again receive raw traffic without any of the above safeguards.
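To make the three outcomes concrete, here is a minimal Python sketch of a gate that sits in front of a model backend. It is not hoop.dev's implementation or API. The `Gateway` class, the endpoint paths, the redaction patterns, and the `approver` callback are all hypothetical names chosen for illustration.

```python
import re
import time
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical redaction patterns; a real deployment would define its own.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # US SSN-like number
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),   # email address
]

# Hypothetical endpoints treated as high risk (fine-tuning, weight export).
HIGH_RISK_PATHS = {"/v1/fine-tune", "/v1/weights/export"}


def mask(text: str) -> str:
    """Redact every match of the sensitive patterns."""
    for pattern in SENSITIVE_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


@dataclass
class Gateway:
    backend: Callable[[str, str], str]    # (path, payload) -> raw model output
    approver: Callable[[str, str], bool]  # (user, path) -> approved by a human?
    audit_log: list = field(default_factory=list)

    def handle(self, user: str, path: str, payload: str) -> dict:
        # Just-in-time approval: high-risk calls need an explicit yes.
        if path in HIGH_RISK_PATHS and not self.approver(user, path):
            self._record(user, path, payload, None, "denied")
            return {"status": 403, "body": "approval required"}
        # Inline masking: redact before the response leaves the gate.
        response = mask(self.backend(path, payload))
        self._record(user, path, payload, response, "allowed")
        return {"status": 200, "body": response}

    def _record(self, user: str, path: str, payload: str,
                response: Optional[str], decision: str) -> None:
        # Session recording: one entry per request, allowed or denied.
        self.audit_log.append({
            "ts": time.time(), "user": user, "path": path,
            "payload": payload, "response": response, "decision": decision,
        })
```

Because the backend is reachable only through `handle`, every request produces an audit entry and every response passes through `mask`. This is the property the text attributes to sitting in the data path.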
Setup: identity and least‑privilege grants
Engineers authenticate through an OIDC provider such as Okta or Azure AD. hoop.dev validates the token, extracts group membership, and maps it to fine‑grained policies that define which model endpoints a user may call. Service accounts receive narrowly scoped roles that allow only inference, not model management. This setup ensures that each request is tied to a known identity before the gateway forwards it to the model.
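The mapping from group membership to permitted endpoints can be sketched as follows. This is an illustration under stated assumptions, not hoop.dev's policy format. It assumes the token's signature, expiry, and audience have already been verified and that group membership arrives in a `groups` claim; the claim name depends on how the identity provider is configured. The group names and endpoint paths are hypothetical.

```python
# Hypothetical policy table: group name -> endpoints the group may call.
POLICIES = {
    "ml-engineers": {"/v1/infer", "/v1/embeddings"},
    "ml-admins": {"/v1/infer", "/v1/embeddings", "/v1/fine-tune"},
    "svc-inference": {"/v1/infer"},  # service account: inference only
}


def allowed_endpoints(claims: dict) -> set:
    """Union of endpoints granted by every group in the verified claims."""
    allowed = set()
    for group in claims.get("groups", []):
        # Unknown groups grant nothing (deny by default).
        allowed |= POLICIES.get(group, set())
    return allowed


def is_allowed(claims: dict, path: str) -> bool:
    """True only if some group in the claims grants access to the path."""
    return path in allowed_endpoints(claims)
```

For example, `is_allowed({"groups": ["svc-inference"]}, "/v1/fine-tune")` returns `False`, which matches the rule that service accounts may run inference but not manage the model. A token with no `groups` claim, or only unrecognized groups, is granted no endpoints.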
