June 18, 2026 · 4 min read

How to Implement IAM for Self-Hosted Models

A former contractor still holds a hard‑coded API key that grants unrestricted access to your internal model serving endpoint. The key was never rotated, and the contractor’s Slack account was deactivated weeks ago. Yet the token can still be used to query the model, retrieve proprietary prompts, and even trigger batch inference jobs.

Coleman Nye

This is a textbook example of how many teams protect self‑hosted machine‑learning models today. The model server is typically exposed behind a simple HTTP or gRPC listener. Access is granted via static credentials, service accounts, long‑lived API keys, or overly broad IAM roles. Those credentials are checked by the application code, not by a centralized policy engine. Because the request travels directly to the model process, there is no independent audit trail, no real‑time data masking, and no opportunity to require human approval before a risky operation runs.

Relying on static secrets creates a cascade of risks. If a key leaks, an attacker can enumerate model versions, exfiltrate training data, or launch denial‑of‑service attacks that impact downstream services. Shared credentials also make it impossible to answer “who did what” questions during a post‑mortem, which is a red flag for any compliance audit. Moreover, over‑privileged roles give any compromised service account a larger blast radius than necessary.

Why IAM matters for self‑hosted models

Identity and Access Management (IAM) is the practice of granting the right identity the minimum set of permissions needed to perform a specific action, at the right time, and for the right reason. For self‑hosted models, IAM should enforce:

Per‑user or per‑service identity verification via OIDC/SAML tokens.
Fine‑grained policies that limit which model endpoints can be called, what payload sizes are allowed, and which output fields are visible.
Just‑in‑time approvals for high‑impact operations such as bulk inference or model re‑training triggers.
Comprehensive session recording for later replay and audit.
Inline masking of sensitive fields in model responses (e.g., personally identifiable information).

Even with a solid IAM strategy, the enforcement point matters. If the policy check lives inside the model server’s code, a compromised binary can bypass it. The request still reaches the model directly, so there is no guaranteed point where masking or approval can be applied.
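
To make the fine‑grained requirements above concrete, here is a minimal Python sketch of a least‑privilege policy check. It is an illustration only, not hoop.dev's policy language: the ModelPolicy class, its field names, and the endpoint paths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPolicy:
    """Hypothetical least-privilege policy for one identity or group."""
    allowed_endpoints: frozenset   # endpoints the caller may invoke
    max_payload_bytes: int         # upper bound on request body size
    visible_fields: frozenset      # response fields the caller may see


def check_request(policy: ModelPolicy, endpoint: str, payload: bytes) -> tuple[bool, str]:
    """Return (allowed, reason) for a single request."""
    if endpoint not in policy.allowed_endpoints:
        return False, f"endpoint {endpoint!r} not permitted"
    if len(payload) > policy.max_payload_bytes:
        return False, "payload exceeds size limit"
    return True, "ok"


def filter_response(policy: ModelPolicy, response: dict) -> dict:
    """Drop every response field the policy does not expose."""
    return {k: v for k, v in response.items() if k in policy.visible_fields}
```

The check is deny‑by‑default: an endpoint or field that the policy does not list is rejected or dropped. Where this code runs matters as much as what it does, which is the point of the paragraph above.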

hoop.dev as the data‑path enforcement layer

Enter hoop.dev. It is a Layer 7 gateway that sits between every identity (engineer, CI job, AI agent) and the self‑hosted model endpoint. The gateway verifies OIDC or SAML tokens, reads group membership, and then decides whether the request is allowed to proceed. Because all traffic to the model passes through hoop.dev, it is the one place where enforcement can reliably happen.

In practice, you deploy the gateway alongside your model servers, often via Docker Compose for a quick start or via Kubernetes for production. An agent runs inside the same network segment as the model, holding the credentials that the model service needs. Users never see those credentials; they only present their identity token to hoop.dev.

Setup: identity and provisioning

The first step is to configure an OIDC or SAML identity provider (Okta, Azure AD, Google Workspace, etc.) as the source of truth for who can request access. You then map provider groups to hoop.dev policies that express the fine‑grained permissions described earlier. This mapping is the “setup” phase: it establishes who is making the request and whether the request may even start, but it does not enforce any guardrails on its own.
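
As a rough illustration of group‑to‑policy mapping, the Python sketch below resolves a caller's effective permissions from the groups in an already‑verified token. The group names, the "groups" claim name, and the policy fields are assumptions made for the example, not hoop.dev configuration.

```python
# Hypothetical mapping from identity-provider group names to policy settings.
GROUP_POLICIES = {
    "ml-engineers": {"endpoints": {"/v1/predict", "/v1/batch"}, "max_batch": 100},
    "ci-jobs": {"endpoints": {"/v1/predict"}, "max_batch": 10},
}


def resolve_policy(claims: dict) -> dict:
    """Merge the policies of every mapped group listed in the token claims.

    Assumes groups arrive in a "groups" claim; the claim name varies by provider.
    Unmapped groups contribute nothing, so a caller with no mapped group
    ends up with an empty policy (deny by default).
    """
    endpoints = set()
    max_batch = 0
    for group in claims.get("groups", []):
        policy = GROUP_POLICIES.get(group)
        if policy is None:
            continue  # unknown group: grants nothing
        endpoints |= policy["endpoints"]
        max_batch = max(max_batch, policy["max_batch"])
    return {"endpoints": endpoints, "max_batch": max_batch}
```

This sketch takes the union of permissions across groups, which is the permissive choice. A stricter design could take the intersection of endpoints and the smallest limit instead.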

Data‑path enforcement

When a request arrives, hoop.dev examines the protocol payload. It can:

Block commands that exceed defined limits (e.g., batch size thresholds).
Route high‑risk calls to a human approver before they reach the model.
Mask fields that match privacy rules in the model’s response.
Record the entire session for replay, creating a reliable audit trail.

All of these outcomes exist because hoop.dev sits in the data path. Without the gateway, the model server alone cannot guarantee that a request was approved, that sensitive data was redacted, or that the session was captured.
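
The four outcomes listed above can be sketched in a few lines of Python. This is a simplified model of what a data‑path gateway might do, not hoop.dev's implementation: the thresholds, the email‑only masking rule, and the audit record shape are all hypothetical.

```python
import re
import time
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    BLOCK = "block"


# Hypothetical thresholds; real values depend on your own risk model.
APPROVAL_BATCH_SIZE = 100
MAX_BATCH_SIZE = 1000

# Deliberately simple email pattern, used only to illustrate masking.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def decide(batch_size: int) -> Decision:
    """Block oversized batches, route large ones to a human, allow the rest."""
    if batch_size > MAX_BATCH_SIZE:
        return Decision.BLOCK
    if batch_size > APPROVAL_BATCH_SIZE:
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW


def mask_response(text: str) -> str:
    """Redact email addresses before the response leaves the gateway."""
    return EMAIL_RE.sub("[REDACTED]", text)


def audit_record(user: str, batch_size: int, decision: Decision) -> dict:
    """Build one audit entry; a real system would write it to durable storage."""
    return {
        "ts": time.time(),
        "user": user,
        "batch_size": batch_size,
        "decision": decision.value,
    }
```

For example, decide(500) returns Decision.REQUIRE_APPROVAL under these thresholds, and mask_response replaces any email address in the model output with [REDACTED].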

Common mistakes to avoid

Relying on long‑lived static credentials. Even with strict IAM policies, a static key can be reused indefinitely. Use short‑lived tokens issued by your identity provider and let hoop.dev enforce their expiration.
Embedding policy logic in the model application. Placing guards inside the service makes them vulnerable to tampering. Keep enforcement at the gateway, where the calling application or agent cannot modify it.
Neglecting group‑to‑policy mapping. If you grant broad group membership without precise policy definitions, you lose the “least privilege” benefit.
Assuming audit logs are safe without a gateway. The model process may rotate logs or truncate them. hoop.dev records sessions outside the model, preserving evidence.
Skipping approval workflows for high‑impact actions. Operations like bulk inference or model version changes should trigger a JIT approval step, which hoop.dev can enforce automatically.
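
To illustrate the first mistake in the list above, here is a small Python sketch that rejects expired or long‑lived tokens. It assumes the token's signature has already been verified elsewhere and only inspects the standard JWT exp and iat claims. The 15‑minute ceiling and the function name are hypothetical.

```python
import time

MAX_TOKEN_LIFETIME_SECONDS = 15 * 60  # hypothetical ceiling: 15 minutes


def token_is_acceptable(claims: dict, now=None) -> bool:
    """Check already-verified JWT claims for expiry and excessive lifetime.

    This does NOT verify the signature; that must happen before this call.
    """
    now = time.time() if now is None else now
    exp = claims.get("exp")
    iat = claims.get("iat")
    if exp is None or iat is None:
        return False  # no explicit lifetime: reject
    if now >= exp:
        return False  # expired
    if exp - iat > MAX_TOKEN_LIFETIME_SECONDS:
        return False  # long-lived token behaves like a static key
    return True
```

Capping the lifetime, and not only checking expiry, is what keeps a token with a year‑long validity from quietly becoming another static credential.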

Best practices for IAM with self‑hosted models

Configure your identity provider to issue short‑lived tokens and enforce MFA for privileged groups.
Map each group to a specific hoop.dev policy that defines allowed endpoints, payload limits, and masking rules.
Enable session recording for every connection; store the logs in a secure location for compliance.
Use hoop.dev’s built‑in approval workflow for any request that exceeds a risk threshold.
Regularly review and rotate any static credentials that might still exist outside the gateway.

These steps ensure that IAM is not just a paperwork exercise but an enforceable control that lives where the traffic actually flows.

Getting started

To see the full implementation details, follow the getting‑started guide. It walks you through deploying the gateway, connecting it to an OIDC provider, and defining policies for a self‑hosted model service. For deeper insight into policy language, masking options, and approval workflows, explore the learn section of the documentation.

The entire solution is open source. You can review, fork, or contribute to the codebase on GitHub: hoop.dev repository.

FAQ

Is hoop.dev compatible with any OIDC provider?

Yes. hoop.dev acts as a relying party and can validate tokens from any OIDC identity provider that issues standard JWT claims, and assertions from any SAML identity provider, as long as group information is included.

Do I need to change my existing model code?

No. The gateway intercepts traffic at the protocol layer, so your model server continues to listen on its native port and protocol. The only change is to point clients at the hoop.dev endpoint instead of the raw model address.

How does hoop.dev protect the credentials it stores?

Credentials are kept inside the network‑resident agent that runs alongside the model. They never leave the internal network and are never exposed to end users or client applications.
