PHI for Inference: A Compliance Guide

How can you prove that your AI inference pipelines handle PHI without breaking compliance?

Regulators expect every system that touches protected health information to provide an audit trail of who accessed what, when, and why. In practice, many organizations treat inference services as a black box: a model runs behind a load balancer, a service account holds static credentials, and data streams in and out without any visibility. The result is a compliance gap that is hard to close during an audit because the evidence is either missing or scattered across log aggregators, application logs, and ad‑hoc spreadsheets.

That gap starts with the way teams currently connect to inference endpoints. Engineers often create a single API key or service account that is shared across multiple projects. The key lives in configuration files, CI pipelines, or container images. When a request arrives, the inference server authenticates the static credential, fetches the model, and returns a response. No per‑request identity is recorded, no inline data redaction occurs, and no human approval step exists. If a data‑science experiment inadvertently sends a patient’s name to a model that was never trained to handle identifiers, the leakage goes unnoticed until someone discovers it in a downstream data set.

Even when organizations adopt non‑human identities such as OIDC tokens, service‑to‑service SAML assertions, or short‑lived IAM roles, the request still reaches the inference service directly. The component that verifies the token does not sit on the data path; it only decides whether the request may start. Because the enforcement point is outside the traffic flow, the system cannot block a dangerous payload, mask a social‑security number embedded in a JSON body, or record the exact query that triggered the model. The result is a partial fix: authentication is hardened, but the core compliance controls remain absent.

To bridge that gap, the control surface must be placed where the request actually travels. hoop.dev acts as a Layer 7 identity‑aware proxy that sits between the authenticated identity and the inference endpoint. When every inference call is routed through hoop.dev, the gateway becomes the single place where policy is enforced.

How hoop.dev creates continuous PHI evidence

When a request reaches hoop.dev, the gateway performs three critical actions that generate audit‑ready evidence for PHI compliance:

Session recording. hoop.dev captures the full request and response payloads, timestamps each interaction, and stores the record in an audit log. The log can be replayed later to answer “who saw which PHI value and when?”
Inline masking. Before the payload is forwarded to the model, hoop.dev can redact or pseudonymize fields that contain PHI, such as patient identifiers or medical record numbers. The original value is never exposed to the inference engine, reducing the risk of accidental leakage.
Just‑in‑time approval. If a request matches a high‑risk pattern, e.g., a bulk query for a full patient cohort, hoop.dev routes the request to an approver. The approver’s decision is logged alongside the request, creating a clear chain of custody.
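
The inline masking step described above can be illustrated with a short Python sketch. It is not hoop.dev's implementation or configuration format; the field names in PHI_FIELDS, the SSN pattern, and the mask_phi helper are all assumptions chosen for illustration. The idea is only that a component on the data path can walk a JSON payload and redact values before anything is forwarded.

```python
import re

# Hypothetical policy: field names treated as PHI, plus a pattern for
# US social-security numbers embedded in free text.
PHI_FIELDS = {"patient_name", "mrn", "date_of_birth"}
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
MASK = "[REDACTED]"


def mask_phi(value, key=None):
    """Return a masked copy of `value`; the input is left unmodified."""
    if key in PHI_FIELDS:
        return MASK  # redact the whole field, whatever its type
    if isinstance(value, dict):
        return {k: mask_phi(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [mask_phi(item) for item in value]
    if isinstance(value, str):
        return SSN_PATTERN.sub(MASK, value)  # scrub SSNs inside free text
    return value


payload = {
    "patient_name": "Jane Doe",
    "mrn": "MRN-00123",
    "note": "Patient SSN 123-45-6789, reports chest pain.",
    "vitals": [{"hr": 88}],
}
masked = mask_phi(payload)
```

Here masked carries "[REDACTED]" for patient_name and mrn, the SSN inside note is replaced, and non-PHI values such as vitals pass through unchanged. A real policy would need far broader detection than one regular expression.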

All of these outcomes are possible only because hoop.dev sits in the data path. Without that placement, a token‑validation service could not block a request, could not mask fields, and could not guarantee that a replayable record exists.
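
To make the data‑path argument concrete, here is a minimal Python sketch of an in‑path handler. It is a toy model, not hoop.dev code: the Gateway class, its callbacks, the cohort_size field, and the stub approver are hypothetical. It shows why placement matters: because the handler holds the payload, it can mask it, hold it for approval, and write a record before the model sees anything.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Gateway:
    forward: Callable[[dict], dict]       # sends the payload to the model
    mask: Callable[[dict], dict]          # redacts PHI fields
    is_high_risk: Callable[[dict], bool]  # e.g. detects bulk cohort queries
    approve: Callable[[str, dict], bool]  # asks a human approver
    audit_log: list = field(default_factory=list)

    def handle(self, identity: str, payload: dict) -> dict:
        masked = self.mask(payload)
        record = {"identity": identity, "request": masked, "approved": None}
        if self.is_high_risk(payload):
            record["approved"] = self.approve(identity, masked)
            if not record["approved"]:
                record["status"] = "denied"
                self.audit_log.append(record)
                return {"error": "denied"}  # the model is never called
        response = self.forward(masked)      # the model sees masked data only
        record["status"] = "forwarded"
        self.audit_log.append(record)
        return response


seen_by_model = []


def fake_model(payload: dict) -> dict:
    seen_by_model.append(payload)
    return {"label": "low-risk"}


gateway = Gateway(
    forward=fake_model,
    mask=lambda p: {k: ("[REDACTED]" if k == "patient_name" else v)
                    for k, v in p.items()},
    is_high_risk=lambda p: p.get("cohort_size", 1) > 100,
    approve=lambda identity, p: False,  # stub approver that rejects everything
)

gateway.handle("svc-triage", {"patient_name": "Jane Doe", "age": 54})
gateway.handle("svc-research", {"cohort_size": 5000})
```

After these two calls, the model stub has seen exactly one payload (with the name redacted), and the audit log holds two records, the second marked as denied. A token validator that sits off the data path could produce neither effect.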

Why continuous evidence matters for auditors

Auditors frequently ask for evidence against three HIPAA Security Rule technical safeguards: access control, audit controls, and integrity. hoop.dev supports each of them by producing a single source of truth, generated automatically for every inference request. The audit log includes:

The identity that initiated the request (derived from the OIDC token).
The exact query parameters, with PHI fields masked according to policy.
Any approval actions taken, including approver name and timestamp.
The response status and any error codes.
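
As an illustration of the four items above, an audit entry could be modeled roughly as follows. This is a hypothetical Python schema, not hoop.dev's actual log format; the class names, field names, and example values are assumptions.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Approval:
    approver: str    # who decided
    decided_at: str  # ISO 8601 timestamp of the decision
    granted: bool


@dataclass(frozen=True)
class AuditRecord:
    identity: str                  # subject taken from the OIDC token
    recorded_at: str               # ISO 8601 timestamp of the request
    query: dict                    # parameters with PHI already masked
    approval: Optional[Approval]   # None when no approval was required
    status: int                    # response status
    error_code: Optional[str] = None


def to_log_line(record: AuditRecord) -> str:
    """Serialize a record as one JSON line with a stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = AuditRecord(
    identity="svc-research@example.org",
    recorded_at="2024-05-01T12:00:00+00:00",
    query={"cohort": "cardiology", "patient_name": "[REDACTED]"},
    approval=Approval("dr.lee@example.org", "2024-05-01T12:00:30+00:00", True),
    status=200,
)
line = to_log_line(record)
```

Keeping each entry as one self‑describing JSON line is a common choice because an auditor can filter by identity or approver without joining several log sources.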

This granularity lets auditors trace the flow of PHI from ingestion to model output without piecing together logs from multiple services. Because hoop.dev is open source, organizations can also inspect the code that generates the evidence, which helps with compliance programs that ask for independent verification of tooling.

Getting started with hoop.dev for inference workloads

Deploying hoop.dev does not require rewriting your inference client. The gateway runs as a Docker Compose service for quick evaluation, or as a Kubernetes deployment for production workloads. After deployment, register your inference endpoint as a connection, attach the appropriate credential (for example, a short‑lived IAM role), and enable the PHI‑specific guardrails in the configuration UI.

The getting‑started guide walks you through the steps to provision the gateway, bind it to your OIDC provider, and define masking rules for PHI fields. For deeper details on the masking and approval features, see the learn section of the documentation.

FAQ

Does hoop.dev store PHI itself?

No. hoop.dev records the request metadata and the masked version of the payload. Masking is applied at the gateway before the payload is forwarded to the model or written to the log, so the stored records contain only the sanitized data needed for audit.

Can I use hoop.dev with existing inference clients?

Yes. hoop.dev proxies standard protocols such as HTTP/HTTPS, gRPC, and custom TCP streams. Your client simply points to the gateway address instead of the raw model endpoint.
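
For an HTTP client, the change can be as small as swapping the base URL. The Python sketch below assumes placeholder hostnames, a hypothetical /v1/predict path, and an INFERENCE_URL environment variable; how a hoop.dev connection is actually addressed and authenticated is covered in the project's documentation, not here.

```python
import json
import os
import urllib.request

# Placeholder addresses; neither is a real hoop.dev or model endpoint.
DIRECT_URL = "https://model.internal.example/v1/predict"
GATEWAY_URL = "https://gateway.internal.example/v1/predict"


def build_request(payload: dict, base_url: str = "") -> urllib.request.Request:
    """Build a POST request aimed at the gateway unless told otherwise."""
    url = base_url or os.environ.get("INFERENCE_URL", GATEWAY_URL)
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request({"note": "reports chest pain"}, base_url=GATEWAY_URL)
# urllib.request.urlopen(req) would send it; omitted because the hosts
# above are placeholders.
```

Reading the address from configuration rather than hard‑coding DIRECT_URL means the rest of the client code stays untouched when traffic is moved behind the gateway.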

What if I need to revoke access quickly?

Remove the user’s group membership in the identity provider. Because hoop.dev enforces access at the gateway, the next request will be denied before any PHI reaches the model.
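
The revocation flow can be pictured with a small Python sketch. The group name, the in‑memory directory, and the authorize function are hypothetical stand‑ins for an identity provider and a gateway policy check; this is not hoop.dev's API.

```python
# Hypothetical policy: only members of this group may call the PHI endpoint.
ALLOWED_GROUPS = {"phi-inference-users"}

# Stand-in for the identity provider's group directory.
directory = {"alice@example.org": {"phi-inference-users", "staff"}}


def authorize(user: str, groups_by_user: dict) -> bool:
    """Allow the request only if the user is in at least one allowed group."""
    groups = groups_by_user.get(user, set())
    return bool(groups & ALLOWED_GROUPS)


before = authorize("alice@example.org", directory)  # membership present

# Revocation: remove the group membership at the identity provider.
directory["alice@example.org"].discard("phi-inference-users")

after = authorize("alice@example.org", directory)   # checked again per request
```

In this toy model the check runs against live directory data on every request, so revocation is immediate. In a real deployment, how fast it takes effect also depends on token lifetimes and on how often group claims are refreshed.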

Ready to see how continuous PHI evidence can be built into your inference pipeline? Explore the open‑source repository on GitHub: github.com/hoophq/hoop.