PHI for Structured Output: A Compliance Guide

Many teams believe that simply encrypting a file that contains PHI satisfies every regulator’s demand for protection. In reality, encryption alone does not demonstrate who accessed the data, what was returned, or whether a privileged query was approved.

Regulators such as the U.S. Department of Health & Human Services expect more than a locked container. They expect evidence that shows exactly what data was returned, that every request for PHI was authorized, and that the requestor’s identity was verified at the moment of access.

When output is structured (JSON, CSV, or HL7 messages), those expectations become even stricter because the format allows each field to be inspected individually for sensitive content.
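To illustrate why structured formats invite field-level scrutiny, here is a minimal Python sketch that parses a CSV export and reports which columns carry PHI. The column names and the `PHI_FIELDS` set are assumptions made for this example; a real deployment would derive them from its own data classification.

```python
import csv
import io

# Hypothetical classification: field names treated as PHI in this example.
PHI_FIELDS = {"patient_name", "ssn", "date_of_birth", "diagnosis"}


def phi_columns(csv_text: str) -> list:
    """Return the header names in a CSV document that are classified as PHI."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # fieldnames is None for an empty document, so fall back to an empty list.
    return [name for name in (reader.fieldnames or []) if name in PHI_FIELDS]


sample = "visit_id,patient_name,ssn,clinic\n1001,Jane Roe,000-00-0000,North\n"
print(phi_columns(sample))  # ['patient_name', 'ssn']
```

Because every value sits under a named column, a reviewer (or an automated check) can point at the exact fields that need masking or approval, which is harder to do with free text.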

Compliance programs therefore look for three core artifacts: a record of the identity that initiated the request, a log of the query or command that produced the output, and a guarantee that the system masks any PHI in the response or releases it only after explicit approval.

Without a unified control point, organizations end up stitching together separate identity providers, logging agents, and ad‑hoc masking scripts, which leaves gaps that auditors can flag.

Even when a company deploys strong identity federation (OIDC, SAML) and least‑privilege service accounts, those pieces only decide *who* may start a connection. They do not enforce *what* that connection can do, nor do they capture the exact data that flows through it. The enforcement must occur where the data actually travels.

What regulators expect for PHI in structured output

Regulatory frameworks define PHI as any individually identifiable health information. When such data is emitted as structured output, the following controls are typically required:

  • Identity verification at request time, with evidence that the user or service account possessed the necessary role.
  • Just‑in‑time (JIT) approval for any operation that could return PHI, ensuring a human reviewer signs off before the data leaves the system.
  • Field‑level masking for any PHI that is not needed for the downstream consumer, applied in real time as the response is generated.
  • Immutable session recording that captures the full request and response payload, enabling replay for audit or forensic analysis.
  • A single, tamper‑evident audit trail that ties the request, approval, masking decision, and session record together.

The component that sits in the data path must generate each of these artifacts, otherwise the organization cannot prove that the controls were actually applied.
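One common way to make an audit trail tamper-evident is a hash chain, where each entry's hash covers the hash of the entry before it. The sketch below is a generic illustration of that idea, not a description of any particular product's storage format; the event fields (`type`, `user`, `approver`, and so on) are invented for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def append_entry(trail: list, event: dict) -> dict:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = trail[-1]["hash"] if trail else GENESIS
    body = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
    entry = {"event": event, "prev_hash": prev_hash, "hash": digest}
    trail.append(entry)
    return entry


def verify(trail: list) -> bool:
    """Recompute every hash; an edited entry or a broken link fails the check."""
    prev_hash = GENESIS
    for entry in trail:
        body = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


trail = []
append_entry(trail, {"type": "request", "user": "alice", "query": "SELECT * FROM visits"})
append_entry(trail, {"type": "approval", "approver": "bob", "granted": True})
append_entry(trail, {"type": "masking", "fields": ["ssn"]})
print(verify(trail))  # True

trail[1]["event"]["granted"] = False  # simulate tampering with the approval
print(verify(trail))  # False
```

A chain like this detects edits to existing entries, but it does not by itself detect truncation of the most recent entries; that usually requires anchoring the latest hash somewhere the writer cannot modify.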

Why typical pipelines fall short

Most data pipelines start with an identity provider that issues a JWT or SAML assertion. The application then connects directly to the database or API using a static credential stored in a secret manager. Logging often runs inside the application process itself, allowing the process to alter the logs if it is compromised. Developers usually add masking as a post‑processing step, after the system fetches the data from the source.

In this model, approval, masking, and session capture all depend on the application’s own code, and nothing outside that code guarantees they happen. If a developer forgets to call the masking library, or if an attacker injects code that bypasses the approval check, the audit trail becomes incomplete and PHI may be exposed.

Furthermore, because the database or service receives the raw request, it cannot enforce field‑level policies on its own. The responsibility spreads across multiple layers, making it difficult for auditors to verify that every step was performed consistently.

How a data‑path gateway solves the problem

Placing a Layer 7 gateway between the identity layer and the target resource creates a single enforcement point. The gateway verifies the user’s token, checks whether the requested operation is allowed, routes risky queries to an approval workflow, masks PHI fields in the response, and records the entire session for later replay.

Because every request and response passes through the gateway in clear text, it is the one place where the required controls can be applied reliably.

In this architecture, the setup phase (defining OIDC clients, assigning roles, provisioning service accounts) decides who may initiate a connection, but it does not enforce what happens after the connection is opened. The gateway, sitting in the data path, is the sole mechanism that can enforce JIT approval, inline masking, and immutable session recording.
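The order of controls in such a gateway can be sketched in a few lines of Python. Everything here is hypothetical and simplified for illustration: the `Gateway` class, the `touches_phi` heuristic, the group names, and the in-memory `sessions` list stand in for a real policy engine, approval workflow, and session store. For brevity the sketch masks every PHI field rather than deciding per consumer.

```python
from dataclasses import dataclass, field
from typing import Callable

PHI_FIELDS = {"ssn", "diagnosis"}           # hypothetical PHI classification
ALLOWED_GROUPS = {"clinicians", "billing"}  # hypothetical role policy


def touches_phi(query: str) -> bool:
    """Crude stand-in for a policy check: the query names a PHI field or uses '*'."""
    lowered = query.lower()
    return "*" in lowered or any(name in lowered for name in PHI_FIELDS)


@dataclass
class Gateway:
    backend: Callable[[str], list]        # runs the query against the target
    approver: Callable[[str, str], bool]  # human sign-off: (user, query) -> bool
    sessions: list = field(default_factory=list)

    def handle(self, user: str, groups: set, query: str) -> list:
        session = {"user": user, "query": query, "outcome": "pending"}
        self.sessions.append(session)  # recorded even when the request fails
        # 1. Identity and role check.
        if not groups & ALLOWED_GROUPS:
            session["outcome"] = "denied"
            raise PermissionError("caller is not in a permitted group")
        # 2. Just-in-time approval before anything reaches the backend.
        if touches_phi(query):
            session["approved"] = self.approver(user, query)
            if not session["approved"]:
                session["outcome"] = "rejected"
                raise PermissionError("approval was not granted")
        # 3. Forward the request, then mask PHI fields in the response.
        rows = self.backend(query)
        masked = [
            {k: "***" if k in PHI_FIELDS else v for k, v in row.items()}
            for row in rows
        ]
        # 4. Record what was actually returned.
        session["response"] = masked
        session["outcome"] = "returned"
        return masked


gateway = Gateway(
    backend=lambda query: [{"visit_id": 1, "ssn": "000-00-0000"}],
    approver=lambda user, query: True,
)
print(gateway.handle("alice", {"clinicians"}, "SELECT visit_id, ssn FROM visits"))
```

The point of the sketch is the placement: the caller never holds a backend credential, so there is no code path that reaches the data without passing through steps 1 to 4.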

Implementing PHI‑safe structured output with hoop.dev

hoop.dev provides the exact data‑path gateway described above. It sits between identities and infrastructure such as databases, SSH, or HTTP APIs.

When a request arrives, hoop.dev validates the OIDC or SAML token, extracts group membership, and determines whether the operation is permitted. If the request could return PHI, hoop.dev triggers a just‑in‑time approval workflow before the query reaches the target.

Once approval is granted, hoop.dev forwards the request to the backend. As the response streams back, hoop.dev inspects each record and masks any PHI fields that are not required for the downstream consumer.

hoop.dev applies the masking in real time, so PHI that a policy marks for masking never leaves the gateway unmasked.
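The idea of masking records as they stream back can be illustrated with a small Python generator. This is a generic sketch, not hoop.dev's API or configuration format; the `mask_stream` function, the `PHI_FIELDS` set, and the `needed` allow-list are all assumptions made for the example.

```python
from typing import Iterable, Iterator

# Hypothetical classification of which field names count as PHI.
PHI_FIELDS = {"patient_name", "ssn", "diagnosis"}


def mask_stream(
    records: Iterable[dict],
    needed: set,
    placeholder: str = "[MASKED]",
) -> Iterator[dict]:
    """Yield each record with PHI fields masked unless the consumer needs them."""
    for record in records:
        yield {
            key: placeholder if key in PHI_FIELDS and key not in needed else value
            for key, value in record.items()
        }


rows = [
    {"visit_id": 1001, "patient_name": "Jane Roe", "ssn": "000-00-0000", "diagnosis": "J45"},
]
for row in mask_stream(rows, needed={"diagnosis"}):
    print(row)
```

Because the generator handles one record at a time, masking happens as the response flows through, and the full unmasked result set never has to be held or forwarded.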

Simultaneously, hoop.dev records the full session, including the original request, the approval decision, the masked response, and timestamps, in a single, tamper‑evident audit trail that auditors can query to prove compliance.

Because all clear‑text data passes through hoop.dev on its way to the consumer, hoop.dev is where the enforcement outcomes required for PHI compliance are produced.

To get started, follow the getting‑started guide and review the feature documentation for details on configuring PHI masking policies, approval workflows, and session replay. The open‑source repository contains the full implementation and example configurations.

Explore the hoop.dev codebase on GitHub to see how the gateway integrates with your existing identity provider and data sources.

FAQ

Does hoop.dev replace my existing identity provider?

No. hoop.dev consumes tokens from your IdP (OIDC, SAML) to verify who is making the request. It adds enforcement at the data‑path layer, but it does not manage user accounts.

Can I use hoop.dev with any database that stores PHI?

hoop.dev supports a range of database connectors, including PostgreSQL and MySQL, which are common stores for PHI. The gateway applies masking and approval regardless of the underlying database.

How does hoop.dev help with audit readiness?

hoop.dev generates an immutable session record for every request that touches PHI. The record includes the requestor’s identity, the approval event, the masked output, and timestamps, providing the evidence auditors require for PHI compliance.
