DLP for JSON Schema

A newly hired contractor finishes a data‑migration sprint and leaves the company, but the CI pipeline he built keeps pushing JSON configuration files to production. The files contain API keys and customer identifiers that should never have been stored in plain text. When a teammate runs the pipeline the next day, the secrets are echoed in the build logs and end up in a shared log aggregation service. The leak is discovered only after an external auditor asks for the raw logs. This is a data loss prevention (DLP) gap.

This scenario illustrates a common gap. Organizations often invest heavily in identity providers, least‑privilege roles, and token‑based authentication. Those controls decide who can start a request, but they do not inspect the payload that travels over the wire. A request to a database, an API, or a remote host can still carry raw personal data, credit‑card numbers, or internal identifiers without any guardrails.

Why DLP must sit on the data path

Data‑loss‑prevention for JSON payloads requires three things that identity alone cannot provide. First, the system must see the actual JSON document before it reaches the target service. Second, it must compare the document against a schema that marks which fields are sensitive. Third, it must act on that comparison – by masking, by requiring an approval, or by aborting the request entirely. If any of those steps occurs outside the data path, a compromised client or a rogue service can bypass the control.
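The three requirements can be sketched in a few lines of Python. This only illustrates the idea and is not hoop.dev's implementation: the `SENSITIVITY` map, the level names, and the `find_sensitive` and `decide` functions are all hypothetical, and field matching is by key name only.

```python
import json

# Hypothetical sensitivity map: field name -> required action.
# Invented for this sketch; not a hoop.dev configuration format.
SENSITIVITY = {
    "email": "mask",
    "card_number": "approve",
    "api_key": "block",
}

# Actions ordered from least to most restrictive.
SEVERITY = ["allow", "mask", "approve", "block"]


def find_sensitive(node, path=""):
    """Yield (path, action) for every sensitive key anywhere in the document."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key in SENSITIVITY:
                yield child, SENSITIVITY[key]
            yield from find_sensitive(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_sensitive(item, f"{path}[{index}]")


def decide(raw_body):
    """See the document, compare it to the map, and pick the strictest action."""
    document = json.loads(raw_body)          # step 1: see the actual JSON
    hits = list(find_sensitive(document))    # step 2: compare against the map
    action = max(                            # step 3: act on the comparison
        (level for _, level in hits), key=SEVERITY.index, default="allow"
    )
    return action, hits


body = '{"user": {"email": "a@example.com"}, "card_number": "4111111111111111"}'
action, hits = decide(body)
print(action)  # approve
print(hits)    # [('user.email', 'mask'), ('card_number', 'approve')]
```

Choosing the strictest action across all hits means a single high‑risk field is enough to hold or stop the whole request, which matches the requirement that the decision be made before anything reaches the target service.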

In practice, teams often rely on application‑level checks or post‑process scans. Those approaches are reactive, can miss transient data, and do not give a reliable audit trail. The enforcement point must be the gateway that proxies the connection, because that is the only place where every byte of the request is observable and controllable.

Introducing hoop.dev as the DLP enforcement layer

hoop.dev is a layer‑7 gateway that sits between identities and infrastructure. It proxies connections to databases, Kubernetes clusters, SSH hosts, and HTTP services. When a client presents an OIDC or SAML token, hoop.dev validates the token, extracts group membership, and then forwards the request through its data‑path component. The gateway inspects the protocol payload, applies inline masking rules based on a JSON schema, routes risky operations to a human approver, and records the entire session for replay.

Because hoop.dev lives in the data path, every enforcement outcome is directly attributable to it. It masks sensitive fields before they reach the downstream service, blocks commands that do not match the approved JSON schema, and routes requests that contain high‑risk identifiers to an approval workflow. It also records each session, preserving an audit trail that is available for compliance reviews.

Practical steps to protect JSON payloads

  • Define a JSON schema that lists all fields that contain personal data, API secrets, or regulated identifiers. Mark each field with a sensitivity level.
  • In hoop.dev’s configuration, map the schema to masking rules. For example, replace credit‑card numbers with a tokenized placeholder, redact email addresses, or hash internal IDs.
  • Enable just‑in‑time access for the service that consumes the JSON. Users receive a short‑lived token that the gateway validates on each request.
  • Configure an approval workflow for any request that includes fields marked as high‑risk. The gateway pauses the request, notifies the designated reviewer, and only forwards the payload after explicit consent.
  • Turn on session recording. hoop.dev stores a replayable log of the request and the masking actions applied, giving auditors a complete picture of what data left the gateway.
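The first two steps above, a schema annotated with sensitivity levels and a mapping from those annotations to masking rules, might look like the following in outline. This is a sketch under stated assumptions: the `x-sensitivity` and `x-mask` annotation keywords, the masking function names, and the `apply_masking` helper are hypothetical and are not hoop.dev's configuration syntax. The unsalted SHA‑256 hash stands in for whatever keyed hashing or tokenization a real deployment would use, and arrays are not handled.

```python
import hashlib
import re

# Hypothetical schema: "x-sensitivity" and "x-mask" are custom annotations
# invented for this sketch, not standard JSON Schema keywords.
SCHEMA = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "internal_id": {"type": "string", "x-sensitivity": "medium", "x-mask": "hash"},
        "customer": {
            "type": "object",
            "properties": {
                "email": {"type": "string", "x-sensitivity": "medium", "x-mask": "redact"},
                "card_number": {"type": "string", "x-sensitivity": "high", "x-mask": "tokenize"},
            },
        },
    },
}


def tokenize(value):
    # Placeholder token that keeps only the last four digits.
    digits = re.sub(r"\D", "", value)
    return "tok_****" + digits[-4:]


def redact(value):
    return "[REDACTED]"


def hash_id(value):
    # Unsalted SHA-256 for brevity; an assumption, not a recommendation.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]


MASKS = {"tokenize": tokenize, "redact": redact, "hash": hash_id}


def apply_masking(document, schema):
    """Return a masked copy of document, following nested object properties."""
    masked = {}
    properties = schema.get("properties", {})
    for key, value in document.items():
        rule = properties.get(key, {})
        if isinstance(value, dict):
            masked[key] = apply_masking(value, rule)
        elif "x-mask" in rule and isinstance(value, str):
            masked[key] = MASKS[rule["x-mask"]](value)
        else:
            masked[key] = value
    return masked


payload = {
    "order_id": "A-1",
    "internal_id": "u-9",
    "customer": {"email": "a@example.com", "card_number": "4111 1111 1111 1111"},
}
masked = apply_masking(payload, SCHEMA)
print(masked["customer"])  # {'email': '[REDACTED]', 'card_number': 'tok_****1111'}
```

Keeping the masking rule next to the field definition means the schema stays the single source of truth: adding a new sensitive field is a schema change, not a code change.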

These actions do not require changes to the client code. The client continues to use its standard JSON‑producing library, and hoop.dev handles the inspection and transformation transparently. Teams can therefore add DLP controls to existing pipelines without a disruptive rewrite.

Getting started

Deploy the gateway with the quick‑start Docker Compose file, connect it to your identity provider, and register the target service that receives JSON. The getting‑started guide walks through each step in detail. Once the gateway is running, import your JSON schema and enable the masking feature as described in the learn section.

FAQ

Does hoop.dev store the original unmasked JSON?

No. The gateway masks the data in‑flight and forwards only the transformed payload, so the unmasked document goes no further than the gateway.

Can I use hoop.dev with CI/CD pipelines?

Yes. The pipeline can call the standard client (for example, curl or a language‑specific SDK) and route the request through hoop.dev. The gateway enforces the same DLP rules for automated jobs as it does for interactive users.

For a deeper dive into configuration details, see the learn section. To launch the gateway on your own infrastructure, follow the getting‑started guide. The open‑source repository on GitHub provides the full codebase and contribution guidelines: https://github.com/hoophq/hoop.
