Chunking and IAM: What to Know

Many teams assume that splitting large result sets into smaller chunks automatically satisfies identity‑based access control. In reality, chunking only reduces the amount of data transferred per request; it does not guarantee that the requester is authorized for each piece of the dataset.

IAM policies define which identities can read, write, or modify a resource, but they are typically evaluated once per connection or per API call. When an application pulls data page by page over a long-lived connection, the IAM check may happen only on the initial request, leaving subsequent pages unchecked. This gap can let a compromised credential harvest an entire table by iterating through all chunks, even though the policy was meant to grant only a limited view.

Why IAM alone is insufficient for chunked access

IAM provides two essential capabilities: identity verification and static permission mapping. The verification step occurs before the data path is opened, establishing who the caller is. The permission mapping step decides whether the caller may start a session, but it does not inspect the payload that flows after the session begins. When a client asks for the first 100 rows of a table, the IAM engine says “allowed,” and the gateway opens the connection. Subsequent requests for rows 101‑200, 201‑300, and so on travel through the same open channel without additional IAM evaluation.

This model leaves three risks unaddressed:

  • Over‑exposure: A user with permission to view a subset may still retrieve the full dataset by paging through chunks.
  • Missing audit trails: IAM logs record the initial connection but not each chunk request, making it hard to prove exactly what data was accessed.
  • In‑flight data leakage: Sensitive fields (PII, secrets) travel unmodified unless a downstream component masks them.

To close these gaps, enforcement must happen where the data actually flows – the data path itself.
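The gap described in this section can be illustrated with a minimal sketch. It assumes an in-memory table and a session object that is authorized once at connect time; `TABLE`, `Session`, `open_session`, `fetch_page`, and `harvest` are hypothetical names invented for the illustration and do not correspond to any real IAM or database API.

```python
from dataclasses import dataclass

# Hypothetical in-memory table standing in for a real database.
TABLE = [{"id": i, "email": f"user{i}@example.com"} for i in range(1, 1001)]


@dataclass
class Session:
    """A session that was authorized once, when the connection opened."""
    identity: str
    authorized: bool
    pages_served: int = 0


def open_session(identity: str, allowed_identities: set) -> Session:
    # The only permission decision in this model is made here.
    return Session(identity=identity, authorized=identity in allowed_identities)


def fetch_page(session: Session, offset: int, limit: int = 100) -> list:
    # No re-evaluation: the flag set at connect time is trusted for every page.
    if not session.authorized:
        raise PermissionError("session was never authorized")
    session.pages_served += 1
    return TABLE[offset:offset + limit]


def harvest(session: Session, limit: int = 100) -> list:
    """Page through the whole table using a single authorized session."""
    rows, offset = [], 0
    while True:
        page = fetch_page(session, offset, limit)
        if not page:
            break
        rows.extend(page)
        offset += limit
    return rows


session = open_session("analyst", {"analyst"})
rows = harvest(session)
print(len(rows))  # 1000
```

Nothing in `fetch_page` knows which rows the caller was meant to see, so one successful check at connect time is enough to read all 1,000 rows.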

Placing enforcement in the data path

When a gateway sits between the identity provider and the target system, it can evaluate policy on every request, not just the first handshake. The gateway can:

  • Re‑check IAM attributes for each chunk request, ensuring the caller’s scope still applies.
  • Record each chunk interaction, creating a fine‑grained audit trail that shows exactly which rows were returned.
  • Apply inline masking to sensitive columns on a per‑response basis, preventing exposure of PII even when the caller is authorized for the overall table.
  • Require just‑in‑time approval for high‑risk chunks, such as those that cross a row‑count threshold or contain financial data.

These capabilities exist only because the gateway intercepts the protocol stream. Without that interception, the connection would remain a black box after IAM has granted access.
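As a rough sketch of what per-request enforcement could look like, the example below re-evaluates policy for every chunk, logs every decision, and asks for approval when a chunk exceeds a row-count threshold. Masking is left out to keep the example short. The names `ChunkGateway`, `is_allowed`, `approve`, `approval_threshold`, and `first_200_only` are assumptions made for illustration, not the API of any particular gateway.

```python
import time
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical table standing in for the target system.
ROWS = [{"id": i, "amount": i * 10} for i in range(1, 501)]

# Both callbacks take (identity, offset, limit) and return True or False.
Check = Callable[[str, int, int], bool]


@dataclass
class ChunkGateway:
    is_allowed: Check                  # stand-in for a policy lookup
    approve: Check                     # stand-in for a just-in-time approval step
    approval_threshold: int = 200      # chunks larger than this need approval
    audit_log: list = field(default_factory=list)

    def fetch_chunk(self, identity: str, offset: int, limit: int) -> list:
        # Policy is evaluated for this chunk, not inherited from an earlier one.
        if not self.is_allowed(identity, offset, limit):
            decision = "denied"
        elif limit > self.approval_threshold and not self.approve(identity, offset, limit):
            decision = "approval_rejected"
        else:
            decision = "allowed"

        rows = ROWS[offset:offset + limit] if decision == "allowed" else []

        # Every request is logged, including the ones that were refused.
        self.audit_log.append({
            "ts": time.time(),
            "identity": identity,
            "offset": offset,
            "limit": limit,
            "decision": decision,
            "row_ids": [row["id"] for row in rows],
        })

        if decision != "allowed":
            raise PermissionError(f"chunk request {decision} for {identity}")
        return rows


# Example policy: "analyst" may read only the first 200 rows.
def first_200_only(identity: str, offset: int, limit: int) -> bool:
    return identity == "analyst" and offset + limit <= 200


gateway = ChunkGateway(is_allowed=first_200_only, approve=lambda *_: False)
gateway.fetch_chunk("analyst", 0, 100)        # rows 1-100, allowed
gateway.fetch_chunk("analyst", 100, 100)      # rows 101-200, allowed
try:
    gateway.fetch_chunk("analyst", 200, 100)  # rows 201-300, outside the scope
except PermissionError as exc:
    print(exc)  # chunk request denied for analyst
```

Because the decision and the returned row IDs are written to `audit_log` on every call, the log shows which rows each identity received and which requests were refused.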

How hoop.dev provides the required data‑path controls

hoop.dev is a Layer 7 access gateway that sits between identities and infrastructure. It receives the OIDC or SAML token, validates the identity, and then proxies the connection to the target system. Because the proxy runs in the data path, hoop.dev can enforce the controls described above:

  • Chunk‑level IAM checks: hoop.dev inspects each request for additional rows and re‑applies the caller’s IAM attributes before forwarding.
  • Fine‑grained audit: hoop.dev records every chunk request and response, giving teams a complete replayable log for compliance and incident response.
  • Inline masking: hoop.dev can redact or hash sensitive fields in each chunk, ensuring that downstream consumers never see raw PII.
  • Just‑in‑time approval: For chunks that exceed a configurable size or contain regulated data, hoop.dev can pause the request and route it to an approver before allowing it to continue.

All of these enforcement outcomes depend on hoop.dev’s presence in the data path; the IAM system alone cannot provide them.
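To make the "redact or hash" distinction concrete, here is a minimal sketch of per-chunk masking. It is not hoop.dev's implementation or configuration format. The column-to-strategy map `MASKING_RULES`, the demo key `MASKING_KEY`, and the functions `mask_value` and `mask_chunk` are hypothetical, and the keyed hash uses Python's standard `hmac` module.

```python
import hashlib
import hmac

# Hypothetical rules: which columns to mask and how.
MASKING_RULES = {"email": "hash", "ssn": "redact"}

# Demo key only; a real deployment would load a secret from a key store.
MASKING_KEY = b"demo-key-not-for-production"


def mask_value(value: str, strategy: str) -> str:
    if strategy == "redact":
        # Redaction removes the value entirely.
        return "[REDACTED]"
    if strategy == "hash":
        # A keyed hash hides the value but stays stable across chunks,
        # so masked rows can still be joined or deduplicated.
        digest = hmac.new(MASKING_KEY, value.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()[:16]
    raise ValueError(f"unknown masking strategy: {strategy}")


def mask_chunk(rows: list, rules: dict = MASKING_RULES) -> list:
    """Return masked copies of the rows; the input rows are not modified."""
    return [
        {
            col: mask_value(str(val), rules[col]) if col in rules else val
            for col, val in row.items()
        }
        for row in rows
    ]


chunk = [{"id": 1, "email": "a@example.com", "ssn": "123-45-6789"}]
masked = mask_chunk(chunk)
print(masked[0]["ssn"])  # [REDACTED]
```

Redaction suits fields the consumer never needs. Hashing suits fields that must remain comparable across rows without revealing the underlying value.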

Implementing chunk‑aware IAM with hoop.dev

Start by deploying the gateway using the official Docker Compose quick‑start. The deployment automatically configures OIDC authentication and enables masking and guardrails out of the box. Register the target database or service as a connection, and define the chunk‑size policy in the gateway’s configuration. Once the gateway is running, any client that connects through hoop.dev – whether a psql client, a custom API, or an AI‑driven agent – will have its chunk requests inspected, logged, and optionally masked.

For detailed steps, see the getting‑started guide and the broader feature documentation on the learn site. The source code and self‑hosting instructions are available in the public repository.

FAQ

Does hoop.dev replace my existing IAM system?

No. hoop.dev relies on your identity provider to verify who is making the request. It augments IAM by enforcing policies on every chunk that passes through the data path.

Will masking affect performance?

Masking is performed inline at the protocol layer. In most workloads the overhead is negligible, and the security benefit outweighs the small latency increase.

Can I use hoop.dev with existing CI/CD pipelines?

Yes. Because hoop.dev proxies standard protocols, you can point any tool that already talks to your database or API at the gateway without code changes.

Visit the open‑source repository on GitHub to get started and contribute.

Open source

Hoop is MIT-licensed infrastructure for controlling how AI agents reach production data. Star hoophq/hoop so you can inspect it, deploy it, or share it when your team starts governing agent access.
