How can you keep the blast radius small when a single query against a vector database can expose millions of embeddings?
Most teams treat a vector store like any other internal service: a shared service account is created, the password is checked into a repository, and every engineer, script, or automated job uses the same credential to connect directly. The connection bypasses any central policy point, so there is no visibility into who ran which similarity search, what data was returned, or whether a query was safe. When a breach occurs, the attacker inherits the same unrestricted access, and the potential impact spreads across every downstream model that relies on those embeddings.
In practice this means that a compromised CI pipeline can issue a bulk dump of vectors, a rogue developer can experiment with high‑dimensional queries that overload the index, and security auditors have no reliable record of which queries were executed. The environment has no just‑in‑time approval, inline data masking, or session recording. The result is a large blast radius that can affect data privacy, model performance, and compliance posture.
The core problem we need to fix is the lack of a control surface that can evaluate each request before it reaches the vector database. Even if we introduce identity‑aware tokens or tighten IAM policies, the request still travels straight to the database engine. Without an intervening gateway, there is no place to enforce per‑query limits, mask returned vectors that contain sensitive identifiers, or capture an audit trail for later review.
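To make the idea of an intervening point concrete, here is a minimal sketch of a wrapper that records one audit entry for every similarity search. It is an illustration under stated assumptions, not the API of any particular product: `audited_search`, the `backend` callable, and the record fields are all hypothetical names.

```python
import json
import time
from typing import Callable, List


def audited_search(identity: str,
                   vector: List[float],
                   top_k: int,
                   backend: Callable[[List[float], int], list],
                   audit_log: list) -> list:
    """Run a search through `backend` and append one audit record per call.

    `backend` stands in for whatever client actually talks to the vector
    database; `audit_log` stands in for a durable, append-only sink.
    """
    record = {
        "ts": time.time(),
        "identity": identity,   # who issued the query
        "top_k": top_k,         # how many neighbours were requested
        "dims": len(vector),    # size of the query embedding
        "outcome": "error",     # overwritten on success
    }
    try:
        results = backend(vector, top_k)
        record["outcome"] = "ok"
        record["returned"] = len(results)
        return results
    finally:
        # The record is written whether the backend call succeeded or raised.
        audit_log.append(json.dumps(record))
```

Because the wrapper is the only path to the backend, every query leaves a record tied to an identity, including the ones that fail.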
Why blast radius matters for vector databases
Vector databases store high‑dimensional representations of raw data. A single record may contain a user’s personal information, a proprietary document, or a piece of code. When a query returns a large set of vectors, the data it exposes can be far greater than the request itself suggests. Controlling blast radius therefore means limiting how many vectors can be retrieved, ensuring that sensitive fields are redacted, and guaranteeing that every access is attributable to a specific identity.
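The three controls named above (a cap on returned vectors, redaction of sensitive fields, and attribution to an identity) can be sketched in a few lines. This is a hedged illustration: `MAX_TOP_K`, `SENSITIVE_KEYS`, and the `SearchRequest` and `Match` types are assumptions made for the example, and real limits would depend on the workload.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical policy values; tune them per workload.
MAX_TOP_K = 50
SENSITIVE_KEYS = frozenset({"email", "user_id", "ssn"})


@dataclass
class SearchRequest:
    identity: str        # who is asking, e.g. the subject of a verified token
    vector: List[float]  # query embedding
    top_k: int           # number of neighbours requested


@dataclass
class Match:
    id: str
    score: float
    metadata: Dict[str, Any] = field(default_factory=dict)


def clamp_request(req: SearchRequest, max_top_k: int = MAX_TOP_K) -> SearchRequest:
    """Reject anonymous requests and bound top_k to the range [1, max_top_k]."""
    if not req.identity:
        raise PermissionError("anonymous similarity searches are not allowed")
    bounded = max(1, min(req.top_k, max_top_k))
    return SearchRequest(req.identity, req.vector, bounded)


def redact(matches: List[Match], sensitive=SENSITIVE_KEYS) -> List[Match]:
    """Return copies of the matches with sensitive metadata values masked."""
    return [
        Match(
            m.id,
            m.score,
            {k: ("[REDACTED]" if k in sensitive else v) for k, v in m.metadata.items()},
        )
        for m in matches
    ]
```

Clamping runs before the query reaches the database and redaction runs on the way back, so a request for a million neighbours returns at most `MAX_TOP_K` masked results.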
Traditional database firewalls focus on SQL syntax or row‑level permissions, but they do not understand the semantics of similarity search. The unique risk profile of vector workloads requires a gateway that can inspect the protocol, apply policy, and enforce limits in real time.
Introducing a data‑path gateway
hoop.dev provides exactly that control surface. It sits between the client and the vector database, acting as a Layer 7 identity‑aware proxy. The gateway validates the user’s OIDC or SAML token, extracts group membership, and then decides whether the request is allowed, needs approval, or should be blocked.
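The allow, approve, or block decision described above can be pictured as a small function over token claims. The sketch below is not hoop.dev's API or configuration format. It assumes the token's signature, issuer, audience, and expiry were already verified, that group membership arrives in a `groups` claim (a common convention that varies by identity provider), and that the group names and `BULK_THRESHOLD` are hypothetical.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    BLOCK = "block"


# Hypothetical group names and threshold, for illustration only.
READ_GROUPS = {"ml-engineers", "search-team"}
BULK_GROUPS = {"data-platform"}
BULK_THRESHOLD = 100


def decide(claims: dict, top_k: int) -> Decision:
    """Map already-verified token claims and the request size to a decision."""
    groups = set(claims.get("groups", []))

    # No subject or no recognised group: the request never reaches the database.
    if not claims.get("sub") or not groups & (READ_GROUPS | BULK_GROUPS):
        return Decision.BLOCK

    # Large retrievals by non-bulk groups wait for a human approval.
    if top_k > BULK_THRESHOLD and not groups & BULK_GROUPS:
        return Decision.REQUIRE_APPROVAL

    return Decision.ALLOW
```

For example, a member of `ml-engineers` asking for 10 neighbours is allowed, the same user asking for 5,000 is routed to approval, and a caller with no recognised group is blocked.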
