When an LLM endpoint is called directly from an application, a compromised service can issue unlimited prompts, exfiltrate data, and amplify attacks across the entire environment. The financial and reputational impact of such a blast radius can be severe. Many teams hand the same static API key to every microservice, commit it to code repositories, and let each service call the model without any runtime guardrails. An attacker who gains a foothold inherits that key and can flood the model with malicious queries, harvest sensitive responses, or use the model to generate phishing content at scale.
Why blast radius matters for MCP gateways
Microsoft Azure provides managed LLM services that are attractive for rapid development. However, the convenience of a single endpoint hidden behind a shared secret creates a single point of failure. A breach of one pod instantly expands to every other workload that reuses the same credential. The lack of per‑request visibility means security teams cannot tell who asked what, nor can they stop a dangerous prompt before it reaches the model.
The missing control: just‑in‑time access and audit
What teams need is a runtime enforcement layer that sits between the caller and the model. The layer must be able to:
- Require an explicit approval for high‑risk prompts.
- Record every request and response for later review.
- Mask or redact sensitive fields in the model’s output.
- Enforce least‑privilege policies that grant access only for the duration of a specific job.
Strong identity providers and Azure role‑based access control do not close this gap, because they apply only when a caller authenticates and is authorized. They do not inspect the prompts and responses that flow to and from the LLM, so the blast radius remains unchecked.
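The least‑privilege and audit requirements listed above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `Grant` and `Gateway` classes, their method names, and the in-memory audit log are assumptions made for illustration and do not represent any product's API. The sketch shows a grant that expires after a fixed time, with every request logged whether or not it was allowed.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Grant:
    """Hypothetical time-boxed permission for a single caller."""
    caller: str
    expires_at: float  # Unix timestamp after which the grant is void


@dataclass
class Gateway:
    """Illustrative enforcement layer between callers and a model."""
    grants: dict = field(default_factory=dict)     # caller -> Grant
    audit_log: list = field(default_factory=list)  # one record per request

    def grant(self, caller, ttl_seconds, now=None):
        # Issue access that lasts only for the duration of a job.
        now = time.time() if now is None else now
        self.grants[caller] = Grant(caller, now + ttl_seconds)

    def handle(self, caller, prompt, model, now=None):
        # Forward the prompt only while the caller holds a live grant.
        now = time.time() if now is None else now
        g = self.grants.get(caller)
        allowed = g is not None and now < g.expires_at
        response = model(prompt) if allowed else None
        # Record every request and response, including denied ones.
        self.audit_log.append({
            "time": now,
            "caller": caller,
            "prompt": prompt,
            "allowed": allowed,
            "response": response,
        })
        return response
```

Here `model` is any callable that takes a prompt and returns text. A production layer would persist the log outside the process and tie grants to an identity provider; the sketch only shows that the enforcement decision and the audit record are made on the request path rather than at login.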
hoop.dev as the data‑path gateway
hoop.dev is a Layer 7 gateway that can be placed in front of any Azure‑hosted LLM endpoint. It authenticates users via OIDC or SAML, then proxies the request through an agent that runs inside the same virtual network as the model. Because the gateway sits in the data path, it can enforce the missing controls directly on the request and response.
hoop.dev records each session, so auditors have a complete replay of who sent which prompt and what the model returned. It can mask fields such as credit‑card numbers or personal identifiers before they are stored or displayed. For requests that match a risky pattern, hoop.dev triggers a just‑in‑time approval workflow, allowing a human operator to approve or deny the request in real time. These controls are possible because hoop.dev sits in the data path and is the only component that sees the traffic between the caller and the LLM.
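To make the masking and approval steps concrete, here is a minimal Python sketch of the general technique. It is not hoop.dev's implementation or configuration format: the regular expressions, the `[REDACTED]` placeholder, and the functions `mask_output`, `needs_approval`, and `proxy` are all assumptions chosen for illustration, and real detection rules would be considerably broader.

```python
import re

# Illustrative patterns only (assumptions, not product rules).
# Matches 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")
# Flags a few example phrases that might warrant human review.
RISKY_RE = re.compile(r"(?i)\b(drop\s+table|delete\s+from|export\s+all)\b")


def mask_output(text):
    """Replace card-like digit runs before the text is stored or shown."""
    return CARD_RE.sub("[REDACTED]", text)


def needs_approval(prompt):
    """Return True when the prompt matches a risky pattern."""
    return RISKY_RE.search(prompt) is not None


def proxy(prompt, model, approver):
    """Gate risky prompts on an approver, then mask the model output."""
    if needs_approval(prompt) and not approver(prompt):
        return "request denied"
    return mask_output(model(prompt))
```

In this sketch `approver` stands in for the human operator: it is a callable that receives the prompt and returns `True` to approve or `False` to deny. Because masking runs on the response path, the caller never sees the raw value, which is the property that depends on the proxy sitting between the caller and the model.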
