When reranking pipelines run without hidden assistants, teams get predictable latency, clear audit trails, and confidence that no unseen model, often called shadow AI, is influencing results.
In practice, many organizations have slipped into a pattern where a downstream service calls a large language model (LLM) to reorder search results, recommendations, or query answers. The call is made directly from the reranking code, using a shared API key that lives in a configuration file or environment variable. The LLM runs in a third‑party cloud, and the request and response travel over the public internet without any intermediate checks. Because the call is treated like any other outbound HTTP request, there is no visibility into who triggered it, what data was sent, or whether the response was appropriate.
Why the current approach is fragile
The unmanaged setup looks like this: a single credential grants every service in the organization the ability to invoke the external model; the credential is rotated only when a breach is discovered; the reranking service logs only its own internal metrics, not the payload sent to the model; and any sensitive user data that ends up in the prompt is never masked or reviewed. This is a classic shadow AI situation: an invisible, unmanaged model that can exfiltrate data, introduce bias, or generate unsafe content without any guardrails.
Because the request reaches the LLM directly, the organization cannot enforce just‑in‑time approval for risky prompts, cannot mask personally identifiable information before it leaves the network, and cannot replay the interaction for forensic analysis. The setup decides who can start the request (any service identity that holds the shared key), but it provides no enforcement on the data path itself.
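To make the masking gap concrete, here is a minimal sketch of what masking a prompt before it leaves the network could look like. The function name `mask_pii`, the two regular expressions, and the placeholder strings are assumptions chosen for illustration; a real deployment would need far broader coverage than emails and US-style Social Security numbers.

```python
import re

# Hypothetical patterns for illustration only; they cover emails and
# US-style SSNs and nothing else.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_pii(prompt: str) -> str:
    """Replace email addresses and US-style SSNs with fixed placeholders."""
    masked = EMAIL_RE.sub("[EMAIL]", prompt)
    masked = US_SSN_RE.sub("[SSN]", masked)
    return masked


if __name__ == "__main__":
    print(mask_pii("Rank results for jane.doe@example.com, SSN 123-45-6789"))
```

In the direct-call setup described above there is no place to run a step like this consistently, because every service builds and sends its own prompt.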
What still needs to be fixed
The gap we must close is the lack of a controlled gateway between the reranking service and the external model. Even if we introduce strict identity management, the request will still travel straight to the LLM, bypassing any audit, masking, or approval step. In other words, rotating keys alone does not solve the shadow AI problem; the request still reaches the model without any visibility or policy enforcement.
What remains open is the need for a data‑path enforcement point that can inspect, record, and optionally block or transform the traffic. The solution must sit where the request passes, not merely at the identity layer.
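The three duties named above (inspect, record, and optionally block or transform) can be sketched in a few lines. This is an illustrative model only, not any product's implementation: the `EnforcementPoint` and `Decision` classes, the blocked-term check, and the in-memory audit log are all assumptions made for the example.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Decision:
    allowed: bool
    payload: Optional[str]  # transformed prompt when allowed, else None
    reason: str


@dataclass
class EnforcementPoint:
    """Hypothetical in-process stand-in for a data-path gateway."""
    blocked_terms: List[str]
    transform: Callable[[str], str]
    audit_log: List[str] = field(default_factory=list)

    def handle(self, caller: str, prompt: str) -> Decision:
        # Inspect: look for any blocked term, case-insensitively.
        lowered = prompt.lower()
        hit = next((t for t in self.blocked_terms if t.lower() in lowered), None)
        if hit is not None:
            decision = Decision(False, None, f"blocked term: {hit}")
        else:
            # Transform: e.g. masking, applied before the prompt is forwarded.
            decision = Decision(True, self.transform(prompt), "ok")
        # Record: every request is logged, whether allowed or blocked.
        self.audit_log.append(json.dumps({
            "ts": time.time(),
            "caller": caller,
            "allowed": decision.allowed,
            "reason": decision.reason,
            "prompt_chars": len(prompt),
        }))
        return decision
```

The point of the sketch is placement: because `handle` sits on the path every request takes, logging and blocking happen regardless of which service or credential initiated the call.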
hoop.dev as the data‑path gateway
hoop.dev provides exactly that enforcement layer. It acts as an identity‑aware proxy that sits between the reranking application and the external LLM endpoint. The service authenticates to hoop.dev with an OIDC token; hoop.dev verifies the token, extracts group membership, and decides whether the request is allowed to proceed.
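The decision step can be illustrated generically. The sketch below is not hoop.dev's code or API; it assumes the token's signature has already been verified elsewhere and that the identity provider puts group names in a `groups` claim, which is a common convention but not a standard OIDC claim. The function name and the group name in the usage example are hypothetical.

```python
import time
from typing import Any, Mapping, Optional, Set


def is_request_allowed(
    claims: Mapping[str, Any],
    allowed_groups: Set[str],
    now: Optional[float] = None,
) -> bool:
    """Decide from already-verified token claims whether to forward a request.

    Assumes signature verification happened before this function is called.
    """
    now = time.time() if now is None else now

    # "exp" is the standard JWT expiry claim (seconds since the epoch).
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return False

    # "groups" is an assumed claim name; providers differ.
    groups = claims.get("groups", [])
    if not isinstance(groups, list):
        return False
    caller_groups = {g for g in groups if isinstance(g, str)}

    # Allow only if the caller belongs to at least one permitted group.
    return bool(caller_groups & allowed_groups)


if __name__ == "__main__":
    claims = {"exp": time.time() + 300, "groups": ["search-rerankers"]}
    print(is_request_allowed(claims, {"search-rerankers"}))
```

Requests that fail the check are denied by default, including those with a missing or malformed expiry, so a misconfigured caller cannot reach the model by accident.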
