A common misconception is that simply isolating large language models from each other eliminates prompt-injection risk. In reality, when one model invokes another, what we call a nested agent, the outer model can embed malicious prompts that the inner model blindly executes.
Teams building AI‑driven pipelines on GCP often connect agents directly to cloud services, share service‑account keys, and rely on the platform’s default IAM boundaries. The outer agent receives a user’s request, crafts a follow‑up query, and forwards it to a second model without any additional verification. Because the inner model trusts the outer call as if it were a legitimate user, any crafted prompt can cause the inner model to leak data, exfiltrate secrets, or perform prohibited actions. The risk is amplified when the outer model is exposed to untrusted input, such as a public chatbot endpoint.
Why nested agents amplify prompt-injection risk
Nested agents create a two‑step trust chain. The first step authenticates the user to the outer model; the second step trusts the outer model to act on behalf of the user when it talks to the inner model. If the outer model is compromised, or if an attacker learns how to influence its prompt construction, the inner model receives a malicious instruction that appears to come from a trusted source.
This pattern leaves three gaps:
- Uninspected payloads: The inner model never sees the original user context, so it cannot validate that the request aligns with policy.
- Credential exposure: Shared service‑account keys give each agent unrestricted access to downstream resources, making it easy for a compromised outer agent to act everywhere.
- No audit trail: Without a central observation point, teams cannot tell which prompts caused which downstream actions, hindering forensic analysis.
Addressing these gaps requires a control point that sits between the agents and the resources they reach. The control point must be able to see every request, enforce policy, and record outcomes before the request reaches the target.
What a gateway can enforce
When a gateway sits in the data path, it becomes the only place where enforcement can happen. The gateway can:
- Record each session: hoop.dev logs the full request and response stream for every agent interaction, giving teams a replayable audit trail.
- Mask sensitive fields: hoop.dev can redact secrets or personally identifiable information from responses before they are returned to the outer model.
- Require just‑in‑time approval: For high‑risk operations, hoop.dev can pause the request and route it to a human approver, preventing accidental leakage.
- Block dangerous commands: hoop.dev can inspect the payload and reject any instruction that matches a denylist, such as attempts to read environment variables or invoke privileged APIs.
All of these outcomes exist only because hoop.dev occupies the data path. Without the gateway, the outer agent would talk directly to the inner model or to GCP services, and none of the above controls would be enforceable.
