The request came in at midnight. The API was timing out. The service was fine, the network was fine, but the calls were choking on something invisible. The issue was the Load Balancer. More specifically, GRPCS prefix routing.
GRPCS is critical when tunneling gRPC over TLS. But without correct prefix configuration in the load balancer, requests splinter and fail. Most engineers discover this at the worst possible time—when latency spikes and error rates climb. The fix starts with knowing how the prefix works and how each hop treats it.
A load balancer that supports GRPCS prefix routing must correctly map the request path to the upstream gRPC service. This means respecting the :authority header, ensuring TLS termination where intended, and preserving the prefix through internal hops. Some cloud load balancers mangle or strip the path if not explicitly configured. Others default to HTTP/2 rules that collide with gRPC’s expectations.
The safest setup begins with forcing HTTP/2 for all front-end connections, enabling ALPN to negotiate GRPCS, and explicitly defining route prefixes in configuration. Route definitions must match the gRPC service paths exactly. A mismatch between the prefix in the load balancer and the server handler will break streams silently. Observability here is essential—log at the load balancer and at the gRPC server, and inspect how the prefix travels.