A Load Balancer in a VPC with private subnets is supposed to be invisible to the outside world, yet serve your internal services with speed, security, and precision. The architecture must balance traffic without exposing internal endpoints, route requests through a proxy layer when rules demand, and recover instantly if something fails. The difference between a reliable setup and a fragile one lies in the details of how the Load Balancer, proxy, and subnets are wired together.
The backbone is the VPC. Your private subnets isolate workloads from public access, cutting the attack surface and giving you deterministic control over routing. Deploying the Load Balancer into public subnets while keeping targets in private subnets is standard, but when a proxy layer is added inside those private zones, the design gets more complex. The proxy acts as the gatekeeper to your services, applying access rules, authentication, protocol transformations, or traffic inspection—without ever making your private IPs public.
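The gatekeeper role described above boils down to a decision the proxy makes before forwarding anything to a private backend. A minimal sketch of that decision logic, in Python — the token set, header name, and path prefixes are illustrative assumptions, not a real API:

```python
# Hypothetical access rules the proxy layer might enforce; the names
# ALLOWED_TOKENS, INTERNAL_PREFIXES, and the X-Service-Token header
# are illustrative, not part of any AWS or proxy product.
ALLOWED_TOKENS = {"svc-orders", "svc-billing"}   # assumed service identities
INTERNAL_PREFIXES = ("/api/", "/health")         # assumed internally routed paths

def gate(path: str, headers: dict) -> tuple[bool, str]:
    """Decide whether the proxy forwards a request to a private backend."""
    token = headers.get("X-Service-Token", "")
    if token not in ALLOWED_TOKENS:
        return False, "403 unknown service token"
    if not path.startswith(INTERNAL_PREFIXES):
        return False, "404 path not routed internally"
    return True, "forward"
```

The point of keeping this check at the proxy tier is that private IPs never appear in any response: the caller only ever sees the proxy's verdict, never the backend it would have reached.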
For high availability, each tier—Load Balancer, proxy instances, and backend nodes—must stretch across at least two Availability Zones. Route tables dictate the internal network flow. Security groups lock inbound and outbound paths to only what is required, while network ACLs act as coarser, stateless filters at the subnet boundary. Health checks on the Load Balancer must point at the proxy endpoint, and those proxy nodes need their own monitoring of the backends so that bad instances are cut out fast.
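The "cut out fast" behavior comes from threshold-based health checks: a node is removed after a run of consecutive failed probes and restored after a run of consecutive passes. A small model of that state machine, with illustrative thresholds (real ELB-style checks work the same way, but the defaults here are assumptions):

```python
class HealthTracker:
    """Minimal model of threshold-based health checking: a node leaves
    service after `unhealthy_threshold` consecutive failures and returns
    after `healthy_threshold` consecutive successes. Threshold values
    are illustrative defaults, not AWS defaults."""

    def __init__(self, healthy_threshold: int = 2, unhealthy_threshold: int = 3):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.in_service = True
        self._streak = 0  # >0: consecutive passes, <0: consecutive failures

    def record(self, check_passed: bool) -> bool:
        """Record one probe result; return whether the node is in service."""
        if check_passed:
            self._streak = self._streak + 1 if self._streak > 0 else 1
            if not self.in_service and self._streak >= self.healthy_threshold:
                self.in_service = True
        else:
            self._streak = self._streak - 1 if self._streak < 0 else -1
            if self.in_service and -self._streak >= self.unhealthy_threshold:
                self.in_service = False
        return self.in_service
```

Note that recovery deliberately requires several consecutive passes, so a flapping proxy instance does not bounce in and out of the target group on every probe.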
When performance is the goal, you must keep latency low between the Load Balancer and the proxies. That means proxies in private subnets inside the same VPC and region, with security groups that reference each other directly and as few network hops as possible. Auto-scaling policies should react to CPU, memory, or connection saturation, adding or removing proxy instances automatically. The Load Balancer target group should update within seconds of a scaling event, keeping traffic flowing without manual intervention.
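A common way to drive that auto-scaling is target tracking: pick a target utilization and size the fleet so the average metric converges on it. A sketch of the arithmetic, with illustrative parameters (the 60% CPU target and the 2–10 instance bounds are assumptions; the minimum of 2 reflects the two-Availability-Zone floor above):

```python
import math

def desired_proxy_count(current: int, cpu_avg: float,
                        target_cpu: float = 0.6,
                        min_count: int = 2, max_count: int = 10) -> int:
    """Target-tracking sketch: scale the proxy fleet so average CPU
    approaches `target_cpu`, using new = ceil(current * metric / target).
    All parameter values here are illustrative, not AWS defaults."""
    if cpu_avg <= 0:
        # No load signal: keep the current size, clamped to the bounds.
        return max(min_count, min(current, max_count))
    desired = math.ceil(current * cpu_avg / target_cpu)
    return max(min_count, min(desired, max_count))
```

With four proxies averaging 90% CPU against a 60% target, this asks for six instances; at 30% it shrinks back to the two-instance floor, which preserves multi-AZ redundancy even when the fleet is idle.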