Every second after that was chaos—viewers dropped, CPU load spiked somewhere else, and the whole pipeline teetered. The problem wasn’t FFmpeg. FFmpeg was doing its job. It was everything around it—brittle setups, no redundancy, and no real load balancer tuned for high-volume video workloads.
A real FFmpeg load balancer isn’t just a reverse proxy doing round-robin. It’s a layer that understands the nature of video encoding and streaming. It tracks CPU load, GPU utilization, transcoding queue lengths, and disk throughput. It isn’t blind—it routes based on actual encoding stress. Without it, a single source can drown one node while others sit idle.
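One way to make "actual encoding stress" concrete is to fold those four signals into a single score per worker. This is a minimal sketch, not a prescription: the `WorkerMetrics` fields, the `stress_score` name, and the weights are all illustrative assumptions you would tune for your own fleet.

```python
from dataclasses import dataclass

@dataclass
class WorkerMetrics:
    cpu_load: float        # 0.0-1.0 average CPU utilization
    gpu_util: float        # 0.0-1.0 GPU (e.g. NVENC) utilization
    queue_len: int         # transcodes currently queued on this worker
    disk_mbps: float       # current disk throughput in MB/s
    disk_cap_mbps: float   # disk throughput ceiling for this node

def stress_score(m: WorkerMetrics) -> float:
    """Composite encoding-stress score; lower means more spare capacity.

    Weights are illustrative. Queue length is capped and weighted heavily
    because every queued job implies future CPU/GPU pressure.
    """
    disk_pressure = m.disk_mbps / m.disk_cap_mbps
    queue_pressure = min(m.queue_len / 8, 1.0)
    return (0.3 * m.cpu_load
            + 0.3 * m.gpu_util
            + 0.3 * queue_pressure
            + 0.1 * disk_pressure)
```

The point of a composite score is that no single metric decides routing: a worker with idle CPUs but a deep queue, or a saturated disk, still reads as stressed.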
The architecture is simple in theory. Multiple FFmpeg workers sit behind a controller. The controller measures active job states and assigns new transcodes to the worker with the most capacity at that moment. Workers register their health in real time. Dropouts are detected quickly, and jobs are retried without anyone noticing. For live streams, this means frames keep flowing. For VOD, it means batch jobs finish faster with predictable throughput.
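The controller side of that loop can be sketched in a few lines. This is an assumed shape, not a specific product's API: workers push a heartbeat carrying their current stress score, and a worker whose heartbeat goes stale is excluded so its jobs can be retried elsewhere.

```python
import time
from typing import Optional

class Controller:
    def __init__(self, heartbeat_timeout: float = 5.0):
        self.heartbeat_timeout = heartbeat_timeout
        # worker_id -> {"score": stress score, "last_seen": heartbeat time}
        self.workers: dict[str, dict[str, float]] = {}

    def register(self, worker_id: str, score: float,
                 now: Optional[float] = None) -> None:
        """Workers call this on every heartbeat with their current score."""
        t = now if now is not None else time.monotonic()
        self.workers[worker_id] = {"score": score, "last_seen": t}

    def pick_worker(self, now: Optional[float] = None) -> Optional[str]:
        """Assign the next transcode to the healthiest live worker."""
        t = now if now is not None else time.monotonic()
        alive = {
            wid: w for wid, w in self.workers.items()
            if t - w["last_seen"] < self.heartbeat_timeout
        }
        if not alive:
            return None  # caller should queue the job and retry
        return min(alive, key=lambda wid: alive[wid]["score"])
```

A dropped worker simply stops heartbeating and falls out of `alive` within one timeout window, which is what makes the retry invisible to viewers: the controller never has to be told a node died.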
The key is avoiding static routing. FFmpeg jobs have unpredictable durations and resource spikes. A proper load balancer for FFmpeg has to react to changing workloads on the fly—polling metrics at short intervals, keeping a lightweight state map in memory, and applying allocation logic that respects both hardware limits and codec constraints.
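Those three ingredients—short-interval polling, an in-memory state map, and constraint-aware allocation—might look like the following. Everything here is a hypothetical sketch: `fetch_metrics` stands in for whatever metrics endpoint your workers expose, and the 85% CPU ceiling and NVENC rule are example constraints, not fixed numbers.

```python
import threading

POLL_INTERVAL = 2.0          # short polling interval, in seconds
state: dict[str, dict] = {}  # in-memory state map: worker_id -> latest metrics
state_lock = threading.Lock()

def poll_workers(fetch_metrics, worker_ids, stop: threading.Event) -> None:
    """Refresh the state map on a short interval.

    fetch_metrics is a hypothetical callable returning a dict such as
    {"cpu_load": 0.4, "has_gpu": True} for one worker.
    """
    while not stop.is_set():
        for wid in worker_ids:
            try:
                m = fetch_metrics(wid)
            except OSError:
                continue  # unreachable worker keeps its stale entry for now
            with state_lock:
                state[wid] = m
        stop.wait(POLL_INTERVAL)

def eligible(m: dict, job: dict) -> bool:
    """Allocation logic respecting hardware limits and codec constraints."""
    if job["encoder"].endswith("_nvenc") and not m.get("has_gpu"):
        return False              # NVENC encodes only fit GPU nodes
    return m["cpu_load"] < 0.85   # keep headroom for mid-job spikes
```

Filtering with `eligible` before scoring is what keeps allocation honest about codec constraints: a lightly loaded CPU-only node is still the wrong home for a `hevc_nvenc` job.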