It’s fast on small datasets, but when you move terabytes across complex network topologies, every pain point shows. Slow sync on millions of tiny files. CPU spikes from checksum calculations. I/O wait times that stack like bricks. Round-trip overhead that balloons on high-latency links. Rsync’s delta-transfer algorithm helps, but not enough when metadata overhead dominates.
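For context, the core of delta-transfer is a weak rolling checksum that can slide a window one byte in O(1), so the receiver can find matching blocks at any offset without rehashing from scratch. Here's a minimal sketch of that idea — the constants and function names are illustrative, not rsync's actual implementation:

```python
# Toy rolling weak checksum in the spirit of rsync's delta-transfer.
# (Illustrative only; rsync pairs a weak rolling sum with a strong hash.)
M = 1 << 16

def weak_checksum(block):
    # a: plain byte sum; b: position-weighted sum over the window.
    a = sum(block) % M
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % M
    return a, b

def roll(a, b, out_byte, in_byte, block_len):
    # Slide the window one byte right in O(1): drop out_byte, add in_byte.
    a = (a - out_byte + in_byte) % M
    b = (b - block_len * out_byte + a) % M
    return a, b

data = b"hello world, hello rsync"
block_len = 8
a, b = weak_checksum(data[0:block_len])
for i in range(1, len(data) - block_len + 1):
    a, b = roll(a, b, data[i - 1], data[i - 1 + block_len], block_len)
    # The O(1) rolled value matches a full recomputation at every offset.
    assert (a, b) == weak_checksum(data[i:i + block_len])
```

The catch: this machinery only pays off when file *contents* dominate the work. When the tree is millions of tiny files, the cost shifts to metadata, which delta-transfer can't help with.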
The most common rsync pain points?
- Performance degradation with large file counts. Even if total size is modest, per-file filesystem calls kill throughput.
- Checksum computation overhead under heavy load. Rsync reads and hashes entire files, slowing sync cycles.
- Torn updates when other processes write during sync — rsync doesn't snapshot the source, so the destination can end up internally inconsistent.
- Sparse error handling on flaky connections, requiring manual retries.
- Limited parallelism by design—rsync runs single-threaded, leaving CPU cores idle.
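The first bullet is easy to demonstrate: building a file list means at least one stat() per entry, so syscall volume scales with file count, not bytes. A toy illustration (the walk below is a stand-in for what any file-list builder must do, not rsync's code):

```python
# Why file count, not total size, drives metadata overhead: every entry
# needs a stat() for size/mtime, so 1000 one-byte files cost 1000 syscalls.
import os
import tempfile

def count_stat_calls(root):
    """Walk a tree the way a file-list builder must, counting stat()s."""
    calls = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames + dirnames:
            os.stat(os.path.join(dirpath, name))  # size/mtime check
            calls += 1
    return calls

with tempfile.TemporaryDirectory() as root:
    for i in range(1000):
        with open(os.path.join(root, f"f{i:04d}"), "w") as fh:
            fh.write("x")  # 1-byte files: tiny data, large metadata cost
    print(count_stat_calls(root))  # prints 1000 — for ~1 KB of payload
```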
The fixes are partial. Increasing --block-size can speed large-file transfers but makes deltas coarser, hurting efficiency on small edits. --whole-file skips delta computation entirely, trading bandwidth for CPU — a win on fast local networks, a loss over slow links. Wrapping rsync in GNU Parallel or custom scripts adds concurrency, but complexity rises fast.
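The wrapper approach usually means sharding the tree and running one rsync per shard. A hedged sketch of what such a script looks like — the paths, remote target, and directory names here are hypothetical placeholders:

```python
# Sketch of the "wrap rsync for concurrency" workaround: one rsync per
# top-level directory, run by a small worker pool. Placeholder paths.
import os
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor

def build_rsync_cmds(src_root, dest, dirs):
    # One rsync invocation per shard; trailing "/" copies dir contents.
    return [
        ["rsync", "-a", "--partial",
         os.path.join(src_root, d) + "/", f"{dest}/{d}/"]
        for d in sorted(dirs)
    ]

def run_parallel(cmds, workers=4):
    # Threads are enough: each worker just blocks on a child process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: subprocess.run(c).returncode, cmds))

cmds = build_rsync_cmds("/data", "backup:/data", ["logs", "images"])
for c in cmds:
    print(shlex.join(c))
# run_parallel(cmds)  # uncomment on a host with rsync and SSH access
```

The catch the paragraph above alludes to: once you shard, you own retry logic, per-shard error reporting, and load balancing across unevenly sized directories — complexity rsync itself never had to expose.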