A load balancer is a device or software application that sits between users and a pool of backend servers. It receives incoming requests and distributes them across multiple servers according to a balancing algorithm. This prevents any single server from becoming a bottleneck and improves the overall availability, reliability and scalability of applications.
Load Balancing Algorithms
- Round Robin — Distributes requests sequentially to each server in turn. Simple, and works well when servers have similar capacity and requests cost roughly the same.
- Least Connections — Sends the next request to the server with the fewest active connections.
- IP Hash — Hashes the client's IP address to pick a server, so a given client keeps reaching the same server as long as the pool is unchanged. Useful when session state is stored on the server.
- Weighted Round Robin — Servers with more capacity receive proportionally more requests.
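The four algorithms above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer implements them: the function names, server names, weights, and connection counts are all made up for the example, and the weighted variant uses the simplest possible approach (repeating each server according to an integer weight).

```python
import hashlib
import itertools


def round_robin(servers):
    """Yield servers in order, looping forever."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Yield servers in proportion to integer weights (naive expansion)."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(active):
    """Return the server with the fewest active connections."""
    return min(active, key=active.get)


def ip_hash(client_ip, servers):
    """Map a client IP to a server deterministically via a stable hash."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


# Hypothetical pool and traffic, for illustration only.
servers = ["app-1", "app-2", "app-3"]

rr = round_robin(servers)
first_four = [next(rr) for _ in range(4)]  # wraps back to app-1 on the 4th request

wrr = weighted_round_robin({"big": 2, "small": 1})
weighted_six = [next(wrr) for _ in range(6)]  # "big" appears twice as often as "small"

least_loaded = least_connections({"app-1": 12, "app-2": 3, "app-3": 7})

sticky = ip_hash("203.0.113.7", servers)  # same IP, same pool: same server every time
```

Note that the modulo-based IP hash shown here remaps most clients whenever a server is added or removed; that limitation is a property of this sketch.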
Layer 4 vs Layer 7 Load Balancing
- Layer 4 (Transport) — Routes based on IP address and TCP/UDP port. Fast, but it cannot inspect request content.
- Layer 7 (Application) — Routes based on HTTP content (URL path, headers, cookies). More flexible, at the cost of parsing each request.
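The difference between the two layers comes down to what information the routing decision is allowed to read. The sketch below makes that concrete under stated assumptions: the `Request` type, the pool names, the path prefixes, and the `X-Canary` header are all hypothetical, and a real Layer 4 balancer works on packets or connections rather than on a parsed request object.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    client_ip: str
    dest_port: int
    path: str = "/"
    headers: dict = field(default_factory=dict)


# Hypothetical configuration, for illustration only.
L4_POOLS = {443: ["web-1", "web-2"], 5432: ["db-1"]}
L7_RULES = [("/api/", ["api-1", "api-2"]), ("/static/", ["static-1"])]
L7_DEFAULT = ["web-1", "web-2"]


def route_layer4(req):
    """Choose a pool from the destination port alone; path and headers are never read."""
    return L4_POOLS[req.dest_port]


def route_layer7(req):
    """Choose a pool by inspecting HTTP content: a header first, then the URL path."""
    if req.headers.get("X-Canary") == "1":
        return ["canary-1"]
    for prefix, pool in L7_RULES:
        if req.path.startswith(prefix):
            return pool
    return L7_DEFAULT


api_call = Request("198.51.100.4", 443, path="/api/users")
canary_call = Request("198.51.100.4", 443, path="/api/users", headers={"X-Canary": "1"})

l4_choice = route_layer4(api_call)        # same pool for every request on port 443
l7_choice = route_layer7(api_call)        # pool chosen by the /api/ prefix
l7_canary = route_layer7(canary_call)     # pool chosen by the header
```

Both requests arrive on port 443, so the Layer 4 function cannot tell them apart, while the Layer 7 function sends each to a different pool.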
Health Checks
Load balancers periodically run health checks against each backend server. If a server fails its checks, the load balancer removes it from rotation until it passes again, providing automatic failover.
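The remove-and-restore behaviour can be sketched as a small state machine. This is an illustrative sketch rather than any product's implementation: the `HealthChecker` class, the `probe` callback, and the two thresholds are assumptions made for the example. Requiring several consecutive results before changing state is a common way to avoid flapping on a single dropped probe, and the probe here is a plain function so the example runs without a network.

```python
class HealthChecker:
    """Track which servers are in rotation based on repeated probe results."""

    def __init__(self, servers, probe, fail_threshold=2, rise_threshold=2):
        self.probe = probe                      # callable: server -> bool
        self.fail_threshold = fail_threshold    # failures needed to remove a server
        self.rise_threshold = rise_threshold    # successes needed to restore it
        self.healthy = {s: True for s in servers}
        self._streak = {s: 0 for s in servers}  # consecutive results contradicting state

    def run_once(self):
        """Probe every server once and update its state if a threshold is reached."""
        for server in list(self.healthy):
            ok = self.probe(server)
            if ok == self.healthy[server]:
                self._streak[server] = 0        # result agrees with current state
                continue
            self._streak[server] += 1
            needed = self.rise_threshold if ok else self.fail_threshold
            if self._streak[server] >= needed:
                self.healthy[server] = ok
                self._streak[server] = 0

    def in_rotation(self):
        """Return the servers currently eligible to receive traffic."""
        return [s for s, up in self.healthy.items() if up]


# Simulated backend status stands in for a real TCP or HTTP probe.
status = {"app-1": True, "app-2": True}
checker = HealthChecker(list(status), probe=lambda server: status[server])

status["app-2"] = False
checker.run_once()
after_one_failure = checker.in_rotation()   # still in rotation: one failure is not enough
checker.run_once()
after_two_failures = checker.in_rotation()  # app-2 removed

status["app-2"] = True
checker.run_once()
checker.run_once()
after_recovery = checker.in_rotation()      # app-2 restored
```

In practice `run_once` would be called on a timer, and the selection algorithm would pick only from `in_rotation()`.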