Load Balancing — Core Concepts
Interview Relevance: Very High — Every scalable system needs a load balancer. Know the layers, algorithms, and trade-offs cold.
A load balancer distributes incoming traffic across multiple backend servers to ensure no single server becomes a bottleneck, improve fault tolerance, and enable horizontal scaling.
Why Load Balancing Matters
Hardware vs. Software Load Balancers
Comparison Table
| Factor | Hardware LB | Software LB |
|---|---|---|
| Examples | F5, Citrix ADC | Nginx, HAProxy, AWS ALB, Envoy |
| Cost | $50K–$500K+ | Free (or pay-per-use on cloud) |
| Throughput | Millions of req/sec (ASIC) | Hundreds of thousands req/sec |
| Scalability | Fixed capacity | Add more instances |
| Flexibility | Limited (proprietary) | Highly configurable |
| Deployment | Rack-mounted physical box | Container, VM, managed service |
| Use case | Telcos, banks, on-premise | Startups to hyperscalers, cloud |
Interview answer: "In a modern cloud-native system, I'd use a software load balancer — either a managed service like AWS ALB/NLB or self-hosted Nginx/HAProxy. Hardware LBs are rarely justified unless you're a telco or have extreme on-premise throughput requirements."
Layer 4 vs. Layer 7 Load Balancing
This is one of the most commonly tested load balancing concepts in FAANG interviews.
How Layer 4 Routes Traffic
How Layer 7 Routes Traffic
Layer 4 vs. Layer 7 Comparison
| Feature | L4 (Network LB) | L7 (Application LB) |
|---|---|---|
| OSI Layer | Transport (TCP/UDP) | Application (HTTP) |
| Inspects | IP + Port only | Full HTTP headers, URL, cookies |
| Routing basis | IP/port rules | URL path, host, headers, JWT |
| SSL termination | Pass-through (TLS stays end-to-end) | ✅ Yes (decrypts at LB) |
| Performance | ✅ Faster (no payload inspection) | More CPU per request (parses HTTP, may decrypt TLS) |
| Use case | Non-HTTP protocols, raw TCP | HTTP/HTTPS microservices |
| AWS equivalent | Network Load Balancer (NLB) | Application Load Balancer (ALB) |
| Examples | AWS NLB, HAProxy (TCP mode) | AWS ALB, Nginx, Envoy, Traefik |
When to use each:
- L4: Gaming servers (UDP), database proxies, raw TCP services, when you need the highest possible throughput
- L7: Web APIs, microservices routing, canary deployments, A/B testing, SSL termination
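The difference in what each layer can see is easy to show in a few lines. The sketch below is illustrative only: the `Request` fields, pool names, port numbers, and the `X-Canary` header are all hypothetical, not taken from any real load balancer.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    client_ip: str
    dest_port: int
    path: str = ""                                # visible to an L7 balancer only
    headers: dict = field(default_factory=dict)   # visible to an L7 balancer only


# Hypothetical routing tables for illustration.
L4_RULES = {5432: "postgres-pool", 7777: "game-pool"}
L7_RULES = [("/api/orders", "orders-service"), ("/api/users", "users-service")]


def route_l4(req):
    # An L4 balancer decides from addresses and ports; the payload is opaque to it.
    return L4_RULES.get(req.dest_port, "default-tcp-pool")


def route_l7(req):
    # An L7 balancer has parsed the HTTP request, so headers and paths are usable.
    if req.headers.get("X-Canary") == "true":
        return "canary-pool"
    for prefix, pool in L7_RULES:
        if req.path.startswith(prefix):
            return pool
    return "web-pool"


db_target = route_l4(Request("203.0.113.7", 5432))
api_target = route_l7(Request("203.0.113.7", 443, path="/api/orders/42"))
canary_target = route_l7(
    Request("203.0.113.7", 443, path="/api/users/1", headers={"X-Canary": "true"})
)
print(db_target, api_target, canary_target)
```

Path-based and header-based rules like these are what make canary deployments and A/B tests possible at L7; the L4 function could not express them because it never reads the request body.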
Load Balancing Algorithms
The algorithm determines which server gets the next request. Choosing the right one depends on your workload.
Algorithm 1 — Round Robin
The simplest algorithm. Requests are distributed sequentially across all servers in order.
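A minimal sketch of the idea, assuming a static list of equally sized servers (the class and server names are made up for illustration):

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed, repeating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def next_server(self):
        # Each call advances one position and wraps around after the last server.
        return next(self._servers)


balancer = RoundRobinBalancer(["s1", "s2", "s3"])
picks = [balancer.next_server() for _ in range(6)]
print(picks)  # ['s1', 's2', 's3', 's1', 's2', 's3']
```

Note that the balancer keeps no information about how busy each server is, which is the source of the drawbacks listed next.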
✅ Simple, predictable, works well when all servers are equal
❌ Ignores server load: a slow request on Server 1 doesn't stop new requests from being sent there
❌ Doesn't account for different server capacities
Algorithm 2 — Weighted Round Robin
Like Round Robin but servers with higher capacity receive proportionally more requests.
Distribution pattern: S1, S1, S1, S2, S2, S3, S1, S1, S1, S2, S2, S3...
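The pattern above corresponds to weights of 3:2:1, which is an assumption read off the sequence rather than something the text states. A simple way to reproduce it is to expand each server into `weight` slots and cycle through them:

```python
from itertools import cycle


def weighted_round_robin(weights):
    """Return an endless iterator that repeats each server `weight` times per cycle."""
    schedule = [name for name, weight in weights.items() for _ in range(weight)]
    if not schedule:
        raise ValueError("at least one server needs a positive weight")
    return cycle(schedule)


# Assumed weights, chosen to match the distribution pattern shown above.
picker = weighted_round_robin({"S1": 3, "S2": 2, "S3": 1})
pattern = [next(picker) for _ in range(12)]
print(pattern)
# ['S1', 'S1', 'S1', 'S2', 'S2', 'S3', 'S1', 'S1', 'S1', 'S2', 'S2', 'S3']
```

This naive expansion sends short bursts to the heaviest server. Production balancers often interleave the picks so that the same proportions hold without the bursts.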
✅ Accounts for heterogeneous server hardware
✅ Still simple to implement
❌ Still ignores real-time server load
Algorithm 3 — Least Connections
Routes the next request to the server with the fewest active connections at that moment.
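As a rough sketch, the balancer only needs a counter per server that goes up when a request starts and down when it finishes. The class and method names here are hypothetical:

```python
class LeastConnectionsBalancer:
    """Tracks active connections per server and picks the least busy one."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # min() returns the first server with the smallest count,
        # so ties go to whichever server was listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Called when a request or connection finishes.
        if self.active[server] > 0:
            self.active[server] -= 1


lb = LeastConnectionsBalancer(["s1", "s2", "s3"])
first = lb.acquire()    # 's1' (all idle, first listed wins)
second = lb.acquire()   # 's2'
third = lb.acquire()    # 's3'
lb.release(second)      # s2's request finishes early
fourth = lb.acquire()   # 's2' again, because it is now the least loaded
print(first, second, third, fourth)
```

The per-server counter is the connection state that the trade-off list below refers to.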
✅ Adapts to actual server load in real-time
✅ Excellent for long-lived connections (WebSockets, file uploads)
✅ Works well with heterogeneous request times
❌ Requires the LB to maintain connection state (slight overhead)
Algorithm 4 — IP Hash (Session Stickiness)
The client's IP address is hashed to always route them to the same server — guaranteeing session affinity.
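A minimal sketch of the mapping, assuming a fixed server list (the function name and addresses are illustrative). It uses a stable hash from `hashlib` because Python's built-in `hash()` is randomized per process for strings and would break affinity across restarts:

```python
import hashlib


def pick_server(client_ip, servers):
    """Map a client IP to one server deterministically."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["app-1", "app-2", "app-3"]
first_visit = pick_server("203.0.113.7", servers)
second_visit = pick_server("203.0.113.7", servers)
print(first_visit == second_visit)  # True: the same IP always lands on the same server
```

Because the index depends on `len(servers)`, changing the pool size changes the mapping for many clients, which is one reason the guarantee only holds while the server list is stable.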
✅ Ensures a user always hits the same server
✅ Required for stateful apps with in-memory sessions
❌ Uneven distribution if some IPs have many more users
❌ If a server dies, all its users are rehashed → sessions lost
❌ Bad with NAT (many users share one IP → same server)
Algorithm 5 — Consistent Hashing ⭐ (Most Important for Interviews)
Consistent hashing distributes requests across servers using a virtual ring, so when a server is added or removed, only a fraction of keys need to be remapped.
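One common way to build such a ring is a sorted list of hash positions searched with binary search. The sketch below is an illustration under assumptions, not a reference implementation: the class name, the MD5-based hash, the node names, and the choice of 100 positions per server are all arbitrary.

```python
import bisect
import hashlib


def ring_hash(value):
    # Stable 64-bit position on the ring, derived from MD5 (any stable hash works).
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # ring positions per physical server
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> server name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            point = ring_hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def get_node(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # Walk clockwise to the first position after the key, wrapping at the end.
        idx = bisect.bisect_right(self._points, ring_hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


keys = [f"key-{i}" for i in range(10_000)]
ring = HashRing(["cache-1", "cache-2", "cache-3"])
before = {k: ring.get_node(k) for k in keys}
ring.add_node("cache-4")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.0%} of keys moved, all of them to cache-4")
```

Every key that moves lands on the new server; keys owned by the other three servers stay where they were, which is why a cache in front of them stays warm.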
Why Consistent Hashing is Superior
Simple Hash (hash(key) mod N):
N=3 servers. Server 4 added → N=4
hash(key) mod 3 ≠ hash(key) mod 4
→ ~75% of keys remapped here (N/(N+1) in general, so almost ALL keys at larger N)
→ Catastrophic cache invalidation at scale
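The size of the problem can be measured directly. A key keeps its server only when `hash mod 3` equals `hash mod 4`, which holds for 3 out of every 12 hash values, so roughly three quarters of keys should move. The helper names and key format below are made up for the experiment:

```python
import hashlib


def stable_hash(key):
    # Stable across runs, unlike the built-in hash() for strings.
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


def remapped_fraction(keys, old_n, new_n):
    """Fraction of keys whose server index changes when the pool is resized."""
    moved = sum(1 for k in keys if stable_hash(k) % old_n != stable_hash(k) % new_n)
    return moved / len(keys)


sample_keys = [f"user:{i}" for i in range(100_000)]
fraction = remapped_fraction(sample_keys, old_n=3, new_n=4)
print(f"{fraction:.1%} of keys change servers when going from 3 to 4")
```
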
Consistent Hashing:
N=3 servers → N=4 (add Server 4)
Only ~25% of keys remapped (about 1/N of the ring, with N = 4 after the add)
→ Minimal disruption ✅
→ Cache stays warm for 75% of keys ✅
Virtual Nodes — Solving Uneven Distribution
Virtual Nodes: Each physical server is assigned multiple positions (virtual nodes) on the ring. Cassandra defaulted to 256 vnodes per node for years; version 4.0 lowered the default to 16.
Result: Even distribution, even with servers of different capacities (give powerful servers more vnodes).
Algorithm Decision Guide
Nginx vs. HAProxy — The Two Giants
Both are the industry-standard open-source software load balancers. Here's when to reach for each:
| Feature | Nginx | HAProxy |
|---|---|---|
| Primary role | Web server + LB + reverse proxy | Pure load balancer + TCP proxy |
| Static file serving | ✅ Excellent | ❌ Not designed for it |
| L4 TCP balancing | Supported (stream module), fewer features | ✅ Excellent |
| L7 HTTP routing | ✅ Full feature | ✅ Full feature |
| WebSocket support | ✅ Yes | ✅ Yes |
| SSL termination | ✅ Native | ✅ Native |
| Health checks | Passive only in open source (active in NGINX Plus) | ✅ Advanced (active + passive) |
| Stats dashboard | Basic counters via stub_status (dashboard in NGINX Plus) | ✅ Built-in |
| Config syntax | Block-based | Line-based |
| Best for | HTTP workloads, static + dynamic | High-throughput TCP, databases |
High Availability: Making the LB Itself Fault-Tolerant
The Load Balancer itself can become a SPOF. Here's how to eliminate that:
Worked Example: Load Balancing for a URL Shortener
Revisiting our URL Shortener (1,200 writes/sec, 116,000 reads/sec):
Decision summary:
| Layer | Algorithm | Reason |
|---|---|---|
| ALB (HTTP) | Round Robin | Stateless app servers, equal specs |
| Redis Cluster | Consistent Hashing | Minimize cache invalidation on scale-out |
| Cassandra | Consistent Hashing (vnodes) | Even data distribution across nodes |
Interview Cheat Sheet
One-Line Summaries
Round Robin: Sequential distribution — simple, works for equal servers
Weighted R.R.: Like Round Robin but servers get proportional traffic by weight
Least Connections: New request → server with fewest active connections (best for variable latency)
IP Hash: Same IP always hits same server (session affinity, but inflexible)
Consistent Hashing: Hash ring — add/remove servers with minimal remapping (~1/N keys move)
Layer 4 LB: Routes on IP+port — fast, protocol-agnostic, no content inspection
Layer 7 LB: Routes on HTTP content — smart routing, SSL termination, canary deploys
The Three Interview Phrases
ON CHOOSING AN ALGORITHM:
"The app servers are stateless so I'll use Round Robin on the ALB —
simple and it distributes evenly. For the Redis cluster, I'll use
Consistent Hashing so that adding a cache node only remaps ~1/N of
keys instead of invalidating the whole cache."
ON LAYER 4 vs LAYER 7:
"I'd use a Layer 7 load balancer (AWS ALB) for the HTTP API tier
because I need SSL termination and URL path-based routing to
different microservices. For the database connection pool, I'd
use Layer 4 (AWS NLB) since it's raw TCP and I want the lowest
possible latency."
ON HIGH AVAILABILITY:
"The load balancer itself is a potential SPOF. In AWS I'd use ALB
which is a managed service with multi-AZ redundancy built in.
For self-hosted Nginx/HAProxy, I'd run an Active-Passive pair
with Keepalived and VRRP to fail over a Virtual IP within a few seconds."
Red Flags vs. Green Flags
| 🔴 Red Flag | 🟢 Green Flag |
|---|---|
| Use Round Robin for stateful apps | Use IP Hash or cookie-based sticky sessions for session affinity |
| Forget the LB is a SPOF | Always discuss HA pair or managed LB (ALB) |
| Use a single LB type for everything | L7 for HTTP services, L4 for DB proxies and TCP services |
| Say "consistent hashing" without explaining why | Explain: adding or removing a node remaps only ~1/N of keys |
| Ignore virtual nodes | Know that vnodes fix the uneven distribution problem |
| Mix up L4 and L7 | L4 = IP+port only; L7 = HTTP headers, URL, cookies |
TIP
In a FAANG interview, mentioning consistent hashing with virtual nodes for cache and DB clusters is a strong senior signal. Most candidates only know Round Robin.
IMPORTANT
Always mention health checks. A load balancer without health checks sends traffic to dead servers. Say: "The ALB performs active health checks every 30 seconds on /health. A server failing 3 consecutive checks is removed from the pool."
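The removal rule in that sentence can be sketched as a small failure counter. This is a simplified model, not how ALB is implemented: the class name is hypothetical, the probe result is passed in by the caller (who would hit /health on a 30-second timer), and a single success is assumed to restore a server, whereas real balancers usually require several consecutive successes.

```python
class HealthTracker:
    """Removes a server from rotation after N consecutive failed checks."""

    def __init__(self, servers, unhealthy_threshold=3):
        self.unhealthy_threshold = unhealthy_threshold
        self.failures = {server: 0 for server in servers}

    def record(self, server, ok):
        # A passing check resets the streak; a failing one extends it.
        self.failures[server] = 0 if ok else self.failures[server] + 1

    def healthy_servers(self):
        return [s for s, n in self.failures.items() if n < self.unhealthy_threshold]


tracker = HealthTracker(["app-1", "app-2"])
for _ in range(3):
    tracker.record("app-2", ok=False)   # three failed /health probes in a row
pool_after_failures = tracker.healthy_servers()

tracker.record("app-2", ok=True)        # the server recovers
pool_after_recovery = tracker.healthy_servers()
print(pool_after_failures, pool_after_recovery)
# ['app-1'] ['app-1', 'app-2']
```

Any of the selection algorithms above would then choose only from `healthy_servers()` rather than from the full server list.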
