Load Balancing — Core Concepts

Interview Relevance: Very High — Every scalable system needs a load balancer. Know the layers, algorithms, and trade-offs cold.

A load balancer distributes incoming traffic across multiple backend servers to ensure no single server becomes a bottleneck, improve fault tolerance, and enable horizontal scaling.

Why Load Balancing Matters

Hardware vs. Software Load Balancers

Comparison Table

Factor	Hardware LB	Software LB
Examples	F5, Citrix ADC	Nginx, HAProxy, AWS ALB, Envoy
Cost	$50K–$500K+	Free (or pay-per-use on cloud)
Throughput	Millions of req/sec (ASIC)	Hundreds of thousands req/sec
Scalability	Fixed capacity	Add more instances
Flexibility	Limited (proprietary)	Highly configurable
Deployment	Rack-mounted physical box	Container, VM, managed service
Use case	Telcos, banks, on-premise	Startups to hyperscalers, cloud

Interview answer: "In a modern cloud-native system, I'd use a software load balancer — either a managed service like AWS ALB/NLB or self-hosted Nginx/HAProxy. Hardware LBs are rarely justified unless you're a telco or have extreme on-premise throughput requirements."

Layer 4 vs. Layer 7 Load Balancing

This is one of the most commonly tested load balancing concepts in FAANG interviews.

How Layer 4 Routes Traffic

How Layer 7 Routes Traffic

Layer 4 vs. Layer 7 Comparison

Feature	L4 (Network LB)	L7 (Application LB)
OSI Layer	Transport (TCP/UDP)	Application (HTTP)
Inspects	IP + Port only	Full HTTP headers, URL, cookies
Routing basis	IP/port rules	URL path, host, headers, JWT
SSL termination	Typically pass-through (TLS stays end-to-end); some L4 LBs such as AWS NLB can also terminate TLS	✅ Yes (decrypts at LB)
Performance	✅ Faster (no payload inspection)	More CPU per request (parses HTTP)
Use case	Non-HTTP protocols, raw TCP	HTTP/HTTPS microservices
AWS equivalent	Network Load Balancer (NLB)	Application Load Balancer (ALB)
Examples	AWS NLB, HAProxy (TCP mode)	AWS ALB, Nginx, Envoy, Traefik

When to use each:
L4: Gaming servers (UDP), database proxies, raw TCP services, when you need the highest possible throughput
L7: Web APIs, microservices routing, canary deployments, A/B testing, SSL termination
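
The difference in what each layer can see is easier to remember with a small sketch. The one below is illustrative only: the `Connection` and `HttpRequest` types, the pool names, and the prefix-matching rule are all assumptions made for the example, not the behavior of any specific load balancer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """What an L4 balancer sees: addresses and ports, no payload."""
    src_ip: str
    src_port: int
    dst_port: int


@dataclass(frozen=True)
class HttpRequest:
    """What an L7 balancer sees after parsing the HTTP request."""
    host: str
    path: str


def route_l4(conn, pools_by_port):
    # The decision uses only transport-level fields; the payload is never parsed.
    return pools_by_port[conn.dst_port]


def route_l7(request, path_rules, default_pool):
    # The first matching path prefix wins; anything else goes to the default pool.
    for prefix, pool in path_rules:
        if request.path.startswith(prefix):
            return pool
    return default_pool


# Hypothetical pools, for illustration only.
rules = [("/api/users", "users-service"), ("/api/orders", "orders-service")]
print(route_l4(Connection("198.51.100.4", 50123, 5432), {5432: "postgres-pool"}))
print(route_l7(HttpRequest("example.com", "/api/orders/42"), rules, "web-default"))
```

The L4 function could not implement path-based routing even if it wanted to, because the path is not part of its input. That is the trade-off in one line.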

Load Balancing Algorithms

The algorithm determines which server gets the next request. Choosing the right one depends on your workload.

Algorithm 1 — Round Robin

The simplest algorithm. Requests are distributed sequentially across all servers in order.

✅ Simple, predictable, works well when all servers are equal
❌ Ignores server load — a slow request on Server 1 doesn't stop
   new requests from being sent there
❌ Doesn't account for different server capacities
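
A minimal sketch of the rotation, assuming three equal servers named S1 to S3 (the class name is hypothetical):

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed, repeating order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(list(servers))

    def next_server(self):
        # Load is ignored entirely: each call just advances the rotation.
        return next(self._rotation)


balancer = RoundRobinBalancer(["S1", "S2", "S3"])
picks = [balancer.next_server() for _ in range(6)]
print(picks)  # ['S1', 'S2', 'S3', 'S1', 'S2', 'S3']
```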

Algorithm 2 — Weighted Round Robin

Like Round Robin but servers with higher capacity receive proportionally more requests.

Distribution pattern with weights S1=3, S2=2, S3=1: S1, S1, S1, S2, S2, S3, S1, S1, S1, S2, S2, S3...

✅ Accounts for heterogeneous server hardware
✅ Still simple to implement
❌ Still ignores real-time server load
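
One simple way to get this behavior is to repeat each server in the rotation as many times as its weight. The sketch below assumes weights of 3, 2, and 1, which is what the distribution pattern above implies. Production balancers often interleave the picks more smoothly rather than sending bursts to one server, so treat this as the naive version.

```python
from itertools import cycle


def weighted_round_robin(weights):
    """Return an endless iterator that repeats each server `weight` times per cycle."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    if not expanded:
        raise ValueError("at least one server needs a positive weight")
    return cycle(expanded)


# Assumed weights: S1 is three times as powerful as S3.
rotation = weighted_round_robin({"S1": 3, "S2": 2, "S3": 1})
first_cycle = [next(rotation) for _ in range(6)]
print(first_cycle)  # ['S1', 'S1', 'S1', 'S2', 'S2', 'S3']
```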

Algorithm 3 — Least Connections

Routes the next request to the server with the fewest active connections at that moment.

✅ Adapts to actual server load in real-time
✅ Excellent for long-lived connections (WebSockets, file uploads)
✅ Works well with heterogeneous request times
❌ Requires the LB to maintain connection state (slight overhead)
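
The connection state the LB has to keep is just a counter per server. A minimal sketch, with a hypothetical class name and ties broken by the order in which servers were listed:

```python
class LeastConnectionsBalancer:
    """Tracks active connections per server and picks the least busy one."""

    def __init__(self, servers):
        self._active = {server: 0 for server in servers}

    def acquire(self):
        # min() returns the first server with the lowest count,
        # so ties go to whichever server was listed first.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server):
        # Call this when the connection closes.
        if self._active[server] > 0:
            self._active[server] -= 1


lb = LeastConnectionsBalancer(["S1", "S2", "S3"])
a, b, c = lb.acquire(), lb.acquire(), lb.acquire()  # one connection each
lb.release(b)                                       # S2 finishes first
d = lb.acquire()                                    # goes back to S2
print(a, b, c, d)  # S1 S2 S3 S2
```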

Algorithm 4 — IP Hash (Session Stickiness)

The client's IP address is hashed so that requests from the same IP keep going to the same server. This gives session affinity for as long as the client's IP and the server pool stay unchanged.

✅ Ensures a user always hits the same server
✅ Useful for stateful apps with in-memory sessions (cookie-based stickiness is the usual alternative)
❌ Uneven distribution if some IPs have many more users
❌ If a server dies, its users lose their sessions, and with a plain hash mod N most other users are remapped too
❌ Bad with NAT (many users share one IP → same server)
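
A minimal sketch of the mapping, assuming a plain hash mod N over a fixed server list. It uses SHA-256 rather than Python's built-in `hash()`, because string hashes from `hash()` differ between processes and would break stickiness across restarts. The function name and the example address are illustrative.

```python
import hashlib


def pick_server_by_ip(client_ip, servers):
    """Map a client IP to one server; the same IP and list always give the same result."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["S1", "S2", "S3"]
first = pick_server_by_ip("203.0.113.7", servers)
second = pick_server_by_ip("203.0.113.7", servers)
print(first == second)  # True
```

Because the index depends on `len(servers)`, changing the pool size changes the mapping for most clients, which is the weakness consistent hashing addresses.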

Algorithm 5 — Consistent Hashing ⭐ (Most Important for Interviews)

Consistent hashing distributes requests across servers using a virtual ring, so when a server is added or removed, only a fraction of keys need to be remapped.

Why Consistent Hashing is Superior

Simple Hash (hash(key) mod N):
  N=3 servers. Server 4 added → N=4
  hash(key) mod 3 ≠ hash(key) mod 4 for most keys
  → ~75% of keys remapped here, approaching 100% as N grows
  → Catastrophic cache invalidation at scale

Consistent Hashing:
  N=3 servers → N=4 (add Server 4)
  Only ~25% of keys remapped (about 1/N, where N=4 after the add)
  → Minimal disruption ✅
  → Cache stays warm for 75% of keys ✅
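
The size of the modulo reshuffle is easy to check directly. The sketch below uses the integers 0 to 11,999 as stand-ins for hash values, which is an assumption made to keep the arithmetic exact. A key stays put only when its value gives the same remainder mod 3 and mod 4, which happens for 3 out of every 12 consecutive values.

```python
def remapped_fraction(num_keys, old_n, new_n):
    """Fraction of keys whose server changes when going from old_n to new_n servers."""
    moved = sum(1 for key in range(num_keys) if key % old_n != key % new_n)
    return moved / num_keys


fraction = remapped_fraction(12_000, 3, 4)
print(fraction)  # 0.75
```

Three quarters of the keys move for a 3 to 4 change, and the share climbs toward 100% as the server count grows. Consistent hashing moves roughly one quarter for the same change.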

Virtual Nodes — Solving Uneven Distribution

Virtual Nodes: Each physical server is assigned multiple positions
(virtual nodes) on the ring. Cassandra defaulted to 256 vnodes per node before version 4.0 and to 16 since.

Result: A balanced distribution, including across servers of different capacities
(give powerful servers more vnodes).
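
A hash ring with virtual nodes fits in a few dozen lines. This is a sketch under stated assumptions, not how any particular database or proxy implements it: the `HashRing` class, the `server#i` naming for vnode positions, the choice of SHA-256, and the default of 100 vnodes per server are all illustrative.

```python
import bisect
import hashlib


def _hash(value):
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, vnodes_per_server=100):
        self._vnodes = vnodes_per_server
        self._points = []  # sorted vnode positions
        self._owner = {}   # position -> server name

    def add_server(self, server, weight=1):
        # A higher weight places more vnodes, so the server owns more of the ring.
        for i in range(self._vnodes * weight):
            point = _hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def get_server(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # Walk clockwise to the first vnode past the key, wrapping at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing()
for name in ("S1", "S2", "S3"):
    ring.add_server(name)

keys = [f"key-{i}" for i in range(10_000)]
before = {key: ring.get_server(key) for key in keys}
ring.add_server("S4")
after = {key: ring.get_server(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
print(f"{len(moved) / len(keys):.1%} of keys moved, all of them to S4")
```

The property worth pointing out in an interview is that every key that moves lands on the new server. Nothing is shuffled between the existing three, so their caches stay warm.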

Algorithm Decision Guide

Nginx vs. HAProxy — The Two Giants

Both are industry-standard open-source software load balancers. Here's when to reach for each:

Feature	Nginx	HAProxy
Primary role	Web server + LB + reverse proxy	Pure load balancer + TCP proxy
Static file serving	✅ Excellent	❌ Not designed for it
L4 TCP balancing	Supported via the stream module, with fewer TCP features	✅ Excellent
L7 HTTP routing	✅ Full feature	✅ Full feature
WebSocket support	✅ Yes	✅ Yes
SSL termination	✅ Native	✅ Native
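Health checks	Passive in open source; active checks require NGINX Plus	✅ Advanced (active + passive)
Stats dashboard	Basic counters via stub_status; live dashboard in NGINX Plus	✅ Built-in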
Config syntax	Block-based	Line-based
Best for	HTTP workloads, static + dynamic	High-throughput TCP, databases

High Availability: Making the LB Itself Fault-Tolerant

The Load Balancer itself can become a SPOF. Here's how to eliminate that:

Worked Example: Load Balancing for a URL Shortener

Revisiting our URL Shortener (1,200 writes/sec, 116,000 reads/sec):

Decision summary:

Layer	Algorithm	Reason
ALB (HTTP)	Round Robin	Stateless app servers, equal specs
Redis Cluster	Consistent Hashing	Minimize cache invalidation on scale-out
Cassandra	Consistent Hashing (vnodes)	Even data distribution across nodes

Interview Cheat Sheet

One-Line Summaries

Round Robin:        Sequential distribution — simple, works for equal servers
Weighted R.R.:      Like Round Robin but servers get proportional traffic by weight
Least Connections:  New request → server with fewest active connections (best for variable latency)
IP Hash:            Same IP always hits same server (session affinity, but inflexible)
Consistent Hashing: Hash ring — add/remove servers with minimal remapping (~1/N keys move)
Layer 4 LB:         Routes on IP+port — fast, protocol-agnostic, no content inspection
Layer 7 LB:         Routes on HTTP content — smart routing, SSL termination, canary deploys

The Three Interview Phrases

ON CHOOSING AN ALGORITHM:
"The app servers are stateless so I'll use Round Robin on the ALB —
 simple and it distributes evenly. For the Redis cluster, I'll use
 Consistent Hashing so that adding a cache node only remaps ~1/N of
 keys instead of invalidating the whole cache."

ON LAYER 4 vs LAYER 7:
"I'd use a Layer 7 load balancer (AWS ALB) for the HTTP API tier
 because I need SSL termination and URL path-based routing to
 different microservices. For the database connection pool, I'd
 use Layer 4 (AWS NLB) since it's raw TCP and I want the lowest
 possible latency."

ON HIGH AVAILABILITY:
"The load balancer itself is a potential SPOF. In AWS I'd use ALB
 which is a managed service with multi-AZ redundancy built in.
 For self-hosted Nginx/HAProxy, I'd run an Active-Passive pair
 with Keepalived and VRRP to fail over a Virtual IP within a few seconds."

Red Flags vs. Green Flags

🔴 Red Flag	🟢 Green Flag
Use Round Robin for stateful apps	Use IP Hash or cookie-based sticky sessions for session affinity
Forget the LB is a SPOF	Always discuss HA pair or managed LB (ALB)
Use a single LB type for everything	L7 for HTTP services, L4 for DB proxies and TCP services
Say "consistent hashing" without explaining why	Explain: adding or removing a node remaps only ~1/N of the keys
Ignore virtual nodes	Know that vnodes fix the uneven distribution problem
Mix up L4 and L7	L4 = IP+port only; L7 = HTTP headers, URL, cookies

TIP

In a FAANG interview, mentioning consistent hashing with virtual nodes for cache and DB clusters is a strong senior signal. Most candidates only know Round Robin.

IMPORTANT

Always mention health checks. A load balancer without health checks sends traffic to dead servers. Say: "I'd configure the ALB to run active health checks on /health every 30 seconds and remove a server from the pool after 3 consecutive failed checks."
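
The removal rule in that sample answer comes down to a consecutive-failure counter per server. The sketch below is a simplification with a hypothetical `HealthTracker` class: it readmits a server on its first passing check, whereas real load balancers usually also require several consecutive successes before sending traffic again.

```python
class HealthTracker:
    """Removes a server from rotation after N consecutive failed checks."""

    def __init__(self, servers, unhealthy_threshold=3):
        self._threshold = unhealthy_threshold
        self._failures = {server: 0 for server in servers}

    def record(self, server, check_passed):
        # A passing check resets the streak; a failing one extends it.
        self._failures[server] = 0 if check_passed else self._failures[server] + 1

    def healthy_servers(self):
        return [s for s, fails in self._failures.items() if fails < self._threshold]


tracker = HealthTracker(["S1", "S2"])
for _ in range(3):
    tracker.record("S2", check_passed=False)
print(tracker.healthy_servers())  # ['S1']

tracker.record("S2", check_passed=True)
print(tracker.healthy_servers())  # ['S1', 'S2']
```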
