Load Balancing — Core Concepts

Interview Relevance: Very High — Every scalable system needs a load balancer. Know the layers, algorithms, and trade-offs cold.

A load balancer distributes incoming traffic across multiple backend servers to ensure no single server becomes a bottleneck, improve fault tolerance, and enable horizontal scaling.


Why Load Balancing Matters


Hardware vs. Software Load Balancers

Comparison Table

| Factor | Hardware LB | Software LB |
| --- | --- | --- |
| Examples | F5, Citrix ADC | Nginx, HAProxy, AWS ALB, Envoy |
| Cost | $50K–$500K+ | Free (or pay-per-use on cloud) |
| Throughput | Millions of req/sec (ASIC) | Hundreds of thousands of req/sec |
| Scalability | Fixed capacity | Add more instances |
| Flexibility | Limited (proprietary) | Highly configurable |
| Deployment | Rack-mounted physical box | Container, VM, managed service |
| Use case | Telcos, banks, on-premise | Startups to hyperscalers, cloud |

Interview answer: "In a modern cloud-native system, I'd use a software load balancer — either a managed service like AWS ALB/NLB or self-hosted Nginx/HAProxy. Hardware LBs are rarely justified unless you're a telco or have extreme on-premise throughput requirements."


Layer 4 vs. Layer 7 Load Balancing

This is one of the most commonly tested load balancing concepts in FAANG interviews.

How Layer 4 Routes Traffic

How Layer 7 Routes Traffic

Layer 4 vs. Layer 7 Comparison

| Feature | L4 (Network LB) | L7 (Application LB) |
| --- | --- | --- |
| OSI Layer | Transport (TCP/UDP) | Application (HTTP) |
| Inspects | IP + Port only | Full HTTP headers, URL, cookies |
| Routing basis | IP/port rules | URL path, host, headers, JWT |
| SSL termination | Usually pass-through (TLS stays end-to-end) | ✅ Yes (decrypts at LB) |
| Performance | ✅ Faster (no payload inspection) | Slightly more CPU (parses each request) |
| Use case | Non-HTTP protocols, raw TCP | HTTP/HTTPS microservices |
| AWS equivalent | Network Load Balancer (NLB) | Application Load Balancer (ALB) |
| Examples | AWS NLB, HAProxy (TCP mode) | AWS ALB, Nginx, Envoy, Traefik |

When to use each:

  • L4: Gaming servers (UDP), database proxies, raw TCP services, when you need the highest possible throughput
  • L7: Web APIs, microservices routing, canary deployments, A/B testing, SSL termination
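
To make the difference concrete, the sketch below contrasts what each layer can base a routing decision on. It is only an illustration: the backend addresses, the route table, and the function names `route_l4` and `route_l7` are hypothetical, and real load balancers do this in the network data path rather than in application code.

```python
import hashlib

# Hypothetical backend pools, for illustration only.
TCP_BACKENDS = ["10.0.0.1:5432", "10.0.0.2:5432"]
HTTP_ROUTES = {  # the longest matching path prefix wins
    "/api/users": "users-service",
    "/api/orders": "orders-service",
    "/": "web-frontend",
}

def route_l4(src_ip: str, src_port: int, dst_port: int) -> str:
    """L4: only addresses and ports are visible, so hash the connection tuple."""
    key = f"{src_ip}:{src_port}->{dst_port}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return TCP_BACKENDS[digest % len(TCP_BACKENDS)]

def route_l7(path: str) -> str:
    """L7: the HTTP request is parsed, so the URL path can drive the decision."""
    matches = [prefix for prefix in HTTP_ROUTES if path.startswith(prefix)]
    return HTTP_ROUTES[max(matches, key=len)]

print(route_l7("/api/users/42"))   # users-service
print(route_l7("/index.html"))     # web-frontend
print(route_l4("203.0.113.7", 51514, 5432))
```

The L4 function never sees the request body or URL, which is why it works for any TCP or UDP protocol; the L7 function needs the parsed request, which is what enables path-based routing and canary splits.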

Load Balancing Algorithms

The algorithm determines which server gets the next request. Choosing the right one depends on your workload.

Algorithm 1 — Round Robin

The simplest algorithm. Requests are distributed sequentially across all servers in order.

✅ Simple, predictable, works well when all servers are equal
❌ Ignores server load — a slow request on Server 1 doesn't stop
   new requests from being sent there
❌ Doesn't account for different server capacities
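
A minimal sketch of the rotation, assuming three equal servers with the placeholder names S1 to S3:

```python
from itertools import cycle

servers = ["S1", "S2", "S3"]   # hypothetical server names
next_server = cycle(servers)   # endlessly repeats S1, S2, S3, S1, ...

picks = [next(next_server) for _ in range(6)]
print(picks)  # ['S1', 'S2', 'S3', 'S1', 'S2', 'S3']
```

Note that nothing in the loop looks at how busy a server is, which is exactly the weakness listed above.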

Algorithm 2 — Weighted Round Robin

Like Round Robin but servers with higher capacity receive proportionally more requests.

Distribution pattern (weights S1=3, S2=2, S3=1): S1, S1, S1, S2, S2, S3, S1, S1, S1, S2, S2, S3...

✅ Accounts for heterogeneous server hardware
✅ Still simple to implement
❌ Still ignores real-time server load
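
The pattern above can be reproduced with a short sketch. The weights 3, 2, and 1 are assumed from the distribution pattern, and `weighted_round_robin` is a hypothetical helper, not an API of any particular load balancer.

```python
from itertools import cycle

weights = {"S1": 3, "S2": 2, "S3": 1}  # assumed weights

def weighted_round_robin(weights):
    """Return an iterator that repeats each server `weight` times per cycle."""
    schedule = [name for name, w in weights.items() for _ in range(w)]
    return cycle(schedule)

picker = weighted_round_robin(weights)
first_cycle = [next(picker) for _ in range(6)]
print(first_cycle)  # ['S1', 'S1', 'S1', 'S2', 'S2', 'S3']
```

This naive version sends bursts of consecutive requests to the heaviest server. Production implementations commonly interleave the picks (a "smooth" weighted round robin) while keeping the same overall proportions.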

Algorithm 3 — Least Connections

Routes the next request to the server with the fewest active connections at that moment.

✅ Adapts to actual server load in real-time
✅ Excellent for long-lived connections (WebSockets, file uploads)
✅ Works well with heterogeneous request times
❌ Requires the LB to maintain connection state (slight overhead)
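
The connection state mentioned above can be as small as one counter per server. The following is a sketch under that assumption; the class name `LeastConnections` and its methods are illustrative.

```python
class LeastConnections:
    """Track in-flight connections per server and pick the least loaded one."""

    def __init__(self, servers):
        self.active = {name: 0 for name in servers}

    def acquire(self):
        # min() returns the first server with the lowest count,
        # so ties go to the earliest listed server.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the connection closes.
        self.active[server] -= 1

lb = LeastConnections(["S1", "S2", "S3"])
a = lb.acquire()   # S1
b = lb.acquire()   # S2
c = lb.acquire()   # S3
lb.release(b)      # S2 finishes early
d = lb.acquire()   # S2 again, because it now has the fewest connections
print(a, b, c, d)  # S1 S2 S3 S2
```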

Algorithm 4 — IP Hash (Session Stickiness)

The client's IP address is hashed so that the same client is always routed to the same server, which gives session affinity for as long as the server pool and the client's IP stay unchanged.

✅ Ensures a user always hits the same server
✅ Required for stateful apps with in-memory sessions
❌ Uneven distribution if some IPs have many more users
❌ If a server dies, its users (and, with plain modulo hashing, many others) are rehashed → sessions lost
❌ Bad with NAT (many users share one IP → same server)
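
A minimal sketch of IP hashing, assuming a plain "hash modulo pool size" scheme. The function name `pick_by_ip` and the synthetic client addresses are hypothetical. `hashlib` is used instead of Python's built-in `hash()`, because the latter is randomized per process for strings.

```python
import hashlib

def pick_by_ip(client_ip, servers):
    """Map a client IP to a server with a stable hash modulo the pool size."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]

pool = ["S1", "S2", "S3"]
client = "198.51.100.23"
same_server = pick_by_ip(client, pool) == pick_by_ip(client, pool)  # always True

# Simulate losing S3: how many of 1,000 synthetic clients land on a different server?
clients = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]
survivors = ["S1", "S2"]
moved = sum(pick_by_ip(ip, pool) != pick_by_ip(ip, survivors) for ip in clients)
print(f"{moved / len(clients):.0%} of clients changed server")
```

With modulo hashing, shrinking the pool from three servers to two is expected to move about two thirds of clients, not only those that were on the failed server. That is the problem consistent hashing addresses.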

Algorithm 5 — Consistent Hashing ⭐ (Most Important for Interviews)

Consistent hashing distributes requests across servers using a virtual ring, so when a server is added or removed, only a fraction of keys need to be remapped.

Why Consistent Hashing is Superior

Simple Hash (hash(key) mod N):
  N=3 servers. Server 4 added → N=4
  hash(key) mod 3 ≠ hash(key) mod 4 for most keys
  → ~75% of keys remapped here (N/(N+1) in general)
  → Catastrophic cache invalidation at scale

Consistent Hashing:
  N=3 servers → N=4 (add Server 4)
  Only ~25% of keys remapped (about 1/N of the ring, with N = 4 after the add)
  → Minimal disruption ✅
  → Cache stays warm for 75% of keys ✅
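
The comparison above can be checked with a small experiment. This is a sketch, not a production ring: `HashRing`, `mod_n_lookup`, the key names, and the choice of 100 ring positions per server are all assumptions made for illustration (placing each server at several positions is the virtual-node idea covered in the next section).

```python
import bisect
import hashlib

def h(value: str) -> int:
    """Stable 64-bit hash of a string."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

def mod_n_lookup(key, servers):
    return servers[h(key) % len(servers)]

class HashRing:
    def __init__(self, servers, vnodes=100):
        # Each server is placed at `vnodes` pseudo-random points on the ring.
        self.points = sorted((h(f"{s}#{i}"), s) for s in servers for i in range(vnodes))
        self.hashes = [p for p, _ in self.points]

    def lookup(self, key):
        # First point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self.hashes, h(key)) % len(self.points)
        return self.points[idx][1]

keys = [f"key-{i}" for i in range(10_000)]
old, new = ["S1", "S2", "S3"], ["S1", "S2", "S3", "S4"]

moved_mod = sum(mod_n_lookup(k, old) != mod_n_lookup(k, new) for k in keys) / len(keys)

ring_old, ring_new = HashRing(old), HashRing(new)
moved_keys = [k for k in keys if ring_old.lookup(k) != ring_new.lookup(k)]
moved_ring = len(moved_keys) / len(keys)

print(f"mod N:     {moved_mod:.0%} of keys moved")
print(f"hash ring: {moved_ring:.0%} of keys moved")
```

Two properties are worth noticing: the ring moves far fewer keys than modulo hashing, and every key that does move goes to the new server S4, so the existing servers never trade keys among themselves.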

Virtual Nodes — Solving Uneven Distribution

Virtual Nodes: Each physical server is assigned multiple positions
(virtual nodes) on the ring. Cassandra, for example, defaulted to 256
vnodes per node before version 4.0 and to 16 since.

Result: Balanced distribution, including across servers of different capacities
(give powerful servers more vnodes).
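
A rough way to see the effect is to count how many keys each server owns for different vnode settings. The helper `key_shares`, the server names, and the vnode counts below are hypothetical values chosen for the sketch.

```python
import bisect
import hashlib
from collections import Counter

def h(value: str) -> int:
    """Stable 64-bit hash of a string."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

def key_shares(vnode_counts, num_keys=20_000):
    """Return each server's share of keys, given {server: number_of_vnodes}."""
    points = sorted((h(f"{s}#{i}"), s) for s, n in vnode_counts.items() for i in range(n))
    hashes = [p for p, _ in points]
    owners = Counter()
    for k in range(num_keys):
        # The key belongs to the first vnode clockwise from its hash.
        idx = bisect.bisect_right(hashes, h(f"key-{k}")) % len(points)
        owners[points[idx][1]] += 1
    return {s: owners[s] / num_keys for s in vnode_counts}

single = key_shares({"S1": 1, "S2": 1, "S3": 1})
even = key_shares({"S1": 200, "S2": 200, "S3": 200})
weighted = key_shares({"big": 400, "small-1": 200, "small-2": 200})

for label, shares in [("1 vnode each", single), ("200 vnodes each", even), ("weighted", weighted)]:
    print(label, {s: f"{v:.0%}" for s, v in shares.items()})
```

With one position per server, the shares depend entirely on where three hashes happen to land. With 200 positions each, the shares should cluster near one third, and giving the larger server twice as many vnodes should give it roughly twice the keys.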

Algorithm Decision Guide


Nginx vs. HAProxy — The Two Giants

Both are the industry-standard open-source software load balancers. Here's when to reach for each:

| Feature | Nginx | HAProxy |
| --- | --- | --- |
| Primary role | Web server + LB + reverse proxy | Pure load balancer + TCP proxy |
| Static file serving | ✅ Excellent | ❌ Not designed for it |
| L4 TCP balancing | Limited | ✅ Excellent |
| L7 HTTP routing | ✅ Full feature | ✅ Full feature |
| WebSocket support | ✅ Yes | ✅ Yes |
| SSL termination | ✅ Native | ✅ Native |
| Health checks | Basic (passive; active checks require NGINX Plus) | ✅ Advanced (active + passive) |
| Stats dashboard | Via plugin | ✅ Built-in |
| Config syntax | Block-based | Line-based |
| Best for | HTTP workloads, static + dynamic | High-throughput TCP, databases |

High Availability: Making the LB Itself Fault-Tolerant

The Load Balancer itself can become a SPOF. Here's how to eliminate that:


Worked Example: Load Balancing for a URL Shortener

Revisiting our URL Shortener (1,200 writes/sec, 116,000 reads/sec):

Decision summary:

| Layer | Algorithm | Reason |
| --- | --- | --- |
| ALB (HTTP) | Round Robin | Stateless app servers, equal specs |
| Redis Cluster | Consistent Hashing | Minimize cache invalidation on scale-out |
| Cassandra | Consistent Hashing (vnodes) | Even data distribution across nodes |

Interview Cheat Sheet

One-Line Summaries

Round Robin:        Sequential distribution — simple, works for equal servers
Weighted R.R.:      Like Round Robin but servers get proportional traffic by weight
Least Connections:  New request → server with fewest active connections (best for variable latency)
IP Hash:            Same IP always hits same server (session affinity, but inflexible)
Consistent Hashing: Hash ring — add/remove servers with minimal remapping (~1/N keys move)
Layer 4 LB:         Routes on IP+port — fast, protocol-agnostic, no content inspection
Layer 7 LB:         Routes on HTTP content — smart routing, SSL termination, canary deploys

The Three Interview Phrases

ON CHOOSING AN ALGORITHM:
"The app servers are stateless so I'll use Round Robin on the ALB —
 simple and it distributes evenly. For the Redis cluster, I'll use
 Consistent Hashing so that adding a cache node only remaps ~1/N of
 keys instead of invalidating the whole cache."

ON LAYER 4 vs LAYER 7:
"I'd use a Layer 7 load balancer (AWS ALB) for the HTTP API tier
 because I need SSL termination and URL path-based routing to
 different microservices. For the database connection pool, I'd
 use Layer 4 (AWS NLB) since it's raw TCP and I want the lowest
 possible latency."

ON HIGH AVAILABILITY:
"The load balancer itself is a potential SPOF. In AWS I'd use ALB
 which is a managed service with multi-AZ redundancy built in.
 For self-hosted Nginx/HAProxy, I'd run an Active-Passive pair
 with Keepalived and VRRP to fail over a Virtual IP within seconds."

Red Flags vs. Green Flags

| 🔴 Red Flag | 🟢 Green Flag |
| --- | --- |
| Use Round Robin for stateful apps | Use IP Hash or cookie-based sticky sessions for session affinity |
| Forget the LB is a SPOF | Always discuss HA pair or managed LB (ALB) |
| Use a single LB type for everything | L7 for HTTP services, L4 for DB proxies and TCP services |
| Say "consistent hashing" without explaining why | Explain: adding or removing a node remaps only ~1/N of keys |
| Ignore virtual nodes | Know that vnodes fix the uneven distribution problem |
| Mix up L4 and L7 | L4 = IP+port only; L7 = HTTP headers, URL, cookies |

TIP

In a FAANG interview, mentioning consistent hashing with virtual nodes for cache and DB clusters is a strong senior signal. Most candidates only know Round Robin.

IMPORTANT

Always mention health checks. A load balancer without health checks sends traffic to dead servers. Say: "The ALB performs active health checks every 30 seconds on /health. A server failing 3 consecutive checks is removed from the pool."
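
The "3 consecutive failures" rule can be sketched as a small piece of bookkeeping. This is an illustration only: `HealthTracker` is a hypothetical class, the probe itself (for example an HTTP GET on /health every 30 seconds) is left out, and the threshold of 3 is taken from the sentence above.

```python
class HealthTracker:
    """Take a server out of rotation after N consecutive failed checks."""

    def __init__(self, servers, unhealthy_threshold=3):
        self.threshold = unhealthy_threshold
        self.failures = {s: 0 for s in servers}

    def record(self, server, ok):
        # One success resets the failure streak. This is a simplification:
        # real load balancers often require several successes before re-adding.
        self.failures[server] = 0 if ok else self.failures[server] + 1

    def healthy(self):
        return [s for s, n in self.failures.items() if n < self.threshold]

tracker = HealthTracker(["S1", "S2"])
for _ in range(3):
    tracker.record("S2", ok=False)   # three failed probes in a row
print(tracker.healthy())  # ['S1']
```

The routing algorithm then chooses only among `tracker.healthy()`, so a dead server stops receiving traffic without any manual step.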

Released under the ISC License.