Database Scaling — Core Concepts

Interview Relevance: Critical — "How would you scale this to 10× traffic?" is asked in every FAANG interview. Know replication, sharding, and consistent hashing cold.

A single database server has hard limits. At scale, you need strategies to spread load across multiple machines. There are two fundamental dimensions:

Read Scaling → Replication (more readers)
Write Scaling → Sharding / Partitioning (more writers)

The Scaling Ladder

Interview tip: Always go through this ladder. Jumping straight to sharding is over-engineering and a red flag.

Strategy 1 — Master-Slave Replication

How It Works

One node (Master) accepts all writes. Changes are streamed to one or more Slave (Replica) nodes which serve reads only.

Synchronous vs. Asynchronous Replication

Mode	Durability	Write Latency	Use Case
Synchronous	✅ Zero data loss	Higher (waits for replica ACK)	Financial transactions, critical data
Asynchronous	⚠️ Small replication lag	Lower (no wait)	Most web apps, analytics
Semi-synchronous	✅ At least 1 replica ACK	Moderate	Middle ground (MySQL default option)

Failover: What Happens When Master Dies?

Tools for automatic failover:

MySQL: Orchestrator, MHA (Master High Availability)
PostgreSQL: Patroni, repmgr
Redis: Redis Sentinel, Redis Cluster

Master-Slave Trade-offs

✅ PROS:
  - Read throughput scales linearly (add more replicas)
  - Read replicas can serve reporting/analytics without impacting production
  - Simple to set up (built into MySQL, PostgreSQL, MongoDB)
  - Replicas can be used for backups and disaster recovery

❌ CONS:
  - Master is still a single point of failure for WRITES
  - Replication lag → reads from replica may return stale data
  - If master dies and async replicas are behind → potential data loss
  - All write traffic still funnels through one node

Strategy 2 — Multi-Master Replication

Every node can accept both reads and writes. Changes are replicated between all masters.

The Core Problem: Write Conflicts

Conflict Resolution Strategies:

Strategy	How It Works	Risk
Last-Write-Wins (LWW)	Timestamp determines winner	Clock skew → wrong value wins
Application-level	App code detects and merges	Complex logic required
CRDT	Data structures designed for merging	Limited to specific data types
Avoid-conflicts	Route same entity to same master	Partially loses multi-master benefit

Multi-Master is RARELY the right answer in interviews.
Use it only for:
  ✅ Active-active geo-distributed writes (global apps)
  ✅ When regional write autonomy is required
  ✅ When you can accept eventual consistency with conflict resolution

For most systems: Prefer Master-Slave + Sharding instead.

Strategy 3 — Sharding (Horizontal Partitioning)

Split the dataset itself across multiple independent database servers. Each server (shard) owns a subset of the data.

Without Sharding vs. With Sharding

Shard Key Selection (Critical)

The shard key (also called partition key) determines which shard stores a piece of data. This is the most important sharding decision.

Good shard key properties:

✅ High cardinality — many distinct values (user_id, order_id, URL hash)
✅ Evenly distributed — no hot spots
✅ Immutable — never changes (moving data between shards is expensive)
❌ Avoid: timestamp, status, country/region (low cardinality → hot shards)

Consistent Hashing for Sharding

Simple hash(key) % N breaks catastrophically when N changes (add/remove a shard).

Simple Modulo Hashing:
  3 shards: hash("alice") % 3 = 0 → Shard 0
  Add shard: hash("alice") % 4 = 3 → Shard 3  ← MOVED!

  On adding 1 shard to 3: ~75% of all keys move!
  → Massive data migration + cache invalidation

Consistent hashing solves this by arranging servers on a virtual ring:

Consistent Hashing in the Real World

System	Uses Consistent Hashing For
Cassandra	Token ring — each node owns a token range
DynamoDB	Partition key hashed to virtual node
Redis Cluster	16,384 hash slots distributed across nodes
Chord / Dynamo	Academic foundations of modern DHTs
Memcached	Client-side consistent hashing

Sharding Strategies

Range-Based Sharding

✅ Range queries are efficient (e.g., "get all orders from Jan")
✅ Simple to reason about
❌ Hot shards: recent data always hits the last shard
❌ Uneven distribution if data isn't uniformly spread

Hash-Based Sharding

✅ Even distribution by design
✅ No hot shards
❌ Range queries are expensive (must query all shards and merge)
❌ Adding/removing shards causes key redistribution (use consistent hashing!)

Directory-Based Sharding

A lookup table (shard map) stores which shard holds which data.

✅ Maximum flexibility — can move data between shards without changing key scheme
✅ Supports heterogeneous shard sizes
❌ Lookup table is itself a bottleneck and SPOF (must be cached)
❌ Extra network hop per request

The Sharding Challenges

Cross-Shard Joins

Solution: Design your shard key so that data that's queried together is stored together (co-location).

Resharding (Shard Split)

When one shard grows too large, it must be split:

Mitigation: Use consistent hashing from the start — adding nodes only remaps ~1/N keys.

Worked Example: URL Shortener Scaling Journey

Phase 3 Cassandra Details:

Decision	Choice	Reason
Shard key	`short_code` hash	High cardinality, immutable, evenly distributed
Consistency level	`QUORUM` reads/writes	Balance between consistency and availability
Replication factor	3 per DC	Survive 1 node failure without data loss
Virtual nodes	256 vnodes/node	Even data distribution across heterogeneous nodes
Compaction	LeveledCompaction	Optimized for read-heavy redirect workload

Interview Cheat Sheet

One-Line Summaries

Master-Slave:       1 writer, N readers — scale reads linearly
Multi-Master:       N writers — avoids write SPOF, complex conflict resolution
Sharding:           Split data across servers — scale writes and storage
Range Sharding:     Data split by value ranges — great for scans, hot shard risk
Hash Sharding:      Data split by hash(key) — even distribution, no range queries
Directory Sharding: Lookup table maps keys to shards — flexible, extra hop
Consistent Hashing: Hash ring — only 1/N keys remapped when topology changes
Virtual Nodes:      Multiple ring positions per server — even load distribution

The Interview Phrase

"First I'd add read replicas and Redis caching — together these can
 handle 10x read traffic with zero schema changes. If writes become
 the bottleneck, I'd shard on user_id using consistent hashing so
 adding new shards only remaps ~1/N keys and avoids a full data
 migration. Cross-shard joins would be handled by denormalizing and
 co-locating related data on the same shard."

Red Flags vs. Green Flags

🔴 Red Flag	🟢 Green Flag
Jump straight to sharding	Follow the ladder: cache → replicas → sharding
Use `created_at` as shard key	Pick high-cardinality, immutable key (user_id, UUID)
Forget cross-shard join problem	Address it with co-location or denormalization
Say "just add nodes"	Explain consistent hashing to minimize resharding pain
Ignore replication lag	State whether async/sync and consequences for reads
Multi-master for everything	Reserve multi-master for active-active geo-distributed systems

IMPORTANT

In an interview, always explain why you chose the shard key. Say: "I'm sharding on user_id because it's high cardinality, immutable, and ensures all of a user's data is co-located, which avoids cross-shard joins for user-centric queries."

TIP

Mentioning the replication lag problem with Master-Slave and how you'd handle it (e.g., reading from master for critical paths, replica for non-critical) demonstrates real-world experience.

Database Scaling — Core Concepts ​

The Scaling Ladder ​

Strategy 1 — Master-Slave Replication ​

How It Works ​

Synchronous vs. Asynchronous Replication ​

Failover: What Happens When Master Dies? ​

Master-Slave Trade-offs ​

Strategy 2 — Multi-Master Replication ​

The Core Problem: Write Conflicts ​

Strategy 3 — Sharding (Horizontal Partitioning) ​

Without Sharding vs. With Sharding ​

Shard Key Selection (Critical) ​

Consistent Hashing for Sharding ​

Consistent Hashing in the Real World ​

Sharding Strategies ​

Range-Based Sharding ​

Hash-Based Sharding ​

Directory-Based Sharding ​

The Sharding Challenges ​

Cross-Shard Joins ​

Resharding (Shard Split) ​

Worked Example: URL Shortener Scaling Journey ​

Interview Cheat Sheet ​

One-Line Summaries ​

The Interview Phrase ​

Red Flags vs. Green Flags ​