⚡ Level 4 — Caching: Full Explanation
Source files: Overview and CDN Optimization
What is Caching?
Caching is the act of storing the result of an expensive operation in a fast-access layer so that future requests for the same result can be served immediately, without repeating the work.
IMPORTANT
Caching is not about storing data permanently. It is a performance optimization — a shortcut that trades slightly stale data (or memory) for drastically lower latency and reduced backend load.
The core principle is the 80/20 rule (Pareto Principle) applied to data access: as a rule of thumb, roughly 80% of requests ask for the same 20% of the data. If you can serve that hot 20% from a cache, you take roughly 80% of the read traffic off your database.
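To make the arithmetic concrete, the sketch below estimates average read latency and database load for a given hit ratio. The helper name `cacheImpact` and the latency figures (1 ms for the cache, 50 ms for the database) are illustrative assumptions, not measurements from a real system.

```javascript
// Hypothetical helper: expected latency and DB load for a given cache hit ratio.
function cacheImpact(hitRatio, cacheLatencyMs, dbLatencyMs, requestsPerSec) {
  // Weighted average: hits pay the cache latency, misses pay the DB latency
  const avgLatencyMs = hitRatio * cacheLatencyMs + (1 - hitRatio) * dbLatencyMs;
  // Only misses reach the database
  const dbRequestsPerSec = (1 - hitRatio) * requestsPerSec;
  return { avgLatencyMs, dbRequestsPerSec };
}

// Assumed numbers: 80% hit ratio, 1 ms cache, 50 ms DB, 1000 requests/s
const impact = cacheImpact(0.8, 1, 50, 1000);
console.log(impact);
```

Under these assumptions the database sees about 200 of the 1,000 requests per second, and average latency falls from 50 ms to roughly 10.8 ms.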
Part A — Caching Layers
Modern systems have caching at every tier of the stack. A request can be served by a cache before it even reaches your application server.
| Layer | Where It Lives | Latency | Best For |
|---|---|---|---|
| Browser | User's browser memory/disk | ~0ms | Static assets, API responses with proper headers |
| CDN | Edge servers worldwide | ~5–20ms (local POP) | Images, videos, CSS, JS, public API responses |
| Load Balancer | Nginx/HAProxy in-memory | ~0.1ms | Repeated identical requests (e.g., homepage) |
| Application | In-process Map or LRU | ~0ms (RAM) | Config values, rate limit counters, hot data |
| Distributed | Redis / Memcached cluster | ~0.5–2ms | Session data, user profiles, computed results |
| Database | DB engine's buffer pool | ~1–5ms | Frequently read data pages, index pages |
TIP
When a request arrives, your system should check caches from the top (browser) down to the bottom (database). The earlier in the chain you get a hit, the cheaper and faster the response.
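The top-down lookup can be sketched for the tiers an application controls directly. This is a minimal illustration, not a production design: `MapTier` and `tieredGet` are hypothetical names, and plain in-memory Maps stand in for the in-process cache and a distributed cache such as Redis.

```javascript
// Hypothetical tier: anything with async get/set. Maps stand in for real caches.
class MapTier {
  constructor(name) {
    this.name = name;
    this.store = new Map();
  }
  async get(key) {
    return this.store.get(key);
  }
  async set(key, value) {
    this.store.set(key, value);
  }
}

// Check tiers from fastest to slowest; fall back to the source of truth on a full miss.
async function tieredGet(key, tiers, loadFromSource) {
  for (let i = 0; i < tiers.length; i++) {
    const value = await tiers[i].get(key);
    if (value !== undefined) {
      // Backfill the faster tiers that missed so the next read stops earlier
      for (let j = 0; j < i; j++) await tiers[j].set(key, value);
      return { value, servedBy: tiers[i].name };
    }
  }
  const value = await loadFromSource(key);
  for (const tier of tiers) await tier.set(key, value);
  return { value, servedBy: "source" };
}
```

Backfilling faster tiers on a lower-tier hit is one possible design choice; it keeps hot keys close to the application at the cost of extra writes.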
Part B — Cache Reading Strategies
How and when does data get read from or populated into the cache?
4.1 Cache-Aside (Lazy Loading)
The most common pattern. The application manages the cache manually. The cache is populated only when a cache miss occurs — "lazily."
✅ Pros:
- Cache only contains data that was actually requested — no wasted memory.
- Cache failures don't break the app (it falls back to DB).
- Works great for read-heavy workloads.
❌ Cons:
- First request for any key is always slow (cache miss penalty).
- Stale data if the DB is updated without invalidating the cache key.
💻 JS Example: Cache-Aside Pattern
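const redis = require("redis"); // node-redis v4+ API
const db = require("./db");
const cache = redis.createClient({ url: "redis://localhost:6379" });
cache.on("error", (err) => console.error("Redis error:", err));
cache.connect().catch(console.error); // v4+ clients must connect explicitly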
async function getUserById(userId) {
const cacheKey = `user:${userId}`;
// ── Step 1: Check the cache first ──
const cached = await cache.get(cacheKey);
if (cached) {
console.log("⚡ Cache HIT — returned in <1ms");
return JSON.parse(cached);
}
// ── Step 2: Cache miss — go to the database ──
console.log("🐢 Cache MISS — querying database...");
const user = await db.query("SELECT * FROM users WHERE id = $1", [userId]);
if (!user.rows[0]) return null;
// ── Step 3: Populate cache for next time (TTL = 1 hour) ──
await cache.setEx(cacheKey, 3600, JSON.stringify(user.rows[0]));
return user.rows[0];
}
// ── On write: invalidate the cache key ──
async function updateUser(userId, updates) {
await db.query("UPDATE users SET name = $1 WHERE id = $2", [
updates.name,
userId,
]);
// Delete stale cache entry so the next read fetches fresh data
await cache.del(`user:${userId}`);
}
4.2 Read-Through Cache
The cache itself is responsible for fetching from the database on a miss. The app only ever talks to the cache — the cache acts as a transparent proxy.
✅ Pros: Simpler application code — the app doesn't need to know about the DB. ❌ Cons: Cache misses are still slow; initial warm-up requires real traffic.
NOTE
Cache-Aside vs Read-Through: In Cache-Aside, the app holds all the logic. In Read-Through, the cache library handles the miss and the DB call. Libraries like node-cache-manager support read-through out of the box.
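A read-through cache can be sketched as a small wrapper that owns the loader function, so callers never touch the database directly. This is a minimal in-memory illustration under stated assumptions: `ReadThroughCache` and its `loader` parameter are hypothetical names, and a real deployment would back it with Redis or a caching library.

```javascript
// Hypothetical read-through wrapper: the cache, not the caller, invokes the loader on a miss.
class ReadThroughCache {
  constructor(loader, ttlMs = 60000) {
    this.loader = loader; // async (key) => value, e.g. a DB query
    this.ttlMs = ttlMs;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  async get(key) {
    const entry = this.entries.get(key);
    if (entry && entry.expiresAt > Date.now()) return entry.value; // hit
    // Miss or expired: the cache itself fetches and stores the value
    const value = await this.loader(key);
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
    return value;
  }
}

// Usage: the loader here is a stand-in for a real database call
const users = new ReadThroughCache(async (id) => ({ id, name: `user-${id}` }));
users.get(42).then((user) => console.log(user.name)); // → user-42
```

The application code calls only `users.get(...)`; whether the value came from memory or from the loader is invisible to it.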
Part C — Cache Writing Strategies
How does data get written to both the cache and the database?
4.3 Write-Through Cache
Every write goes to the cache AND the database at the same time before the operation is considered complete.
✅ Pros: Cache is always consistent with the DB. No stale reads after writes. ❌ Cons: Write latency increases (must wait for both). The cache fills with data that may never be read (cache pollution).
Best for: Systems where stale reads are unacceptable (e.g., financial balances, inventory counts).
💻 JS Example: Write-Through Pattern
async function writeThrough(key, value, ttlSeconds) {
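// Write to the DB first, then the cache: if the DB write throws, the cache is never updated.
// If the cache write fails afterwards, delete or retry the key so it cannot serve stale data.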
await db.query(
"INSERT INTO kv_store (k, v) VALUES ($1, $2) ON CONFLICT (k) DO UPDATE SET v = $2",
[key, value]
);
await cache.setEx(key, ttlSeconds, JSON.stringify(value));
console.log(`✅ Write-through complete for key: ${key}`);
}
4.4 Write-Behind (Write-Back) Cache
Writes go to the cache immediately and the database is updated asynchronously later (in batches).
✅ Pros: Extremely fast writes (just a cache set). DB writes can be batched for efficiency. ❌ Cons: Risk of data loss if the cache crashes before flushing. Complexity of the async queue.
Best for: High write-throughput, loss-tolerant data — counters, analytics events, view counts.
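The write-behind flow can be sketched with an in-memory buffer of "dirty" keys that is flushed in batches. This is a simplified illustration, not a durable implementation: `WriteBehindBuffer` and the `flushBatch` callback are hypothetical names, and anything still in the buffer is lost if the process crashes, which is exactly the risk described above.

```javascript
// Hypothetical write-behind buffer: writes hit memory now, the DB later in batches.
class WriteBehindBuffer {
  constructor(flushBatch) {
    this.cache = new Map(); // serves reads immediately
    this.dirty = new Map(); // keys changed since the last flush (last write wins)
    this.flushBatch = flushBatch; // async (entries) => void, e.g. a bulk DB upsert
  }

  set(key, value) {
    this.cache.set(key, value);
    this.dirty.set(key, value);
  }

  get(key) {
    return this.cache.get(key);
  }

  // Persist all pending writes as one batch; returns how many keys were flushed.
  async flush() {
    if (this.dirty.size === 0) return 0;
    const batch = [...this.dirty.entries()];
    this.dirty = new Map();
    try {
      await this.flushBatch(batch);
      return batch.length;
    } catch (err) {
      // Re-queue failed entries unless a newer write for the key arrived meanwhile
      for (const [k, v] of batch) if (!this.dirty.has(k)) this.dirty.set(k, v);
      throw err;
    }
  }
}
```

In practice `flush()` would run on a timer or when the buffer reaches a size threshold; several writes to the same key between flushes collapse into a single database write.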
4.5 Write-Around Cache
Writes go directly to the database, bypassing the cache entirely. The cache is only populated on a future read miss.
✅ Pros: Prevents the cache from being flooded with write-once data that won't be re-read. ❌ Cons: First read after write is always a cache miss.
Best for: Write-heavy data that's rarely re-read (e.g., logs, batch import jobs).
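A minimal sketch of write-around, assuming two in-memory Maps as stand-ins for the database and the cache (`WriteAroundStore` is a hypothetical name). Writes touch only the database; the cache is filled lazily by the first read.

```javascript
// Hypothetical write-around store: Maps stand in for the real DB and cache.
class WriteAroundStore {
  constructor() {
    this.db = new Map();
    this.cache = new Map();
  }

  // Writes bypass the cache entirely
  write(key, value) {
    this.db.set(key, value);
  }

  // Reads populate the cache on a miss (same as cache-aside)
  read(key) {
    if (this.cache.has(key)) return { value: this.cache.get(key), hit: true };
    const value = this.db.get(key);
    if (value !== undefined) this.cache.set(key, value);
    return { value, hit: false };
  }
}

const store = new WriteAroundStore();
store.write("log:1", "imported row");
console.log(store.read("log:1").hit); // → false (first read after write misses)
console.log(store.read("log:1").hit); // → true
```

Note that this sketch never touches an already-cached key on write, so a key cached before an update stays stale until it expires or is invalidated; real systems usually pair write-around with a TTL or an explicit delete.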
Strategy Comparison
| Strategy | Write Latency | Read after Write | Data Loss Risk | Best For |
|---|---|---|---|---|
| Cache-Aside | Fast (DB only) | Stale until TTL | None | Read-heavy; general purpose |
| Read-Through | Fast (DB only) | Stale until TTL | None | Simplifying app code |
| Write-Through | Slow (cache + DB) | Always fresh | None | Consistency critical |
| Write-Behind | Very fast (cache) | Fresh | ⚠️ If cache crashes | High-throughput writes |
| Write-Around | Fast (DB only) | Miss on first read | None | Write-once, rarely-read data |
Part D — Cache Eviction Policies
When a cache is full, which key gets removed to make room for the new one?
Eviction Policy Breakdown
| Policy | Full Name | How It Decides | Real-World Analogy |
|---|---|---|---|
| LRU | Least Recently Used | Evict the key not accessed for the longest time | Throw out the book you haven't opened in the longest time |
| LFU | Least Frequently Used | Evict the key accessed the fewest total times | Remove the least-watched movie |
| FIFO | First In, First Out | Evict the oldest inserted key | Queue at a store — first in, first served (out) |
| Random | Random | Evict a random key | Blindly pull a card from a deck |
| TTL | Time-To-Live | Evict keys whose expiry time has passed | Food with an expiration date |
| MRU | Most Recently Used | Evict the most recently accessed key | Niche: useful for "once accessed, rarely again" patterns |
TIP
Redis eviction policies are configured via maxmemory-policy. Common choices:
- allkeys-lru: Evict any LRU key (recommended for general caching)
- volatile-lru: Only evict keys with a TTL set (protects permanent keys)
- allkeys-lfu: LFU across all keys (better for very skewed access patterns)
- noeviction: Return errors when full (use when data loss is unacceptable)
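For contrast with LRU, the LFU row in the table above can be sketched as follows. This is a deliberately simple version that scans all entries to find the victim (O(n) per eviction); `LFUCache` is a hypothetical name, and this is not how Redis implements its LFU policies.

```javascript
// Hypothetical LFU cache: evicts the key with the fewest accesses.
class LFUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.entries = new Map(); // key -> { value, count }
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    entry.count += 1; // every read raises the key's frequency
    return entry.value;
  }

  put(key, value) {
    if (this.capacity <= 0) return;
    const existing = this.entries.get(key);
    if (existing) {
      existing.value = value;
      existing.count += 1;
      return;
    }
    if (this.entries.size >= this.capacity) {
      // Linear scan for the lowest count; ties go to the oldest inserted key
      let victim;
      let min = Infinity;
      for (const [k, e] of this.entries) {
        if (e.count < min) {
          min = e.count;
          victim = k;
        }
      }
      this.entries.delete(victim);
    }
    this.entries.set(key, { value, count: 1 });
  }
}

const lfu = new LFUCache(2);
lfu.put("a", 1);
lfu.put("b", 2);
lfu.get("a"); // "a" now has a higher count than "b"
lfu.put("c", 3); // evicts "b", the least frequently used key
console.log(lfu.get("b")); // → undefined
```

Unlike LRU, a key that was popular long ago can linger here indefinitely, which is why practical LFU variants usually decay counts over time.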
💻 JS Example: LRU Cache Implementation
class LRUCache {
constructor(capacity) {
this.capacity = capacity;
this.cache = new Map(); // Map preserves insertion order in JS
}
get(key) {
if (!this.cache.has(key)) return -1;
// Refresh to "most recently used": delete and re-insert at tail
const value = this.cache.get(key);
this.cache.delete(key);
this.cache.set(key, value);
return value;
}
put(key, value) {
if (this.cache.has(key)) {
this.cache.delete(key); // Remove old position
} else if (this.cache.size >= this.capacity) {
// Evict the LRU key — the first entry in Map (oldest)
const lruKey = this.cache.keys().next().value;
this.cache.delete(lruKey);
console.log(`🗑️ Evicted LRU key: ${lruKey}`);
}
this.cache.set(key, value);
}
}
// ── Usage ──
const cache = new LRUCache(3);
cache.put("a", 1); // Cache: a
cache.put("b", 2); // Cache: a, b
cache.put("c", 3); // Cache: a, b, c
cache.get("a"); // Access "a" → moves to end: b, c, a
cache.put("d", 4); // "b" evicted (LRU). Cache: c, a, d
Part E — Cache Pitfalls & How to Solve Them
4.6 Cache Stampede (Thundering Herd)
A hot cache key expires. Simultaneously, thousands of requests pour in — all miss the cache and all query the database at once, potentially crashing it.
Solutions:
Solution 1: Mutex Lock (Prevents Duplicate DB Calls)
const locks = new Map();
async function getWithMutex(key, fetchFn, ttl) {
// Check cache first
const cached = await cache.get(key);
if (cached) return JSON.parse(cached);
// If another request is already fetching, wait and retry
if (locks.has(key)) {
await new Promise((resolve) => setTimeout(resolve, 50));
return getWithMutex(key, fetchFn, ttl); // Retry
}
// We are the first: acquire the lock
locks.set(key, true);
try {
const data = await fetchFn(); // Only ONE call to the DB
await cache.setEx(key, ttl, JSON.stringify(data));
return data;
} finally {
locks.delete(key); // Always release the lock
}
}
Solution 2: Probabilistic Early Expiration (XFetch)
Recompute the cache value slightly before it actually expires, so expiry never causes a stampede:
async function getWithEarlyExpiry(key, fetchFn, ttl, beta = 1) {
const stored = await cache.get(key);
if (stored) {
const { value, expiry, computeTime } = JSON.parse(stored);
const now = Date.now() / 1000;
// XFetch formula: recompute early if randomized condition is met
const shouldRecompute =
now - beta * computeTime * Math.log(Math.random()) >= expiry;
if (!shouldRecompute) return value;
}
const start = Date.now();
const freshData = await fetchFn();
const computeTime = (Date.now() - start) / 1000;
await cache.setEx(
key,
ttl,
JSON.stringify({
value: freshData,
expiry: Date.now() / 1000 + ttl,
computeTime,
})
);
return freshData;
}
4.7 Cache Penetration
Requests for keys that never exist in the database (e.g., user:9999999) will always miss the cache and hammer the DB.
Solution: Cache Null Results + Bloom Filter
// ── Solution 1: Cache the null result with a short TTL ──
async function getUserSafe(userId) {
const cacheKey = `user:${userId}`;
const cached = await cache.get(cacheKey);
if (cached !== null) {
// The sentinel string "NULL" means "DB confirmed this doesn't exist"
return cached === "NULL" ? null : JSON.parse(cached);
}
const user = await db.query("SELECT * FROM users WHERE id = $1", [userId]);
if (!user.rows[0]) {
// Cache the null — so subsequent requests for id=9999999 don't hit DB
await cache.setEx(cacheKey, 60, "NULL"); // Short 60s TTL
return null;
}
await cache.setEx(cacheKey, 3600, JSON.stringify(user.rows[0]));
return user.rows[0];
}
NOTE
A Bloom filter is a probabilistic data structure that answers "is this key definitely NOT present?" in constant time. It has no false negatives, but it can return false positives, so a "maybe present" answer still needs the normal cache and DB lookup. Redis (via the RedisBloom module) can run this check before any cache or DB lookup and reject IDs that cannot exist.
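To show the idea behind the note above, here is a small in-process Bloom filter. It is a sketch under stated assumptions: `BloomFilter`, the 1024-bit size, and the 3 hash functions are illustrative choices, and the bit positions are derived from one SHA-256 digest via double hashing rather than from any particular library's scheme.

```javascript
const crypto = require("crypto");

// Hypothetical Bloom filter: "false" means definitely absent, "true" means maybe present.
class BloomFilter {
  constructor(bitCount = 1024, hashCount = 3) {
    this.bitCount = bitCount;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(Math.ceil(bitCount / 8));
  }

  // Derive hashCount bit positions from two 32-bit slices of one digest
  positions(key) {
    const digest = crypto.createHash("sha256").update(String(key)).digest();
    const h1 = digest.readUInt32BE(0);
    const h2 = digest.readUInt32BE(4);
    const out = [];
    for (let i = 0; i < this.hashCount; i++) {
      out.push((h1 + i * h2) % this.bitCount);
    }
    return out;
  }

  add(key) {
    for (const p of this.positions(key)) this.bits[p >> 3] |= 1 << (p & 7);
  }

  mightContain(key) {
    return this.positions(key).every(
      (p) => (this.bits[p >> 3] & (1 << (p & 7))) !== 0
    );
  }
}

const knownUsers = new BloomFilter();
knownUsers.add("user:1");
console.log(knownUsers.mightContain("user:1")); // → true (added keys are never reported absent)
```

A request handler could call `mightContain` first and return "not found" immediately on `false`, skipping both the cache and the database. The filter must be updated whenever a new ID is created, otherwise valid keys would be rejected.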
4.8 Cache Avalanche
Many cache keys expire at the same time (e.g., after a cache restart), causing a mass wave of DB hits simultaneously.
Solution: Jittered TTLs
Instead of all keys expiring at TTL = 3600, add random jitter so expirations are spread out:
function getJitteredTTL(baseTTL, jitterPercent = 0.2) {
// e.g., base = 3600s, jitter ±20% → TTL is between 2880s and 4320s
const jitter = baseTTL * jitterPercent;
return Math.floor(baseTTL + (Math.random() * 2 - 1) * jitter);
}
async function setWithJitter(key, value, baseTTL) {
const ttl = getJitteredTTL(baseTTL);
await cache.setEx(key, ttl, JSON.stringify(value));
console.log(`Set ${key} with TTL: ${ttl}s`);
}
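// Example: 1000 keys set with a base TTL of 1 hour
// → expirations are spread across 48 min to 72 min, not all at 60 min
// (productData is assumed to be an array of product records loaded elsewhere)
async function warmProductCache(productData) {
for (let i = 0; i < productData.length; i++) {
await setWithJitter(`product:${i}`, productData[i], 3600);
}
}
4.9 Cache Consistency — The Invalidation Problem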
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
When the source of truth (the database) changes, the cache must be updated or invalidated, or clients get stale data.
Invalidation Strategies:
| Strategy | How | Pros | Cons |
|---|---|---|---|
| Delete on write | DEL key after every DB write | Always fresh on next read | One extra miss per write |
| Update on write | SET key newValue after DB write | No miss penalty | Cache & DB write must be atomic |
| TTL-based | Let keys expire naturally | Simple | Stale for up to TTL duration |
| Event-driven | DB publishes change events, app listens and invalidates | Decoupled, scalable | Complex infrastructure (CDC, Kafka) |
💻 JS Example: Event-Driven Invalidation with Redis Pub/Sub
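// Assumes two connected node-redis clients, redisPublisher and redisSubscriber.
// They must be separate: a client in subscriber mode cannot issue regular commands.
// ── Publisher (inside your write API) ──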
async function updateUserAndPublish(userId, updates) {
await db.query("UPDATE users SET name = $1 WHERE id = $2", [
updates.name,
userId,
]);
// Notify all app servers to invalidate their caches
await redisPublisher.publish(
"cache:invalidate",
JSON.stringify({ key: `user:${userId}` })
);
}
// ── Subscriber (runs on every app server) ──
redisSubscriber.subscribe("cache:invalidate", (message) => {
const { key } = JSON.parse(message);
cache.del(key);
console.log(`🗑️ Cache invalidated for key: ${key}`);
});
Part F — Distributed Caching
When a single cache server can't hold all data or handle all traffic, you need distributed caching — spreading keys across multiple cache nodes.
4.10 Consistent Hashing
The challenge: if you have 3 cache nodes and add a 4th, a naive hash(key) % N approach remaps most keys (about 75% when going from 3 to 4 nodes), which is effectively a mass cache invalidation. Consistent hashing minimizes remapping.
The key insight: when you add or remove a node, only the keys in its adjacent zone of the ring need to be remapped, not all keys. Going from 3 nodes to 4, only ~25% of keys move (versus ~75% with modulo hashing).
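To quantify how disruptive the naive modulo scheme is, the sketch below counts how many keys change nodes when the node count changes. It assumes hash values are uniformly distributed and uses consecutive integers as stand-ins for them; `fractionMoved` is a hypothetical helper name.

```javascript
// Fraction of keys whose node changes under hash % N when N changes.
// Integers 0..keyCount-1 stand in for uniformly distributed hash values.
function fractionMoved(keyCount, oldNodes, newNodes) {
  let moved = 0;
  for (let h = 0; h < keyCount; h++) {
    if (h % oldNodes !== h % newNodes) moved++;
  }
  return moved / keyCount;
}

console.log(fractionMoved(12000, 3, 4)); // → 0.75
```

With modulo placement, three out of four keys land on a different node after adding the fourth one. A consistent hash ring aims to move only about 1/N of the keys, roughly a quarter in this case.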
💻 JS Example: Consistent Hashing (Simplified)
const crypto = require("crypto");
class ConsistentHashRing {
constructor(nodes = [], replicas = 150) {
this.replicas = replicas; // Virtual nodes per real node
this.ring = new Map();
this.sortedKeys = [];
nodes.forEach((node) => this.addNode(node));
}
hash(key) {
return parseInt(
crypto.createHash("md5").update(key).digest("hex").slice(0, 8),
16
);
}
addNode(node) {
for (let i = 0; i < this.replicas; i++) {
const virtualKey = this.hash(`${node}:${i}`);
this.ring.set(virtualKey, node);
this.sortedKeys.push(virtualKey);
}
this.sortedKeys.sort((a, b) => a - b);
}
getNode(key) {
const hash = this.hash(key);
// Walk the ring clockwise to find the first node >= hash
for (const ringKey of this.sortedKeys) {
if (hash <= ringKey) return this.ring.get(ringKey);
}
// Wrap around to the first node
return this.ring.get(this.sortedKeys[0]);
}
}
const ring = new ConsistentHashRing([
"redis-node-1",
"redis-node-2",
"redis-node-3",
]);
// The node names printed below are examples; actual placement depends on the hash values
console.log(ring.getNode("user:42")); // → e.g. redis-node-2
console.log(ring.getNode("product:99")); // → e.g. redis-node-1
console.log(ring.getNode("session:abc")); // → e.g. redis-node-3
// Add a 4th node: only ~25% of keys are remapped
ring.addNode("redis-node-4");
console.log(ring.getNode("user:42")); // → might now be redis-node-4
Part G — Redis Deep Dive
Redis is the industry-standard distributed cache. It is far more than a simple key-value store.
4.11 Redis Data Structures
| Data Structure | Redis Command | Use Case |
|---|---|---|
| String | SET / GET | Session tokens, simple counters, serialized JSON |
| Hash | HSET / HGET | User objects with many fields (avoid JSON re-serialization) |
| List | LPUSH / RPOP | Activity feeds, message queues, recent history |
| Set | SADD / SMEMBERS | Unique visitor tracking, tag systems |
| Sorted Set | ZADD / ZRANGE | Leaderboards, priority queues, rate limiting |
| HyperLogLog | PFADD / PFCOUNT | Approximate unique count (e.g., daily active users) |
| Pub/Sub | PUBLISH / SUBSCRIBE | Real-time notifications, cache invalidation broadcast |
| Stream | XADD / XREAD | Append-only log, event sourcing |
💻 JS Example: Redis Sorted Set for a Leaderboard
const redis = require("redis");
const client = redis.createClient();
client.connect().catch(console.error); // node-redis v4+ requires an explicit connect()
async function updateScore(userId, score) {
// ZADD: Add userId with their score to the "game:leaderboard" sorted set
await client.zAdd("game:leaderboard", { score, value: `user:${userId}` });
}
async function getTopPlayers(count = 10) {
// ZRANGE with REV: get the top N players, highest score first, with their scores
const leaders = await client.zRangeWithScores(
"game:leaderboard",
0,
count - 1,
{ REV: true }
);
return leaders.map((entry, i) => ({
rank: i + 1,
userId: entry.value.replace("user:", ""),
score: entry.score,
}));
}
async function getUserRank(userId) {
// ZREVRANK: Get the 0-based rank of a user (reversed = highest first)
const rank = await client.zRevRank("game:leaderboard", `user:${userId}`);
return rank !== null ? rank + 1 : null; // 1-indexed
}
// ── Demo ──
await updateScore("alice", 9500);
await updateScore("bob", 8200);
await updateScore("carol", 9800);
console.log(await getTopPlayers(3));
// → [ { rank: 1, userId: "carol", score: 9800 },
// { rank: 2, userId: "alice", score: 9500 },
// { rank: 3, userId: "bob", score: 8200 } ]
console.log(await getUserRank("alice")); // → 2
4.12 Redis vs Memcached
| Feature | Redis | Memcached |
|---|---|---|
| Data Types | Rich: Strings, Hashes, Lists, Sets, Sorted Sets, HyperLogLog, Streams | Strings only |
| Persistence | Optional (RDB snapshots, AOF log) | ❌ In-memory only |
| Replication | ✅ Primary-Replica | ❌ No |
| Clustering | ✅ Redis Cluster (16,384 hash slots) | Limited (client-side sharding) |
| Pub/Sub | ✅ Built-in | ❌ No |
| Lua Scripting | ✅ Atomic server-side scripts | ❌ No |
| Memory Efficiency | Good | Slightly better for pure string caching |
| Best For | Leaderboards, sessions, queues, pub/sub, rate limiting | Simple, high-throughput string/object caching |
NOTE
In practice, Redis has won. Memcached is simpler and marginally more memory-efficient for pure string caching, but Redis's richer feature set means almost all new systems choose Redis. AWS ElastiCache and Google Cloud Memorystore both support Redis.
Part H — CDN as a Cache
A Content Delivery Network is a globally distributed cache layer for static and semi-static content. See CDN Optimization for the full deep dive.
How CDN caching works:
- First user in Tokyo requests logo.png → CDN misses → fetches from Virginia origin → caches at Tokyo edge.
- Every subsequent Tokyo user gets logo.png from the Tokyo edge in ~5ms instead of ~200ms.
- Cache is controlled by Cache-Control headers from your origin server.
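How a shared cache such as a CDN edge might act on those headers can be sketched as a small freshness check. This is a simplified model under stated assumptions: `parseCacheControl` and `classify` are hypothetical helpers, and only `max-age`, `stale-while-revalidate`, `private`, and `no-store` are handled (directives such as `s-maxage` and `no-cache` are ignored).

```javascript
// Parse "public, max-age=300, stale-while-revalidate=60" into a directive map
function parseCacheControl(header) {
  const directives = {};
  for (const part of header.split(",")) {
    const [name, value] = part.trim().split("=");
    if (name) {
      directives[name.toLowerCase()] = value === undefined ? true : Number(value);
    }
  }
  return directives;
}

// Decide what a shared cache may do with a stored response of the given age
function classify(ageSeconds, header) {
  const d = parseCacheControl(header);
  if (d["no-store"] || d["private"]) return "do-not-cache";
  const maxAge = typeof d["max-age"] === "number" ? d["max-age"] : 0;
  const swr =
    typeof d["stale-while-revalidate"] === "number"
      ? d["stale-while-revalidate"]
      : 0;
  if (ageSeconds < maxAge) return "fresh"; // serve from cache
  if (ageSeconds < maxAge + swr) return "stale-while-revalidate"; // serve stale, refresh in background
  return "expired"; // must go back to the origin
}

const header = "public, max-age=300, stale-while-revalidate=60";
console.log(classify(100, header)); // → fresh
console.log(classify(330, header)); // → stale-while-revalidate
console.log(classify(400, header)); // → expired
```

The three outcomes map onto the Tokyo example: a fresh entry is served from the edge, a stale-while-revalidate entry is served from the edge while the edge refetches from the origin, and an expired entry forces a trip to the origin.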
HTTP Cache-Control Headers
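// Express.js: Set proper cache headers on your responses
// (products, dataVersion, and currentUser are assumed to be defined elsewhere)
const express = require("express");
const app = express();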
// ── Static assets (never change — fingerprinted filenames) ──
app.use(
"/static",
express.static("public", {
maxAge: "1y", // Cache for 1 year
immutable: true, // Tell browser: "This will never change"
// Cache-Control: public, max-age=31536000, immutable
})
);
// ── API response that can be cached for 5 minutes ──
app.get("/api/products", (req, res) => {
res.set({
"Cache-Control": "public, max-age=300, stale-while-revalidate=60",
// ^^^^^^ ^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^
// Anyone can cache | 5 min fresh | Serve stale for 60s while refreshing
ETag: `"products-v${dataVersion}"`, // For conditional requests
});
res.json(products);
});
// ── Private user data — never CDN cache ──
app.get("/api/user/me", (req, res) => {
res.set("Cache-Control", "private, no-store");
res.json(currentUser);
});
Part I — Real-World Caching Architecture
4.13 Full-Stack Caching: E-commerce Example
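A large e-commerce site can apply caching at every tier for a product page: the CDN answers most requests, Redis serves the product data that misses the CDN, and the database handles only what remains.
Illustrative result for a product page that gets 1 million views/day: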
- ~95% served from CDN (zero app server cost)
- ~4% served from Redis (sub-millisecond)
- ~1% hits the database (after initial warm-up)
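The percentages above translate into per-tier request counts as follows. The sketch assumes one lookup per page view and traffic spread evenly over the day, both of which are simplifications; `tierBreakdown` is a hypothetical helper name.

```javascript
// Split daily page views across tiers by their share of traffic
function tierBreakdown(viewsPerDay, shares) {
  const result = {};
  for (const [tier, share] of Object.entries(shares)) {
    const perDay = Math.round(viewsPerDay * share);
    // 86,400 seconds per day; this is an average, real traffic has peaks
    result[tier] = { perDay, avgPerSec: perDay / 86400 };
  }
  return result;
}

const breakdown = tierBreakdown(1000000, {
  cdn: 0.95,
  redis: 0.04,
  database: 0.01,
});
console.log(breakdown.cdn.perDay); // → 950000
console.log(breakdown.redis.perDay); // → 40000
console.log(breakdown.database.perDay); // → 10000
```

Under these assumptions the database handles about 10,000 lookups per day, which averages to roughly one query every 9 seconds instead of about 12 per second without any caching.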
Part J — Cache Sizing & Monitoring
4.14 How to Size Your Cache
Use the 80/20 working set rule:
Working Set Size = Total Data Size × 0.20
Example: If your users table is 50GB total, the "hot" 20% that gets queried most is ~10GB. Size your Redis instance to hold at least that.
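The sizing rule can be turned into a small calculation. This sketch adds one assumption beyond the formula: it leaves headroom so memory usage stays under a target utilization (80% here, matching the memory target in the metrics table of this guide). `recommendedCacheSizeGB` is a hypothetical helper, and per-key overhead and serialization cost are ignored.

```javascript
// Estimate the hot working set and a provisioned size that leaves headroom
function recommendedCacheSizeGB(totalDataGB, hotFraction = 0.2, maxUtilization = 0.8) {
  const workingSetGB = totalDataGB * hotFraction;
  // Provision more than the working set so usage stays below maxUtilization
  const provisionedGB = workingSetGB / maxUtilization;
  return { workingSetGB, provisionedGB };
}

const sizing = recommendedCacheSizeGB(50);
console.log(sizing);
```

For the 50GB example this gives a working set of about 10GB and a provisioned size of about 12.5GB.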
// Monitor cache efficiency with hit ratio
async function getCacheStats() {
const info = await cache.info("stats"); // "cache" is the connected client; "redis" is only the module
const lines = info.split("\r\n");
const hits = parseInt(
lines.find((l) => l.startsWith("keyspace_hits"))?.split(":")[1] || "0"
);
const misses = parseInt(
lines.find((l) => l.startsWith("keyspace_misses"))?.split(":")[1] || "0"
);
const total = hits + misses;
const hitRatio = total > 0 ? hits / total : 1; // no traffic yet: avoid NaN and treat as healthy
console.log(`Cache Hit Ratio: ${(hitRatio * 100).toFixed(2)}%`);
// Alert if hit ratio drops below 90% — cache may be too small or TTLs too short
if (hitRatio < 0.9) {
console.warn(
"⚠️ Low cache hit ratio — consider increasing cache size or TTL"
);
}
return { hits, misses, hitRatio };
}
4.15 Key Metrics to Monitor
| Metric | What It Means | Healthy Target |
|---|---|---|
| Hit Rate | hits / (hits + misses) | > 90% |
| Memory Usage | used_memory / maxmemory | < 80% (leave headroom) |
| Eviction Rate | Keys evicted per second | Should be near 0 |
| Command Latency | p99 latency for GET/SET | < 1ms |
| Connected Clients | Number of open connections | Within configured limit |
| Replication Lag | Replica behind primary | < 10ms |
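Several of these targets can be checked from the text that the Redis INFO command returns. The sketch below parses that text and flags three of the metrics; it runs against a sample string, so no Redis server is needed. `parseRedisInfo` and `checkHealth` are hypothetical helpers, and note that `evicted_keys` is a cumulative counter, so a real monitor would compare successive samples to get a rate.

```javascript
// Turn "key:value" lines from INFO into an object, skipping section headers
function parseRedisInfo(infoText) {
  const fields = {};
  for (const line of infoText.split(/\r?\n/)) {
    if (!line || line.startsWith("#")) continue;
    const idx = line.indexOf(":");
    if (idx > 0) fields[line.slice(0, idx)] = line.slice(idx + 1);
  }
  return fields;
}

// Compare parsed fields against the targets from the table above
function checkHealth(fields) {
  const warnings = [];
  const used = Number(fields.used_memory);
  const max = Number(fields.maxmemory);
  // maxmemory of 0 means "no limit", so the ratio is undefined in that case
  if (max > 0 && used / max >= 0.8) warnings.push("memory usage at or above 80%");
  const hits = Number(fields.keyspace_hits || 0);
  const misses = Number(fields.keyspace_misses || 0);
  const total = hits + misses;
  if (total > 0 && hits / total <= 0.9) warnings.push("hit rate at or below 90%");
  if (Number(fields.evicted_keys || 0) > 0) warnings.push("keys have been evicted");
  return warnings;
}

// Sample INFO output (made-up values for illustration)
const sample =
  "# Memory\r\nused_memory:900\r\nmaxmemory:1000\r\n" +
  "# Stats\r\nkeyspace_hits:50\r\nkeyspace_misses:50\r\nevicted_keys:3\r\n";
console.log(checkHealth(parseRedisInfo(sample)));
```

With the sample values, all three checks fire: memory is at 90% of the limit, the hit rate is 50%, and evictions have occurred.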
The Golden Rules of Caching
| # | Principle | Why |
|---|---|---|
| 1 | Cache reads, not writes | Reads are 10–100x more common; caching writes adds complexity with little gain |
| 2 | Always set a TTL | Without expiry, stale data lives forever and you eventually run out of memory |
| 3 | Jitter your TTLs | Prevents the Cache Avalanche when many keys were set at the same time |
| 4 | Invalidate on write | Delete the cache key immediately after a DB update — don't wait for TTL |
| 5 | Cache at the right layer | Use the CDN for public data, Redis for private data, browser cache for assets |
| 6 | Monitor your hit rate | A hit rate < 90% means your cache is not doing its job — investigate |
| 7 | Don't cache everything | Write-once, never-read data wastes memory and can push hot keys out of the cache |
✅ Checklist Before Moving On
- [ ] I can explain Cache-Aside, Write-Through, and Write-Behind with trade-offs
- [ ] I know what LRU, LFU, and TTL eviction policies do and when to use each
- [ ] I can describe Cache Stampede, Cache Penetration, and Cache Avalanche — and their fixes
- [ ] I understand how Consistent Hashing distributes keys across cache nodes
- [ ] I know the difference between Redis and Memcached
- [ ] I can set correct Cache-Control headers for static assets vs private API data
- [ ] I know how to monitor a cache (hit rate, memory, eviction rate)
NOTE
Next in the series: Level 5 — Messaging & Queues covers Kafka, RabbitMQ, message patterns, and how async queues decouple services.
