⚡ Level 4 — Caching: Full Explanation

Source files: Overview and CDN Optimization


What is Caching?

Caching is the act of storing the result of an expensive operation in a fast-access layer so that future requests for the same result can be served immediately, without repeating the work.

IMPORTANT

Caching is not about storing data permanently. It is a performance optimization — a shortcut that trades slightly stale data (or memory) for drastically lower latency and reduced backend load.

The core principle is the 80/20 rule (Pareto Principle) applied to data access: roughly 80% of all requests ask for the same 20% of data. If you can serve that hot 20% from a cache, you remove roughly 80% of the read traffic that would otherwise reach your database.
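To make the trade-off concrete, the sketch below computes the average latency and the remaining database load for a given hit ratio. The function names and the latency figures (1 ms for the cache, 50 ms for the database) are illustrative assumptions, not measurements from this guide.

```javascript
// Hypothetical helper: average latency seen by callers for a given hit ratio.
// cacheMs and dbMs are assumed latencies for a cache hit and a database read.
function effectiveLatencyMs(hitRatio, cacheMs, dbMs) {
  return hitRatio * cacheMs + (1 - hitRatio) * dbMs;
}

// Fraction of requests that still reach the database.
function dbLoadFraction(hitRatio) {
  return 1 - hitRatio;
}

// With an 80% hit ratio, one request in five still pays the database cost.
console.log(effectiveLatencyMs(0.8, 1, 50).toFixed(1), "ms on average");
console.log((dbLoadFraction(0.8) * 100).toFixed(0) + "% of requests reach the DB");
```

Even a modest hit ratio shifts the average sharply, because the slow path dominates the sum.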


Part A — Caching Layers

Modern systems have caching at every tier of the stack. A request can be served by a cache before it even reaches your application server.

| Layer | Where It Lives | Latency | Best For |
| --- | --- | --- | --- |
| Browser | User's browser memory/disk | ~0ms | Static assets, API responses with proper headers |
| CDN | Edge servers worldwide | ~5–20ms (local POP) | Images, videos, CSS, JS, public API responses |
| Load Balancer | Nginx/HAProxy in-memory | ~0.1ms | Repeated identical requests (e.g., homepage) |
| Application | In-process Map or LRU | ~0ms (RAM) | Config values, rate limit counters, hot data |
| Distributed | Redis / Memcached cluster | ~0.5–2ms | Session data, user profiles, computed results |
| Database | DB engine's buffer pool | ~1–5ms | Frequently read data pages and index pages |

TIP

When a request arrives, your system should check caches from the top (browser) down to the bottom (database). The earlier in the chain you get a hit, the cheaper and faster the response.
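The top-down lookup described above can be sketched for the server-side tiers as follows. This is a minimal illustration, assuming a hypothetical `sharedCache` object with async `get`/`set` methods and a hypothetical `loadFromDb` function; the local tier has no size bound or expiry here, purely for brevity.

```javascript
// Sketch of a tiered read path: in-process Map, then a shared cache, then the source.
// `sharedCache` and `loadFromDb` are hypothetical stand-ins supplied by the caller.
function createTieredReader(sharedCache, loadFromDb) {
  const local = new Map(); // tier 1: in-process memory (unbounded in this sketch)

  return async function read(key) {
    if (local.has(key)) {
      return { value: local.get(key), servedBy: "local" };
    }

    const shared = await sharedCache.get(key); // tier 2: e.g. a Redis-like store
    if (shared != null) {
      local.set(key, shared); // promote so the next read stops at tier 1
      return { value: shared, servedBy: "shared" };
    }

    const value = await loadFromDb(key); // tier 3: the source of truth
    await sharedCache.set(key, value);
    local.set(key, value);
    return { value, servedBy: "db" };
  };
}
```

Each tier that misses populates itself on the way back, so the next request for the same key stops earlier in the chain.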


Part B — Cache Reading Strategies

How and when does data get read from or populated into the cache?

4.1 Cache-Aside (Lazy Loading)

The most common pattern. The application manages the cache manually. The cache is populated only when a cache miss occurs — "lazily."

✅ Pros:

  • Cache only contains data that was actually requested — no wasted memory.
  • Cache failures don't break the app (it falls back to DB).
  • Works great for read-heavy workloads.

❌ Cons:

  • First request for any key is always slow (cache miss penalty).
  • Stale data if the DB is updated without invalidating the cache key.

💻 JS Example: Cache-Aside Pattern

javascript
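const redis = require("redis");
const db = require("./db");

const cache = redis.createClient({ url: "redis://localhost:6379" });
// node-redis v4+ needs an explicit connect() before commands are sent
cache.connect().catch(console.error);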

async function getUserById(userId) {
  const cacheKey = `user:${userId}`;

  // ── Step 1: Check the cache first ──
  const cached = await cache.get(cacheKey);
  if (cached) {
    console.log("⚡ Cache HIT — returned in <1ms");
    return JSON.parse(cached);
  }

  // ── Step 2: Cache miss — go to the database ──
  console.log("🐢 Cache MISS — querying database...");
  const user = await db.query("SELECT * FROM users WHERE id = $1", [userId]);
  if (!user.rows[0]) return null;

  // ── Step 3: Populate cache for next time (TTL = 1 hour) ──
  await cache.setEx(cacheKey, 3600, JSON.stringify(user.rows[0]));

  return user.rows[0];
}

// ── On write: invalidate the cache key ──
async function updateUser(userId, updates) {
  await db.query("UPDATE users SET name = $1 WHERE id = $2", [
    updates.name,
    userId,
  ]);
  // Delete stale cache entry so the next read fetches fresh data
  await cache.del(`user:${userId}`);
}

4.2 Read-Through Cache

The cache itself is responsible for fetching from the database on a miss. The app only ever talks to the cache — the cache acts as a transparent proxy.

✅ Pros: Simpler application code, because the app doesn't need to know about the DB.

❌ Cons: Cache misses are still slow; initial warm-up requires real traffic.

NOTE

Cache-Aside vs Read-Through: In Cache-Aside, the app holds all the logic. In Read-Through, the cache library handles the miss and the DB call. Libraries like node-cache-manager support read-through out of the box.
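A minimal in-memory sketch of the read-through idea follows. The `ReadThroughCache` class and its `loader` callback are hypothetical names for illustration; a real deployment would add TTLs and a size bound.

```javascript
// Read-through sketch: callers only talk to the cache, and the cache
// owns the logic for loading missing keys.
class ReadThroughCache {
  constructor(loader) {
    this.loader = loader; // async (key) => value, supplied once at construction
    this.store = new Map();
  }

  async get(key) {
    if (this.store.has(key)) return this.store.get(key);

    // Miss: the cache itself calls the loader, not the application code
    const value = await this.loader(key);
    this.store.set(key, value);
    return value;
  }
}
```

The application never references the database directly; swapping the data source means changing only the loader passed to the constructor.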


Part C — Cache Writing Strategies

How does data get written to both the cache and the database?

4.3 Write-Through Cache

Every write goes to both the cache and the database as part of the same operation, and the operation is considered complete only after both writes succeed.

✅ Pros: Cache is always consistent with the DB. No stale reads after writes.

❌ Cons: Write latency increases (must wait for both). The cache also stores data that may never be read, which wastes memory.

Best for: Systems where stale reads are unacceptable (e.g., financial balances, inventory counts).

💻 JS Example: Write-Through Pattern

javascript
async function writeThrough(key, value, ttlSeconds) {
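  // Write the database (source of truth) first, then the cache.
  // The two writes are not atomic: if the cache write fails, delete or
  // retry the key so that it cannot stay stale.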
  await db.query(
    "INSERT INTO kv_store (k, v) VALUES ($1, $2) ON CONFLICT (k) DO UPDATE SET v = $2",
    [key, value]
  );
  await cache.setEx(key, ttlSeconds, JSON.stringify(value));
  console.log(`✅ Write-through complete for key: ${key}`);
}

4.4 Write-Behind (Write-Back) Cache

Writes go to the cache immediately and the database is updated asynchronously later (in batches).

✅ Pros: Extremely fast writes (just a cache set). DB writes can be batched for efficiency.

❌ Cons: Risk of data loss if the cache crashes before flushing. Complexity of the async queue.

Best for: High write-throughput, loss-tolerant data — counters, analytics events, view counts.
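The write-behind flow can be sketched in memory as below. `WriteBehindCache` and its `flushBatch` callback are hypothetical; in practice the flush would typically run on a timer and write to the database in one batch.

```javascript
// Write-behind sketch: writes land in memory immediately, and dirty keys
// are flushed to the backing store later in a single batch.
class WriteBehindCache {
  constructor(flushBatch) {
    this.flushBatch = flushBatch; // async (entries) => void, entries are [key, value] pairs
    this.store = new Map();
    this.dirty = new Map(); // keys changed since the last flush
  }

  set(key, value) {
    this.store.set(key, value);
    this.dirty.set(key, value); // repeated writes to one key coalesce
  }

  get(key) {
    return this.store.get(key);
  }

  async flush() {
    if (this.dirty.size === 0) return 0;
    const batch = [...this.dirty.entries()];
    this.dirty = new Map();
    try {
      await this.flushBatch(batch);
    } catch (err) {
      // Re-queue entries that were not overwritten while the flush was running
      for (const [k, v] of batch) {
        if (!this.dirty.has(k)) this.dirty.set(k, v);
      }
      throw err;
    }
    return batch.length;
  }
}
```

Anything still in `dirty` when the process dies is lost, which is the data-loss risk noted above.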


4.5 Write-Around Cache

Writes go directly to the database, bypassing the cache entirely. The cache is only populated on a future read miss.

✅ Pros: Prevents the cache from being flooded with write-once data that won't be re-read.

❌ Cons: First read after write is always a cache miss.

Best for: Write-heavy data that's rarely re-read (e.g., logs, batch import jobs).


Strategy Comparison

| Strategy | Write Latency | Read after Write | Data Loss Risk | Best For |
| --- | --- | --- | --- | --- |
| Cache-Aside | Fast (DB only) | Stale until TTL | None | Read-heavy; general purpose |
| Read-Through | Fast (DB only) | Stale until TTL | None | Simplifying app code |
| Write-Through | Slow (cache + DB) | Always fresh | None | Consistency critical |
| Write-Behind | Very fast (cache) | Fresh | ⚠️ If cache crashes | High-throughput writes |
| Write-Around | Fast (DB only) | Miss on first read | None | Write-once, rarely-read data |

Part D — Cache Eviction Policies

When a cache is full, which key gets removed to make room for the new one?

Eviction Policy Breakdown

| Policy | Full Name | How It Decides | Real-World Analogy |
| --- | --- | --- | --- |
| LRU | Least Recently Used | Evict the key not accessed for the longest time | Throw out the book you haven't opened in the longest time |
| LFU | Least Frequently Used | Evict the key accessed the fewest total times | Remove the least-watched movie |
| FIFO | First In, First Out | Evict the oldest inserted key | Queue at a store: first in, first served (out) |
| Random | Random | Evict a random key | Blindly pull a card from a deck |
| TTL | Time-To-Live | Evict keys whose expiry time has passed | Food with an expiration date |
| MRU | Most Recently Used | Evict the most recently accessed key | Niche: useful for "once accessed, rarely again" patterns |

TIP

Redis eviction policies are configured via maxmemory-policy. Common choices:

  • allkeys-lru — Evict any LRU key (recommended for general caching)
  • volatile-lru — Only evict keys with a TTL set (protects permanent keys)
  • allkeys-lfu — LFU across all keys (better for very skewed access patterns)
  • noeviction — Return errors when full (use when data loss is unacceptable)

💻 JS Example: LRU Cache Implementation

javascript
class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.cache = new Map(); // Map preserves insertion order in JS
  }

  get(key) {
    if (!this.cache.has(key)) return -1;

    // Refresh to "most recently used": delete and re-insert at tail
    const value = this.cache.get(key);
    this.cache.delete(key);
    this.cache.set(key, value);
    return value;
  }

  put(key, value) {
    if (this.cache.has(key)) {
      this.cache.delete(key); // Remove old position
    } else if (this.cache.size >= this.capacity) {
      // Evict the LRU key — the first entry in Map (oldest)
      const lruKey = this.cache.keys().next().value;
      this.cache.delete(lruKey);
      console.log(`🗑️ Evicted LRU key: ${lruKey}`);
    }
    this.cache.set(key, value);
  }
}

// ── Usage ──
const cache = new LRUCache(3);
cache.put("a", 1); // Cache: a
cache.put("b", 2); // Cache: a, b
cache.put("c", 3); // Cache: a, b, c
cache.get("a"); // Access "a" → moves to end: b, c, a
cache.put("d", 4); // "b" evicted (LRU). Cache: c, a, d
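For comparison with LRU, TTL-based expiry can be sketched in a few lines. The `TTLCache` class below is a hypothetical illustration that expires entries lazily on read; the injectable `now` function is an assumption added to make the behaviour easy to test.

```javascript
// TTL sketch: each entry stores its own expiry time and is dropped
// lazily the first time it is read after expiring.
class TTLCache {
  constructor(now = () => Date.now()) {
    this.now = now; // clock in milliseconds, injectable for testing
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: remove and report a miss
      return undefined;
    }
    return entry.value;
  }
}
```

Lazy expiry alone leaves never-read keys in memory, so a periodic sweep or a size bound is usually combined with it.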

Part E — Cache Pitfalls & How to Solve Them

4.6 Cache Stampede (Thundering Herd)

A hot cache key expires. Simultaneously, thousands of requests pour in — all miss the cache and all query the database at once, potentially crashing it.

Solutions:

Solution 1: Mutex Lock (Prevents Duplicate DB Calls)

javascript
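// In-process lock table: it only coordinates requests inside this Node.js process.
// Across several app servers a shared lock would be needed (for example Redis SET with NX and an expiry).
const locks = new Map();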

async function getWithMutex(key, fetchFn, ttl) {
  // Check cache first
  const cached = await cache.get(key);
  if (cached) return JSON.parse(cached);

  // If another request is already fetching, wait and retry
  if (locks.has(key)) {
    await new Promise((resolve) => setTimeout(resolve, 50));
    return getWithMutex(key, fetchFn, ttl); // Retry
  }

  // We are the first: acquire the lock
  locks.set(key, true);
  try {
    const data = await fetchFn(); // One DB call per process for this key
    await cache.setEx(key, ttl, JSON.stringify(data));
    return data;
  } finally {
    locks.delete(key); // Always release the lock
  }
}
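An alternative to polling is to share the in-flight promise, so that concurrent callers wait on the same load instead of retrying every 50 ms. The sketch below is a minimal illustration; `singleFlight` and `inFlight` are hypothetical names, and like the mutex above it only deduplicates within one process.

```javascript
// Single-flight sketch: concurrent callers for the same key share one promise.
const inFlight = new Map(); // key -> promise for the value being loaded

function singleFlight(key, fetchFn) {
  if (inFlight.has(key)) return inFlight.get(key);

  const promise = Promise.resolve()
    .then(() => fetchFn())
    .finally(() => inFlight.delete(key)); // clear on success or failure

  inFlight.set(key, promise);
  return promise;
}
```

A caller would check the cache first and wrap only the miss path, for example `singleFlight(key, loadAndCache)`, where `loadAndCache` is whatever function queries the database and writes the result back.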

Solution 2: Probabilistic Early Expiration (XFetch)

Recompute the cache value slightly before it actually expires, at a randomized moment per request, so that expiry rarely triggers a stampede:

javascript
async function getWithEarlyExpiry(key, fetchFn, ttl, beta = 1) {
  const stored = await cache.get(key);
  if (stored) {
    const { value, expiry, computeTime } = JSON.parse(stored);
    const now = Date.now() / 1000;

    // XFetch formula: recompute early if randomized condition is met
    const shouldRecompute =
      now - beta * computeTime * Math.log(Math.random()) >= expiry;
    if (!shouldRecompute) return value;
  }

  const start = Date.now();
  const freshData = await fetchFn();
  const computeTime = (Date.now() - start) / 1000;

  await cache.setEx(
    key,
    ttl,
    JSON.stringify({
      value: freshData,
      expiry: Date.now() / 1000 + ttl,
      computeTime,
    })
  );

  return freshData;
}

4.7 Cache Penetration

Requests for keys that never exist in the database (e.g., user:9999999) will always miss the cache and hammer the DB.

Solution: Cache Null Results + Bloom Filter

javascript
// ── Solution 1: Cache the null result with a short TTL ──
async function getUserSafe(userId) {
  const cacheKey = `user:${userId}`;
  const cached = await cache.get(cacheKey);

  if (cached !== null) {
    // The sentinel string "NULL" means "DB confirmed this doesn't exist"
    return cached === "NULL" ? null : JSON.parse(cached);
  }

  const user = await db.query("SELECT * FROM users WHERE id = $1", [userId]);

  if (!user.rows[0]) {
    // Cache the null — so subsequent requests for id=9999999 don't hit DB
    await cache.setEx(cacheKey, 60, "NULL"); // Short 60s TTL
    return null;
  }

  await cache.setEx(cacheKey, 3600, JSON.stringify(user.rows[0]));
  return user.rows[0];
}

NOTE

A Bloom Filter is a probabilistic data structure that can answer "does this key definitely NOT exist?" in O(1) with zero false negatives. It can return false positives, so a "possibly exists" answer still needs the normal cache or DB lookup. Services like Redis (via RedisBloom) can check this before any cache or DB lookup, rejecting invalid IDs at the edge.
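A toy in-memory Bloom filter shows the mechanics. This is a sketch only, not the RedisBloom API: the `BloomFilter` class, its default sizes, and the seeded FNV-1a hash are illustrative assumptions.

```javascript
// Bloom filter sketch: k hash functions each set one bit per added key.
// A lookup that finds any of those bits unset proves the key was never added.
class BloomFilter {
  constructor(bits = 8192, hashes = 3) {
    this.bits = bits;
    this.hashes = hashes;
    this.bitset = new Uint8Array(Math.ceil(bits / 8));
  }

  // Seeded FNV-1a; good enough for a demo, not for production use
  _hash(str, seed) {
    let h = (2166136261 ^ seed) >>> 0;
    for (let i = 0; i < str.length; i++) {
      h ^= str.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h % this.bits;
  }

  add(key) {
    for (let seed = 0; seed < this.hashes; seed++) {
      const bit = this._hash(String(key), seed);
      this.bitset[bit >> 3] |= 1 << (bit & 7);
    }
  }

  // false => definitely absent; true => possibly present (may be a false positive)
  mightContain(key) {
    for (let seed = 0; seed < this.hashes; seed++) {
      const bit = this._hash(String(key), seed);
      if ((this.bitset[bit >> 3] & (1 << (bit & 7))) === 0) return false;
    }
    return true;
  }
}
```

In a penetration guard, every valid ID is added when the row is created, and a request whose ID gets `false` from `mightContain` is rejected before touching the cache or the database.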


4.8 Cache Avalanche

Many cache keys expire at the same time (e.g., after a cache restart), causing a mass wave of DB hits simultaneously.

Solution: Jittered TTLs

Instead of all keys expiring at TTL = 3600, add random jitter so expirations are spread out:

javascript
function getJitteredTTL(baseTTL, jitterPercent = 0.2) {
  // e.g., base = 3600s, jitter ±20% → TTL is between 2880s and 4320s
  const jitter = baseTTL * jitterPercent;
  return Math.floor(baseTTL + (Math.random() * 2 - 1) * jitter);
}

async function setWithJitter(key, value, baseTTL) {
  const ttl = getJitteredTTL(baseTTL);
  await cache.setEx(key, ttl, JSON.stringify(value));
  console.log(`Set ${key} with TTL: ${ttl}s`);
}

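// Example: 1000 keys set with a base TTL of 1 hour
// → Will expire spread across 48 min to 72 min, not all at 60 min
// Assumes `productData` is an array of 1000 products and that this runs
// inside an async function or an ES module (for top-level await).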
for (let i = 0; i < 1000; i++) {
  await setWithJitter(`product:${i}`, productData[i], 3600);
}

4.9 Cache Consistency — The Invalidation Problem

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

When the source of truth (the database) changes, the cache must be updated or invalidated, or clients get stale data.

Invalidation Strategies:

| Strategy | How | Pros | Cons |
| --- | --- | --- | --- |
| Delete on write | DEL key after every DB write | Always fresh on next read | One extra miss per write |
| Update on write | SET key newValue after DB write | No miss penalty | Cache & DB write must be atomic |
| TTL-based | Let keys expire naturally | Simple | Stale for up to TTL duration |
| Event-driven | DB publishes change events, app listens and invalidates | Decoupled, scalable | Complex infrastructure (CDC, Kafka) |

💻 JS Example: Event-Driven Invalidation with Redis Pub/Sub

javascript
// ── Publisher (inside your write API) ──
async function updateUserAndPublish(userId, updates) {
  await db.query("UPDATE users SET name = $1 WHERE id = $2", [
    updates.name,
    userId,
  ]);

  // Notify all app servers to invalidate their caches
  await redisPublisher.publish(
    "cache:invalidate",
    JSON.stringify({ key: `user:${userId}` })
  );
}

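// ── Subscriber (runs on every app server) ──
// In node-redis v4+, a subscribing connection cannot issue other commands,
// so `redisSubscriber` must be a separate client from `cache`.
redisSubscriber.subscribe("cache:invalidate", async (message) => {
  try {
    const { key } = JSON.parse(message);
    await cache.del(key);
    console.log(`🗑️ Cache invalidated for key: ${key}`);
  } catch (err) {
    console.error("Cache invalidation failed:", err);
  }
});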

Part F — Distributed Caching

When a single cache server can't hold all data or handle all traffic, you need distributed caching — spreading keys across multiple cache nodes.

4.10 Consistent Hashing

The challenge: if you have 3 cache nodes and add a 4th, a naive hash(key) % N approach remaps most keys (about 75% when going from 3 to 4 nodes), which behaves like a mass cache invalidation. Consistent hashing minimizes remapping.

The key insight: when you add or remove a node, only the keys in the ring segments adjacent to that node need to be remapped, not all keys. Going from 3 to 4 nodes moves only about 25% of keys, compared with about 75% under modulo hashing.
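The cost of modulo hashing is easy to measure. The sketch below counts how many keys change nodes when N goes from 3 to 4 under `hash % N`; the FNV-1a hash and the `user:<n>` key names are illustrative assumptions.

```javascript
// 32-bit FNV-1a, used here only as a simple stand-in hash function
function fnv1a(str) {
  let h = 2166136261;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 16777619) >>> 0;
  }
  return h >>> 0;
}

// Fraction of keys whose node changes when the node count changes
function remappedFraction(keys, fromNodes, toNodes) {
  let moved = 0;
  for (const key of keys) {
    const h = fnv1a(key);
    if (h % fromNodes !== h % toNodes) moved++;
  }
  return moved / keys.length;
}

const sampleKeys = Array.from({ length: 10000 }, (_, i) => `user:${i}`);
console.log(
  `Modulo hashing 3 -> 4 nodes moved ${(remappedFraction(sampleKeys, 3, 4) * 100).toFixed(1)}% of keys`
);
```

A key keeps its node only when `h % 3` equals `h % 4`, which holds for 3 of every 12 consecutive hash values, so a well-mixed hash should report a figure close to 75%.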

💻 JS Example: Consistent Hashing (Simplified)

javascript
const crypto = require("crypto");

class ConsistentHashRing {
  constructor(nodes = [], replicas = 150) {
    this.replicas = replicas; // Virtual nodes per real node
    this.ring = new Map();
    this.sortedKeys = [];

    nodes.forEach((node) => this.addNode(node));
  }

  hash(key) {
    return parseInt(
      crypto.createHash("md5").update(key).digest("hex").slice(0, 8),
      16
    );
  }

  addNode(node) {
    for (let i = 0; i < this.replicas; i++) {
      const virtualKey = this.hash(`${node}:${i}`);
      this.ring.set(virtualKey, node);
      this.sortedKeys.push(virtualKey);
    }
    this.sortedKeys.sort((a, b) => a - b);
  }

  getNode(key) {
    const hash = this.hash(key);
    // Walk the ring clockwise to find the first node >= hash
    for (const ringKey of this.sortedKeys) {
      if (hash <= ringKey) return this.ring.get(ringKey);
    }
    // Wrap around to the first node
    return this.ring.get(this.sortedKeys[0]);
  }
}

const ring = new ConsistentHashRing([
  "redis-node-1",
  "redis-node-2",
  "redis-node-3",
]);

console.log(ring.getNode("user:42")); // → one of the three node names
console.log(ring.getNode("product:99")); // → the same key always maps to the same node
console.log(ring.getNode("session:abc"));

// Add a 4th node: only about 25% of keys are remapped
ring.addNode("redis-node-4");
console.log(ring.getNode("user:42")); // → unchanged, or now redis-node-4

Part G — Redis Deep Dive

Redis is the industry-standard distributed cache. It is far more than a simple key-value store.

4.11 Redis Data Structures

| Data Structure | Redis Command | Use Case |
| --- | --- | --- |
| String | SET / GET | Session tokens, simple counters, serialized JSON |
| Hash | HSET / HGET | User objects with many fields (avoid JSON re-serialization) |
| List | LPUSH / RPOP | Activity feeds, message queues, recent history |
| Set | SADD / SMEMBERS | Unique visitor tracking, tag systems |
| Sorted Set | ZADD / ZRANGE | Leaderboards, priority queues, rate limiting |
| HyperLogLog | PFADD / PFCOUNT | Approximate unique count (e.g., daily active users) |
| Pub/Sub | PUBLISH / SUBSCRIBE | Real-time notifications, cache invalidation broadcast |
| Stream | XADD / XREAD | Append-only log, event sourcing |

💻 JS Example: Redis Sorted Set for a Leaderboard

javascript
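const redis = require("redis");
const client = redis.createClient();
// node-redis v4+ needs an explicit connect() before commands are sent
client.connect().catch(console.error);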

async function updateScore(userId, score) {
  // ZADD: Add userId with their score to the "game:leaderboard" sorted set
  await client.zAdd("game:leaderboard", { score, value: `user:${userId}` });
}

async function getTopPlayers(count = 10) {
  // ZRANGE with the REV option: top N players, highest score first, with their scores
  const leaders = await client.zRangeWithScores(
    "game:leaderboard",
    0,
    count - 1,
    { REV: true }
  );
  return leaders.map((entry, i) => ({
    rank: i + 1,
    userId: entry.value.replace("user:", ""),
    score: entry.score,
  }));
}

async function getUserRank(userId) {
  // ZREVRANK: Get the 0-based rank of a user (reversed = highest first)
  const rank = await client.zRevRank("game:leaderboard", `user:${userId}`);
  return rank !== null ? rank + 1 : null; // 1-indexed
}

// ── Demo (run inside an async function or an ES module for top-level await) ──
await updateScore("alice", 9500);
await updateScore("bob", 8200);
await updateScore("carol", 9800);

console.log(await getTopPlayers(3));
// → [ { rank: 1, userId: "carol", score: 9800 },
//     { rank: 2, userId: "alice", score: 9500 },
//     { rank: 3, userId: "bob", score: 8200 } ]

console.log(await getUserRank("alice")); // → 2

4.12 Redis vs Memcached

| Feature | Redis | Memcached |
| --- | --- | --- |
| Data Types | Rich: Strings, Hashes, Lists, Sets, Sorted Sets, HyperLogLog, Streams | Strings only |
| Persistence | Optional (RDB snapshots, AOF log) | ❌ In-memory only |
| Replication | ✅ Primary-Replica | ❌ No |
| Clustering | ✅ Redis Cluster (16,384 hash slots) | Limited (client-side sharding) |
| Pub/Sub | ✅ Built-in | ❌ No |
| Lua Scripting | ✅ Atomic server-side scripts | ❌ No |
| Memory Efficiency | Good | Slightly better for pure string caching |
| Best For | Leaderboards, sessions, queues, pub/sub, rate limiting | Simple, high-throughput string/object caching |

NOTE

In practice, Redis has won. Memcached is simpler and marginally more memory-efficient for pure string caching, but Redis's richer feature set means almost all new systems choose Redis. AWS ElastiCache and Google Cloud Memorystore both support Redis.


Part H — CDN as a Cache

A Content Delivery Network is a globally distributed cache layer for static and semi-static content. See CDN Optimization for the full deep dive.

How CDN caching works:

  1. First user in Tokyo requests logo.png → CDN misses → fetches from Virginia origin → caches at Tokyo edge.
  2. Every subsequent Tokyo user gets logo.png from the Tokyo edge in ~5ms instead of ~200ms.
  3. Cache is controlled by Cache-Control headers from your origin server.

HTTP Cache-Control Headers

javascript
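// Express.js: Set proper cache headers on your responses
const express = require("express");
const app = express();
// `products`, `dataVersion`, and `currentUser` are assumed to be defined elsewhere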

// ── Static assets (never change — fingerprinted filenames) ──
app.use(
  "/static",
  express.static("public", {
    maxAge: "1y", // Cache for 1 year
    immutable: true, // Tell browser: "This will never change"
    // Cache-Control: public, max-age=31536000, immutable
  })
);

// ── API response that can be cached for 5 minutes ──
app.get("/api/products", (req, res) => {
  res.set({
    "Cache-Control": "public, max-age=300, stale-while-revalidate=60",
    //                ^^^^^^  ^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^^
    // Anyone can cache | 5 min fresh  | Serve stale for 60s while refreshing
    ETag: `"products-v${dataVersion}"`, // For conditional requests
  });
  res.json(products);
});

// ── Private user data — never CDN cache ──
app.get("/api/user/me", (req, res) => {
  res.set("Cache-Control", "private, no-store");
  res.json(currentUser);
});
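To see how a cache might read these headers, the sketch below parses a Cache-Control value and decides whether a shared cache such as a CDN may store the response. It is deliberately simplified and hypothetical: it handles only bare and numeric directives, and the real rules in the HTTP caching specification cover more cases (for example `s-maxage` and authenticated requests).

```javascript
// Parse "public, max-age=300" into { public: true, "max-age": 300 }
function parseCacheControl(header) {
  const directives = {};
  for (const part of header.split(",")) {
    const [rawName, rawValue] = part.trim().split("=");
    if (!rawName) continue; // skip empty segments
    const name = rawName.toLowerCase();
    directives[name] = rawValue === undefined ? true : Number(rawValue);
  }
  return directives;
}

// Simplified check: shared caches must not store private or no-store responses
function sharedCacheMayStore(header) {
  const d = parseCacheControl(header);
  return !d["private"] && !d["no-store"];
}

console.log(sharedCacheMayStore("public, max-age=300, stale-while-revalidate=60"));
console.log(sharedCacheMayStore("private, no-store"));
```

The first call reports that the product list may be stored at the edge; the second reports that the per-user response may not.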

Part I — Real-World Caching Architecture

4.13 Full-Stack Caching: E-commerce Example

Here is how a large e-commerce system might apply caching at every tier for a product page:

Illustrative result for a product page that gets 1 million views/day:

  • ~95% served from CDN (zero app server cost)
  • ~4% served from Redis (sub-millisecond)
  • ~1% hits the database (after initial warm-up)
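The split above translates into absolute numbers as follows. The `requestsPerTier` helper is a hypothetical illustration, and the shares are the illustrative percentages from the list, not measurements.

```javascript
// Convert per-tier traffic shares into request counts
function requestsPerTier(totalRequests, shares) {
  return Object.fromEntries(
    Object.entries(shares).map(([tier, share]) => [
      tier,
      Math.round(totalRequests * share),
    ])
  );
}

const perDay = requestsPerTier(1000000, { cdn: 0.95, redis: 0.04, database: 0.01 });
console.log(perDay); // { cdn: 950000, redis: 40000, database: 10000 }

// Average database queries per second over a day (86,400 seconds)
console.log((perDay.database / 86400).toFixed(2)); // 0.12
```

Under these assumptions the database sees about 10,000 queries per day, well under one per second on average, for a page viewed a million times.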

Part J — Cache Sizing & Monitoring

4.14 How to Size Your Cache

Use the 80/20 working set rule:

Working Set Size = Total Data Size × 0.20

Example: If your users table is 50GB total, the "hot" 20% that gets queried most is ~10GB. Size your Redis instance to hold at least that.

javascript
// Monitor cache efficiency with hit ratio
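async function getCacheStats() {
  // `cache` is the connected Redis client from the Cache-Aside example
  const info = await cache.info("stats");
  const lines = info.split("\r\n");

  const hits = parseInt(
    lines.find((l) => l.startsWith("keyspace_hits:"))?.split(":")[1] || "0",
    10
  );
  const misses = parseInt(
    lines.find((l) => l.startsWith("keyspace_misses:"))?.split(":")[1] || "0",
    10
  );

  // No lookups yet: avoid dividing by zero and skip the alert
  if (hits + misses === 0) return { hits, misses, hitRatio: null };

  const hitRatio = hits / (hits + misses);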
  console.log(`Cache Hit Ratio: ${(hitRatio * 100).toFixed(2)}%`);

  // Alert if hit ratio drops below 90% — cache may be too small or TTLs too short
  if (hitRatio < 0.9) {
    console.warn(
      "⚠️ Low cache hit ratio — consider increasing cache size or TTL"
    );
  }

  return { hits, misses, hitRatio };
}

4.15 Key Metrics to Monitor

| Metric | What It Means | Healthy Target |
| --- | --- | --- |
| Hit Rate | hits / (hits + misses) | > 90% |
| Memory Usage | used_memory / maxmemory | < 80% (leave headroom) |
| Eviction Rate | Keys evicted per second | Should be near 0 |
| Command Latency | p99 latency for GET/SET | < 1ms |
| Connected Clients | Number of open connections | Within configured limit |
| Replication Lag | Replica behind primary | < 10ms |
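Several of these metrics can be derived from the text returned by the Redis INFO command. The sketch below parses that text into fields and computes three of the table's metrics; `parseRedisInfo` and `summarizeCacheHealth` are hypothetical helper names, and the sample input is made up for illustration.

```javascript
// Parse INFO output ("name:value" lines, "# Section" headers) into an object
function parseRedisInfo(infoText) {
  const fields = {};
  for (const line of infoText.split(/\r?\n/)) {
    if (!line || line.startsWith("#")) continue;
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    fields[line.slice(0, idx)] = line.slice(idx + 1);
  }
  return fields;
}

// Derive hit rate, memory usage, and total evictions from parsed fields
function summarizeCacheHealth(fields) {
  const hits = Number(fields.keyspace_hits || 0);
  const misses = Number(fields.keyspace_misses || 0);
  const used = Number(fields.used_memory || 0);
  const max = Number(fields.maxmemory || 0);
  return {
    hitRate: hits + misses > 0 ? hits / (hits + misses) : null,
    memoryUsage: max > 0 ? used / max : null, // maxmemory 0 means no limit is set
    evictedKeys: Number(fields.evicted_keys || 0),
  };
}

const sampleInfo =
  "# Stats\r\nkeyspace_hits:900\r\nkeyspace_misses:100\r\nevicted_keys:5\r\n" +
  "# Memory\r\nused_memory:800\r\nmaxmemory:1000\r\n";
console.log(summarizeCacheHealth(parseRedisInfo(sampleInfo)));
```

Note that `evicted_keys` is a running total since the server started, so an eviction rate requires sampling it twice and dividing the difference by the interval.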

The Golden Rules of Caching

| # | Principle | Why |
| --- | --- | --- |
| 1 | Cache reads, not writes | In most applications reads far outnumber writes (often 10–100x); caching writes adds complexity with little gain |
| 2 | Always set a TTL | Without expiry, stale data lives forever and you eventually run out of memory |
| 3 | Jitter your TTLs | Prevents the Cache Avalanche when many keys were set at the same time |
| 4 | Invalidate on write | Delete the cache key immediately after a DB update; don't wait for TTL |
| 5 | Cache at the right layer | Use the CDN for public data, Redis for private data, browser cache for assets |
| 6 | Monitor your hit rate | A hit rate < 90% suggests your cache is not doing its job; investigate |
| 7 | Don't cache everything | Write-once, never-read data wastes memory and can push hot keys out through eviction |

✅ Checklist Before Moving On

  • [ ] I can explain Cache-Aside, Write-Through, and Write-Behind with trade-offs
  • [ ] I know what LRU, LFU, and TTL eviction policies do and when to use each
  • [ ] I can describe Cache Stampede, Cache Penetration, and Cache Avalanche — and their fixes
  • [ ] I understand how Consistent Hashing distributes keys across cache nodes
  • [ ] I know the difference between Redis and Memcached
  • [ ] I can set correct Cache-Control headers for static assets vs private API data
  • [ ] I know how to monitor a cache (hit rate, memory, eviction rate)

NOTE

Next in the series: Level 5 — Messaging & Queues covers Kafka, RabbitMQ, message patterns, and how async queues decouple services.

Released under the ISC License.