🎯 System Design Interview — Master Template
Use this framework for every system design interview. Interviewers expect structure.
TIP
New to Case Studies? Read our Guidelines on How to Manage & Study Case Studies first to maximize your learning.
⏱️ 45-Minute Interview Timeline
📋 Step-by-Step Framework
Step 1: Clarify Requirements (5 min)
ALWAYS ask these before drawing anything:
Functional:
"What are the core features? Can I focus on X, Y, Z?"
"Do we need auth? Analytics? Admin panel?"
Scale:
"How many users? Daily active users?"
"Read-heavy or write-heavy?"
"Is this global or a single region?"
Constraints:
"Any tech constraints? Existing infrastructure?"
"What latency targets matter?"
"How much data retention? Any compliance requirements?"
Step 2: Back-of-Envelope Estimates (5 min)
Template:
DAU (Daily Active Users): ___
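Requests/sec (average): DAU × actions_per_user_per_day / 86,400
Requests/sec (peak): average × peak_factor (assume roughly 2x to 3x unless the interviewer gives a figure)
Storage/day: requests/day × data_per_request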
Storage total: storage/day × retention_days
Bandwidth: requests/sec × data_per_request
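The template can also be written as a small function, which makes it easy to re-run the numbers when the interviewer changes an assumption. This is a minimal sketch: the function name, the field names, and the default `peakFactor` of 2 are illustrative assumptions, and the inputs are the Twitter-scale figures used in this section (200M DAU at 2.5 tweets per user per day gives 500M tweets/day).

```javascript
// Back-of-envelope estimator following the template's formulas.
// peakFactor is an assumed multiplier for traffic spikes; adjust it per system.
function estimateCapacity({ dau, actionsPerUserPerDay, bytesPerAction, retentionDays, peakFactor = 2 }) {
  const SECONDS_PER_DAY = 86400;
  const requestsPerDay = dau * actionsPerUserPerDay;
  const avgRequestsPerSec = requestsPerDay / SECONDS_PER_DAY;
  const storagePerDayBytes = requestsPerDay * bytesPerAction;
  return {
    requestsPerDay,
    avgRequestsPerSec,
    peakRequestsPerSec: avgRequestsPerSec * peakFactor,
    storagePerDayBytes,
    totalStorageBytes: storagePerDayBytes * retentionDays,
    avgBandwidthBytesPerSec: avgRequestsPerSec * bytesPerAction,
  };
}

// Twitter-scale inputs: 200M DAU, 2.5 tweets/user/day, 300 bytes/tweet, 5 years.
const twitter = estimateCapacity({
  dau: 200e6,
  actionsPerUserPerDay: 2.5,
  bytesPerAction: 300,
  retentionDays: 365 * 5,
});
console.log(Math.round(twitter.avgRequestsPerSec)); // 5787
console.log(twitter.storagePerDayBytes / 1e9);      // 150 (GB/day)
console.log(twitter.totalStorageBytes / 1e12);      // 273.75 (TB over 5 years)
```

In an interview the arithmetic is done by hand with rounded numbers; the point of the sketch is only to show which quantities depend on which inputs.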
Example for Twitter-scale:
DAU: 200M
Tweets/day: 500M → ~5,800/sec
Storage/tweet: 300 bytes
Storage/day: 500M × 300 = 150 GB/day
5-year storage: 150 GB × 365 × 5 = ~274 TB
Step 3: High-Level Design (5 min)
Draw these 6 boxes first, always:
Example: Simple App Server wrapping Cache & DB
```javascript
const express = require("express");
const app = express();
const cache = require("./myCacheLayer"); // hypothetical wrapper, e.g., around a Redis client
const db = require("./myDatabaseLayer"); // hypothetical wrapper, e.g., around a Postgres client

app.get("/api/v1/user/:id", async (req, res) => {
  const userId = req.params.id;
  try {
    // 1. Check the cache first (requests reach this server via the load balancer)
    const cachedUser = await cache.get(`user:${userId}`);
    if (cachedUser) {
      return res.json(JSON.parse(cachedUser));
    }
    // 2. Cache miss: query the database.
    //    $1 is the Postgres placeholder syntax; the wrapper is assumed to
    //    resolve to an array of rows.
    const rows = await db.query("SELECT * FROM users WHERE id = $1", [userId]);
    const user = rows[0];
    if (!user) return res.status(404).send("Not found");
    // 3. Populate the cache for next time (TTL: 3600 seconds)
    await cache.setex(`user:${userId}`, 3600, JSON.stringify(user));
    // 4. Return to the client
    res.json(user);
  } catch (error) {
    console.error(error); // log the cause instead of swallowing it
    res.status(500).send("Server Error");
  }
});
```
Step 4: API Design (5 min)
REST Convention:
- POST /resource → create
- GET /resource/:id → read
- PUT /resource/:id → update (full)
- PATCH /resource/:id → update (partial)
- DELETE /resource/:id → delete
Example: URL Shortener API Routes
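```javascript
const express = require("express");
const urlService = require("./urlService"); // hypothetical service module (shorten, resolve, delete)
const router = express.Router();
router.use(express.json()); // parse JSON bodies so req.body is populated

// Express 4 does not catch rejected promises from async handlers,
// so forward any rejection to the error-handling middleware.
const wrap = (handler) => (req, res, next) => handler(req, res, next).catch(next);

// CREATE: POST /api/v1/urls
router.post("/urls", wrap(async (req, res) => {
  const { longUrl } = req.body;
  if (!longUrl) return res.status(400).json({ error: "longUrl is required" });
  const shortCode = await urlService.shorten(longUrl);
  res.status(201).json({ shortUrl: `http://short.ly/${shortCode}`, longUrl });
}));

// READ: GET /api/v1/urls/:shortCode (responds with a redirect, not a JSON body)
router.get("/urls/:shortCode", wrap(async (req, res) => {
  const longUrl = await urlService.resolve(req.params.shortCode);
  // Assumes resolve() returns a falsy value for an unknown code
  if (!longUrl) return res.status(404).send("Not found");
  res.redirect(302, longUrl);
}));

// DELETE: DELETE /api/v1/urls/:shortCode
router.delete("/urls/:shortCode", wrap(async (req, res) => {
  // Requires authentication/authorization (omitted here)
  await urlService.delete(req.params.shortCode);
  res.status(204).send(); // 204 No Content
}));

module.exports = router;
```
Step 5: Database Design (5 min)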
Questions to answer:
1. What data do I need to store?
2. SQL or NoSQL? Why?
3. What are the access patterns? (query shapes)
4. What indexes do I need?
5. How do I handle relationships?
SQL (PostgreSQL/MySQL):
- User accounts
- Transactional data (orders, payments)
- Complex relational queries
NoSQL:
- Cassandra: write-heavy, time-series (messages, events)
- Redis: cache, sessions, leaderboards, geolocation
- Elasticsearch: full-text search
- MongoDB: flexible schema, document storage
- DynamoDB: serverless, managed, key-value
Step 6: Scale & Reliability (10 min)
Scaling Checklist:
□ Load balancer in front of app servers
□ Multiple availability zones
□ Database replication (read replicas)
□ Caching layer (Redis)
□ CDN for static assets
□ Async processing (message queue)
□ Rate limiting
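The rate limiting item in the checklist above is often implemented as a token bucket: each client gets a bucket that refills at a steady rate and allows short bursts up to its capacity. The following is a minimal in-memory sketch, assuming a single process; the class name and parameters are illustrative, and a real deployment behind a load balancer would typically keep the counters in a shared store such as Redis.

```javascript
// Minimal in-memory token bucket (single process only).
class TokenBucket {
  constructor(capacity, refillPerSec, now = Date.now()) {
    this.capacity = capacity;         // maximum burst size
    this.refillPerSec = refillPerSec; // sustained request rate
    this.tokens = capacity;           // start with a full bucket
    this.lastRefillMs = now;
  }

  // Returns true if the request is allowed, false if it should be
  // rejected (for example with HTTP 429).
  tryConsume(now = Date.now()) {
    const elapsedSec = Math.max(0, now - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Burst of 2, refilling 1 token per second; timestamps are passed
// explicitly so the behavior is deterministic.
const bucket = new TokenBucket(2, 1, 0);
console.log(bucket.tryConsume(0));    // true
console.log(bucket.tryConsume(0));    // true
console.log(bucket.tryConsume(0));    // false (bucket empty)
console.log(bucket.tryConsume(1000)); // true (one token refilled after 1 s)
```

Passing the clock in as a parameter is a design choice for testability; in production code the default `Date.now()` would be used.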
Reliability Checklist:
□ Circuit breakers between services
□ Retry with exponential backoff
□ Health checks + auto-restart
□ Data backups + disaster recovery
□ Graceful degradation
Step 7: Trade-offs (5 min)
Every design choice has trade-offs. Name yours:
"I chose Cassandra for message storage because of its write throughput,
but the trade-off is that complex JOIN queries aren't supported."
"I used eventual consistency in the feed service for performance,
but users might see slightly stale data for a few seconds."
"I'm pre-computing feeds on write (fan-out on write), which uses more storage
but makes reads instant. For celebrity accounts, I switch to fan-out on read."
🚫 Common Mistakes to Avoid
| Mistake | Fix |
|---|---|
| Jumping to solution without clarifying | Always spend 5 min on requirements |
| Not thinking about scale | Do back-of-envelope estimates |
| Single point of failure | Add redundancy everywhere |
| Forgetting the cache | Most read-heavy systems benefit from a cache (e.g., Redis) |
| Ignoring failure modes | "What if the DB goes down?" |
| Overly complex from the start | Start simple, scale incrementally |
| Not discussing trade-offs | Every choice has costs; state them |
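
One concrete answer to "What if the DB goes down?" is the "Retry with exponential backoff" item from the reliability checklist. The sketch below is one possible shape for it, using randomized ("full jitter") delays; the function name, the option names, and the default values are assumptions chosen for illustration, not recommended settings.

```javascript
// Retry an async operation with exponential backoff and full jitter.
async function retryWithBackoff(operation, { maxAttempts = 4, baseDelayMs = 100, maxDelayMs = 2000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === maxAttempts) break;
      // The delay ceiling doubles each attempt (base, 2x, 4x, ...), capped at maxDelayMs.
      const ceiling = Math.min(maxDelayMs, baseDelayMs * 2 ** (attempt - 1));
      // Random jitter keeps many clients from retrying in lockstep.
      const delayMs = Math.random() * ceiling;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError; // all attempts failed
}

// Usage: a hypothetical flaky query that fails twice, then succeeds.
let calls = 0;
const flakyQuery = async () => {
  calls += 1;
  if (calls < 3) throw new Error("DB unavailable");
  return "ok";
};
const resultPromise = retryWithBackoff(flakyQuery, { baseDelayMs: 10 });
resultPromise.then((result) => console.log(result, calls)); // ok 3
```

Retries only help with transient failures and should wrap idempotent operations; for a sustained outage, a circuit breaker or graceful degradation is the better fit.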
💡 Common Patterns to Memorize
🏆 Systems to Master (Priority Order)
Beginner
- URL Shortener ← Start here
- Pastebin
- Rate Limiter
Intermediate
- Twitter/Instagram feed
- WhatsApp/Chat Application
- File storage (Dropbox/Google Drive)
Advanced
- Uber/ride-sharing
- Netflix/Video Streaming
- E-Commerce Platform (Microservices) ← Full system design
- Search autocomplete
- Distributed key-value store (like DynamoDB)
- Web crawler
- Ad click aggregation
📊 Case Study Quick Reference
| Case Study | Difficulty | Category | Key Patterns |
|---|---|---|---|
| URL Shortener | 🟢 Beginner | Read-Heavy | Base62, Redis Cache, ID Generator |
| Chat Application | 🟡 Intermediate | Real-Time | WebSocket, Kafka, Cassandra, Fan-out |
| Video Streaming | 🔴 Advanced | Bandwidth-Intensive | CDN, HLS/DASH, DAG Transcoding |
| E-Commerce Microservices | 🔴 Advanced | Write-Heavy, Event-Driven | Saga, Circuit Breaker, DDD, Kafka |
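
As an illustration of the Base62 pattern listed for the URL Shortener, the sketch below converts a numeric ID (for example, one issued by an ID generator) into a short code and back. The alphabet ordering and the function names are assumptions; any fixed 62-character alphabet works as long as encoding and decoding agree.

```javascript
// Assumed alphabet ordering: digits, then uppercase, then lowercase.
const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Encode a non-negative integer ID as a Base62 string.
function encodeBase62(id) {
  if (!Number.isSafeInteger(id) || id < 0) {
    throw new RangeError("id must be a non-negative safe integer");
  }
  if (id === 0) return ALPHABET[0];
  let encoded = "";
  while (id > 0) {
    encoded = ALPHABET[id % 62] + encoded; // prepend the least significant digit
    id = Math.floor(id / 62);
  }
  return encoded;
}

// Decode a Base62 string back to its integer ID.
function decodeBase62(code) {
  let id = 0;
  for (const ch of code) {
    const digit = ALPHABET.indexOf(ch);
    if (digit === -1) throw new RangeError(`invalid Base62 character: ${ch}`);
    id = id * 62 + digit;
  }
  return id;
}

console.log(encodeBase62(125));  // "21" (2 × 62 + 1)
console.log(decodeBase62("21")); // 125
```

A 7-character code covers 62^7 (about 3.5 trillion) distinct IDs, which is why short codes of that length are a common choice in this case study.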
