🐦 System Design: Twitter / X (Social Media Feed)
This is one of the most common and most complex system design questions.
Step 1: Requirements
Functional
- Post a tweet (text, images, video)
- Follow users
- View home timeline (tweets from people you follow)
- Like, Retweet, Reply
- Search tweets
Non-Functional
- 500M users, 200M daily active users
- 500M tweets posted/day
- Read:Write = 100:1 (reading is far more common)
- Latency: Feed load < 200ms
- High availability > 99.99%
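To make the availability target concrete, the percentage can be converted into a downtime budget. The sketch below is plain arithmetic; the helper name `downtimeBudgetMinutes` and the 365-day year are assumptions made for illustration.

```javascript
// Convert an availability percentage into allowed downtime, in minutes.
// Assumes a 365-day year; the helper name is illustrative only.
function downtimeBudgetMinutes(availabilityPercent, days = 365) {
  const totalMinutes = days * 24 * 60;
  return totalMinutes * (1 - availabilityPercent / 100);
}

const perYear = downtimeBudgetMinutes(99.99);
const perDay = downtimeBudgetMinutes(99.99, 1);
console.log(`99.99% allows about ${perYear.toFixed(1)} minutes of downtime per year`);
console.log(`That is about ${(perDay * 60).toFixed(1)} seconds per day`);
```

A budget this small leaves little room for manual recovery, which is one reason the later steps lean on replicated stores and caches.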
Step 2: Capacity Estimation
```text
Tweets:   500M/day = ~5,800 tweets/sec
Timeline: 28B reads/day = ~325,000 reads/sec
Storage:
  Each tweet: ~300 bytes
  500M/day × 300 bytes = 150 GB/day of text
  Plus media: much more!
Bandwidth:
  325,000 reads/sec × 300 bytes = ~97 MB/s read
```

Step 3: Core Problem - The Fan-Out
When a user with 10M followers posts a tweet, who sees it and when?
USER A posts a tweet
▼
10 MILLION followers need to see this in their timeline
Option 1: Pull Model (Fan-out on read)
Store tweet once
When user opens feed: query all followed users' tweets, merge, sort
✅ Storage efficient
❌ Timeline generation is slow (every feed load must query and merge tweets from every followed account)
Option 2: Push Model (Fan-out on write)
When tweet is posted: write to every follower's timeline cache
✅ Feed is pre-computed, instant read
❌ Expensive for celebrities (10M writes per tweet!)
Twitter's Solution: HYBRID
Regular users: Fan-out on write (fast feed)
Celebrities (>1M followers): Fan-out on read (avoids millions of timeline writes per tweet)
Step 4: High-Level Architecture
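The hybrid rule from Step 3 can be sketched as a routing decision inside a fan-out worker. This is a minimal illustration, not Twitter's actual implementation: `CELEBRITY_THRESHOLD` mirrors the >1M figure above, while `planFanOut`, `mergeTimeline`, and the data shapes are hypothetical.

```javascript
const CELEBRITY_THRESHOLD = 1_000_000; // from the ">1M followers" rule above

// Decide how a new tweet is distributed, given the author's follower count.
function planFanOut(author) {
  if (author.followerCount > CELEBRITY_THRESHOLD) {
    // Fan-out on read: store once, merge into timelines at read time.
    return { strategy: "pull", timelineWrites: 0 };
  }
  // Fan-out on write: one timeline-cache write per follower.
  return { strategy: "push", timelineWrites: author.followerCount };
}

// At read time, merge the precomputed (pushed) entries with recent tweets
// pulled from the celebrities this user follows, newest first.
function mergeTimeline(pushedEntries, celebrityEntries, limit = 20) {
  return [...pushedEntries, ...celebrityEntries]
    .sort((a, b) => b.timestamp - a.timestamp)
    .slice(0, limit);
}

console.log(planFanOut({ followerCount: 350 }));
console.log(planFanOut({ followerCount: 10_000_000 }));
console.log(
  mergeTimeline(
    [{ id: "t1", timestamp: 100 }, { id: "t3", timestamp: 300 }],
    [{ id: "t2", timestamp: 200 }],
  ),
);
```

The read-time merge is the price of the hybrid: each feed load pays for a handful of extra lookups so that a celebrity tweet never triggers millions of cache writes.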
Example: Tweet Service Publishing to Kafka
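```javascript
import { Kafka } from "kafkajs";

const kafka = new Kafka({
  clientId: "tweet-service",
  brokers: ["kafka1:9092", "kafka2:9092"],
});
const producer = kafka.producer();

// Connect lazily, once, and reuse the connection for every request
// instead of reconnecting on each tweet.
let producerReady;
function ensureProducerConnected() {
  producerReady ??= producer.connect();
  return producerReady;
}

// saveToCassandra is abstracted: it persists the tweet and returns its ID.
async function postTweet(userId, content) {
  // 1. Save to Cassandra (abstracted)
  const tweetId = await saveToCassandra(userId, content);

  // 2. Publish an event to Kafka for the fan-out workers
  await ensureProducerConnected();
  await producer.send({
    topic: "new-tweets",
    messages: [
      {
        key: String(userId), // Partition by userId so one user's tweets stay ordered
        value: JSON.stringify({
          tweetId,
          userId,
          content,
          timestamp: Date.now(),
        }),
      },
    ],
  });
  return tweetId;
}
```

Step 5: Tweet Storage Schema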
tweets table (Cassandra, optimized for high write throughput):
tweet_id TIMEUUID (sortable by time)
user_id BIGINT
content TEXT
media_urls LIST<TEXT>
created_at TIMESTAMP
tweet_counters table (hypothetical name; Cassandra requires counter columns to live in a dedicated table, apart from regular columns):
tweet_id TIMEUUID
like_count COUNTER
user_timeline (Redis, precomputed feed per user):
Key: "timeline:{user_id}"
Value: Sorted Set of tweet_ids (scored by timestamp)
Size: Last 800 tweet IDs per user
Step 6: Home Timeline Flow
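The 800-entry cap from the schema above has to be enforced on the write path, when a fan-out worker pushes a tweet into a follower's timeline. A minimal sketch, assuming a client that exposes `zadd` and `zremrangebyrank` with the same argument order as ioredis; `addToTimeline` and `MAX_TIMELINE_SIZE` are hypothetical names, and the client is passed in so the function can be exercised without a live Redis.

```javascript
const MAX_TIMELINE_SIZE = 800; // "Last 800 tweet IDs per user"

// Add one tweet ID to a follower's timeline, then trim the oldest entries.
async function addToTimeline(redis, followerId, tweetId, timestamp) {
  const key = `timeline:${followerId}`;
  // Score by timestamp so the sorted set stays in time order.
  await redis.zadd(key, timestamp, tweetId);
  // Ranks ascend by score, so rank 0 is the oldest entry.
  // Removing ranks 0..-(N+1) keeps only the N highest-scored members.
  await redis.zremrangebyrank(key, 0, -(MAX_TIMELINE_SIZE + 1));
}
```

With ioredis this could be called as `await addToTimeline(redis, followerId, tweetId, Date.now())` for each follower of a regular (non-celebrity) author. Issuing both commands through `multi()` would make the add-and-trim atomic, at the cost of a slightly more involved call.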
Example: Timeline Service Cache Logic
```javascript
import Redis from "ioredis";

const redis = new Redis();

// fetchTweetDetailsFromCassandra, getFollowedUsers and fetchRecentTweets
// are data-access helpers assumed to exist elsewhere.
async function getHomeTimeline(userId) {
  const cacheKey = `timeline:${userId}`;

  // 1. Check Redis for the precomputed feed.
  // ZREVRANGE returns members by descending score, i.e. newest tweet IDs first.
  const tweetIds = await redis.zrevrange(cacheKey, 0, 19);
  if (tweetIds.length > 0) {
    console.log("Cache hit! Fetching details...");
    return fetchTweetDetailsFromCassandra(tweetIds);
  }

  // 2. Cache miss: compute on the fly (fan-out on read).
  console.log("Cache miss! Computing timeline...");
  const followedUsers = await getFollowedUsers(userId);
  // Query followed users in parallel rather than one at a time
  // (a production version would bound the concurrency).
  const perUserTweets = await Promise.all(
    followedUsers.map((followedUser) => fetchRecentTweets(followedUser)),
  );
  const topTweets = perUserTweets
    .flat()
    .sort((a, b) => b.timestamp - a.timestamp) // newest first
    .slice(0, 800);

  // 3. Store back to the Redis cache (skipped when there is nothing to store).
  if (topTweets.length > 0) {
    const pipeline = redis.pipeline();
    for (const tweet of topTweets) {
      pipeline.zadd(cacheKey, tweet.timestamp, tweet.id);
    }
    await pipeline.exec();
  }
  return topTweets.slice(0, 20); // Return first page
}
```

Step 7: Search
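Indexing is only half of search; the read side needs a query. The sketch below builds an Elasticsearch request using the standard `match` query and a `sort` clause over the `content` and `timestamp` fields that the indexer in this step writes. The function name `buildTweetSearchRequest` and the default page size are assumptions.

```javascript
// Build a search request for the "tweets" index.
function buildTweetSearchRequest(text, size = 20) {
  return {
    index: "tweets",
    size,
    query: { match: { content: text } }, // full-text match on tweet content
    sort: [{ timestamp: { order: "desc" } }], // newest first
  };
}

console.log(JSON.stringify(buildTweetSearchRequest("system design"), null, 2));
```

With the v8 JavaScript client, the returned object could be passed to `esClient.search(...)`. Sorting by `timestamp` replaces relevance ranking with recency, which suits a "latest tweets" view; dropping the `sort` clause would rank by relevance instead.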
Example: Elasticsearch Indexer Worker
```javascript
import { Client } from "@elastic/elasticsearch";
import { Kafka } from "kafkajs";

const esClient = new Client({ node: "http://localhost:9200" });
const kafka = new Kafka({
  clientId: "search-indexer",
  brokers: ["localhost:9092"],
});
const consumer = kafka.consumer({ groupId: "search-indexing-group" });

async function runIndexer() {
  await consumer.connect();
  await consumer.subscribe({ topic: "new-tweets", fromBeginning: false });
  await consumer.run({
    eachMessage: async ({ message }) => {
      if (!message.value) return; // Skip empty messages
      const tweet = JSON.parse(message.value.toString());
      // Index into Elasticsearch. Using the tweet ID as the document ID
      // means a redelivered message overwrites instead of duplicating.
      await esClient.index({
        index: "tweets",
        id: String(tweet.tweetId),
        document: {
          userId: tweet.userId,
          content: tweet.content,
          timestamp: tweet.timestamp,
        },
      });
      console.log(`Indexed tweet ${tweet.tweetId} for search`);
    },
  });
}

// Start the worker and exit on startup failure so a supervisor can restart it.
runIndexer().catch((err) => {
  console.error("Search indexer failed:", err);
  process.exit(1);
});
```

Step 8: Notifications
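The consumer in this step reads `targetUserId`, `title`, and `body` from the `user-notifications` topic. A producing service might build that message as follows; `buildLikeNotification` and its parameters are hypothetical, and the returned object follows the `{ topic, messages }` shape that kafkajs `producer.send` accepts.

```javascript
// Build a kafkajs-style record announcing that someone liked a tweet.
function buildLikeNotification(targetUserId, likerName, tweetId) {
  return {
    topic: "user-notifications",
    messages: [
      {
        key: String(targetUserId), // keeps one user's notifications in order
        value: JSON.stringify({
          targetUserId,
          title: "New like",
          body: `${likerName} liked your tweet ${tweetId}`,
        }),
      },
    ],
  };
}

const record = buildLikeNotification(42, "alice", "t-1001");
console.log(record.messages[0].value);
```

Keying by the target user places all of that user's notifications on one partition, so they are consumed in the order they were produced.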
Example: WebSocket Notification Service
```javascript
import WebSocket, { WebSocketServer } from "ws";
import { Kafka } from "kafkajs";

const wss = new WebSocketServer({ port: 8080 });
const activeConnections = new Map(); // userId (string) -> WebSocket

wss.on("connection", (ws, req) => {
  // Assume the request is authenticated; getUserIdFromReq is abstracted.
  const userId = String(getUserIdFromReq(req));
  activeConnections.set(userId, ws);
  ws.on("close", () => {
    // Only remove the entry if a newer connection has not replaced it.
    if (activeConnections.get(userId) === ws) {
      activeConnections.delete(userId);
    }
  });
});

const kafka = new Kafka({ clientId: "notifier", brokers: ["localhost:9092"] });
const consumer = kafka.consumer({ groupId: "notification-group" });

async function startNotificationConsumer() {
  await consumer.connect();
  await consumer.subscribe({
    topic: "user-notifications",
    fromBeginning: false,
  });
  await consumer.run({
    eachMessage: async ({ message }) => {
      if (!message.value) return; // Skip empty messages
      const notification = JSON.parse(message.value.toString());
      const { targetUserId, title, body } = notification;
      // Push an in-app notification only if the user has an open connection;
      // sending on a closing or closed socket would throw.
      const ws = activeConnections.get(String(targetUserId));
      if (ws && ws.readyState === WebSocket.OPEN) {
        ws.send(JSON.stringify({ type: "NOTIFICATION", title, body }));
      }
    },
  });
}

startNotificationConsumer().catch((err) => {
  console.error("Notification consumer failed:", err);
  process.exit(1);
});
```

📊 Summary
| Component | Technology |
|---|---|
| API Gateway | Nginx / Kong |
| Tweet Storage | Cassandra (write-heavy) |
| User Data | MySQL (relational) |
| Feed Cache | Redis Sorted Sets |
| Fan-out | Kafka + Worker Pool |
| Search | Elasticsearch |
| Media | S3 + CDN (CloudFront) |
| Notifications | Kafka + Push services |
Key insight: Twitter's hardest problem is the fan-out. The hybrid model for celebrity accounts is the critical design decision.
