🐦 System Design: Twitter / X (Social Media Feed)

Designing a Twitter-style home feed is one of the most common and most complex system design questions.


Step 1: Requirements

Functional

  • Post a tweet (text, images, video)
  • Follow users
  • View home timeline (tweets from people you follow)
  • Like, Retweet, Reply
  • Search tweets

Non-Functional

  • 500M users, 200M daily active users
  • 500M tweets posted/day
  • Read:Write = 100:1 (reading is far more common)
  • Latency: Feed load < 200ms
  • High availability > 99.99%
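
To make the availability target concrete, the sketch below converts 99.99% into a downtime budget. The function name `allowedDowntimeMinutes` is made up for this illustration; the only input taken from the requirements is the 99.99% figure.

```javascript
// Convert an availability target into the downtime it permits over a period.
function allowedDowntimeMinutes(availability, periodDays) {
  const totalMinutes = periodDays * 24 * 60;
  return (1 - availability) * totalMinutes;
}

const perYear = allowedDowntimeMinutes(0.9999, 365);
const perMonth = allowedDowntimeMinutes(0.9999, 30);

console.log(`Per year:    ${perYear.toFixed(1)} minutes`);
console.log(`Per 30 days: ${perMonth.toFixed(1)} minutes`);
```

At 99.99% the whole system may be unavailable for roughly 52.6 minutes per year, which is why every component on the read path needs redundancy.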

Step 2: Capacity Estimation

text
Tweets:      500M/day = ~5,800 tweets/sec
Timeline:    28B reads/day = ~325,000 reads/sec

Storage:
  Each tweet: ~300 bytes
  500M/day × 300 bytes = 150 GB/day text
  Plus media: much more!

Bandwidth:
  325,000 reads/sec × 300 bytes = ~97 MB/s read
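
The estimates above can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch: the inputs are the figures from this step, the variable names are illustrative, and the bandwidth line assumes (as the estimate above does) that each read returns a single 300-byte tweet.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

// Inputs taken from the estimates above
const tweetsPerDay = 500e6;
const timelineReadsPerDay = 28e9;
const bytesPerTweet = 300;

const writesPerSec = tweetsPerDay / SECONDS_PER_DAY;
const readsPerSec = timelineReadsPerDay / SECONDS_PER_DAY;
const textGBPerDay = (tweetsPerDay * bytesPerTweet) / 1e9;
const readMBPerSec = (readsPerSec * bytesPerTweet) / 1e6;

console.log(`Writes: ~${Math.round(writesPerSec)} tweets/sec`);
console.log(`Reads:  ~${Math.round(readsPerSec)} reads/sec`);
console.log(`Text storage: ${textGBPerDay} GB/day`);
console.log(`Read bandwidth: ~${readMBPerSec.toFixed(1)} MB/s`);
```

The unrounded read rate is about 324,000 reads/sec, so the bandwidth comes out near 97 MB/s. If a feed request returned a full page of tweets instead of one, the bandwidth figure would scale by the page size.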

Step 3: Core Problem — The Fan-Out

When a user with 10M followers posts a tweet, who sees it and when?

USER A posts a tweet

10 MILLION followers need to see this in their timeline

Option 1: Pull Model (Fan-out on read)
  Store tweet once
  When user opens feed: query all followed users' tweets, merge, sort
  ✅ Storage efficient
  ❌ Timeline generation is SLOW (every feed load queries all followed accounts, then merges and sorts)

Option 2: Push Model (Fan-out on write)
  When tweet is posted: write to every follower's timeline cache
  ✅ Feed is pre-computed, instant read
  ❌ Expensive for celebrities (10M writes per tweet!)

Twitter's Solution: HYBRID
  Regular users: Fan-out on write (fast feed)
  Celebrities (>1M followers): Fan-out on read (avoids millions of timeline writes per tweet)
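
The hybrid rule can be sketched as a small decision function plus a fan-out loop. This is an in-memory illustration, not Twitter's implementation: the `followers` and `timelines` maps stand in for a follower store and the Redis timeline cache, and the function names are hypothetical. The 1M threshold and the 800-entry cap come from this guide.

```javascript
const CELEBRITY_THRESHOLD = 1_000_000; // the ">1M followers" rule above
const TIMELINE_LIMIT = 800;            // per-user timeline cap used later in this guide

// In-memory stand-ins (assumptions): a follower store and a timeline cache.
const followers = new Map(); // authorId -> array of follower ids
const timelines = new Map(); // userId -> array of tweet ids, newest first

function chooseFanOutStrategy(followerCount) {
  return followerCount > CELEBRITY_THRESHOLD ? "read" : "write";
}

// Returns the number of timeline writes performed for this tweet.
function fanOutTweet(authorId, tweetId) {
  const audience = followers.get(authorId) ?? [];
  if (chooseFanOutStrategy(audience.length) === "read") {
    return 0; // celebrity: tweet is stored once and merged in at read time
  }
  for (const followerId of audience) {
    const timeline = timelines.get(followerId) ?? [];
    timeline.unshift(tweetId); // newest first
    timeline.length = Math.min(timeline.length, TIMELINE_LIMIT);
    timelines.set(followerId, timeline);
  }
  return audience.length;
}

followers.set("alice", ["bob", "carol"]);
console.log(fanOutTweet("alice", "tweet-1")); // 2 timeline writes
console.log(timelines.get("bob"));            // [ 'tweet-1' ]
```

The write cost grows linearly with the audience size, which is exactly what the threshold is meant to bound.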

Step 4: High-Level Architecture

Example: Tweet Service Publishing to Kafka

javascript
import { Kafka } from "kafkajs";

const kafka = new Kafka({
  clientId: "tweet-service",
  brokers: ["kafka1:9092", "kafka2:9092"],
});
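const producer = kafka.producer();

// Connect lazily and reuse the connection, instead of reconnecting on every tweet
let producerReady;
function ensureProducerConnected() {
  if (!producerReady) producerReady = producer.connect();
  return producerReady;
}

async function postTweet(userId, content) {
  // 1. Save to Cassandra (abstracted: saveToCassandra is assumed to return the new tweet ID)
  const tweetId = await saveToCassandra(userId, content);

  // 2. Publish event to Kafka for Fan-Out Workers
  await ensureProducerConnected();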
  await producer.send({
    topic: "new-tweets",
    messages: [
      {
        key: String(userId), // Partition by userId
        value: JSON.stringify({
          tweetId,
          userId,
          content,
          timestamp: Date.now(),
        }),
      },
    ],
  });

  return tweetId;
}

Step 5: Tweet Storage Schema

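tweets table (Cassandra - optimized for high writes):
  tweet_id    TIMEUUID (sortable by time)
  user_id     BIGINT
  content     TEXT
  media_urls  LIST<TEXT>
  created_at  TIMESTAMP

tweet_counters table (hypothetical name; Cassandra requires COUNTER columns to live in their own table, apart from regular columns):
  tweet_id    TIMEUUID
  like_count  COUNTER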

user_timeline (Redis - precomputed feed per user):
  Key:   "timeline:{user_id}"
  Value: Sorted Set of tweet_ids (sorted by timestamp)
  Size:  Last 800 tweet IDs per user
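
One way to keep each timeline at 800 entries is to trim the sorted set on every write. The sketch below only builds the Redis commands, so it runs without a server; `buildTimelineCommands` is a hypothetical helper, while `ZADD` and `ZREMRANGEBYRANK` are standard Redis sorted-set commands.

```javascript
const TIMELINE_LIMIT = 800; // "Last 800 tweet IDs per user" from the schema above

// Build the Redis commands that add one tweet to a user's timeline and
// trim the sorted set back to the newest TIMELINE_LIMIT entries.
function buildTimelineCommands(userId, tweetId, timestamp) {
  const key = `timeline:${userId}`;
  return [
    ["zadd", key, timestamp, tweetId],
    // Ranks run from the lowest score (oldest) upward, so removing ranks
    // 0 .. -(LIMIT + 1) keeps only the LIMIT highest-scored members.
    ["zremrangebyrank", key, 0, -(TIMELINE_LIMIT + 1)],
  ];
}

console.log(buildTimelineCommands(42, "tweet-abc", 1700000000000));
```

With ioredis, an array of commands in this shape can be passed to `redis.pipeline(commands).exec()` so both commands travel in a single round trip.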

Step 6: Home Timeline Flow

Example: Timeline Service Cache Logic

javascript
import Redis from "ioredis";

const redis = new Redis();

async function getHomeTimeline(userId) {
  const cacheKey = `timeline:${userId}`;

  // 1. Check Redis for precomputed feed
  // ZREVRANGE to get most recent tweet IDs (highest timestamp score)
  const tweetIds = await redis.zrevrange(cacheKey, 0, 19);

  if (tweetIds && tweetIds.length > 0) {
    console.log("Cache hit! Fetching details...");
    return await fetchTweetDetailsFromCassandra(tweetIds);
  }

  // 2. Cache Miss: Compute on the fly (Fan-out on read)
  console.log("Cache miss! Computing timeline...");
  const followedUsers = await getFollowedUsers(userId);

  // Fetch each followed user's recent tweets concurrently instead of one at a time
  const tweetLists = await Promise.all(
    followedUsers.map((followedUser) => fetchRecentTweets(followedUser)),
  );
  const allTweets = tweetLists.flat();

  // Sort by timestamp descending
  allTweets.sort((a, b) => b.timestamp - a.timestamp);
  const topTweets = allTweets.slice(0, 800);

  // 3. Store back to Redis cache
  const pipeline = redis.pipeline();
  for (const tweet of topTweets) {
    pipeline.zadd(cacheKey, tweet.timestamp, tweet.id);
  }
  await pipeline.exec();

  return topTweets.slice(0, 20); // Return first page
}
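
Under the hybrid model from Step 3, tweets from celebrity accounts are not in the precomputed cache, so the read path has to merge them in. The sketch below shows that merge as a pure function. It assumes tweet objects shaped like `{ id, timestamp }`, as in the code above; `mergeTimeline` is a hypothetical helper name.

```javascript
// Merge the precomputed (fan-out on write) entries with celebrity tweets
// fetched at read time, drop duplicates, and return the newest page.
function mergeTimeline(precomputed, celebrityTweets, pageSize = 20) {
  const seen = new Set();
  const merged = [];
  for (const tweet of [...precomputed, ...celebrityTweets]) {
    if (seen.has(tweet.id)) continue; // the same tweet may arrive from both sources
    seen.add(tweet.id);
    merged.push(tweet);
  }
  merged.sort((a, b) => b.timestamp - a.timestamp); // newest first
  return merged.slice(0, pageSize);
}

const cached = [{ id: "a", timestamp: 300 }, { id: "b", timestamp: 100 }];
const celebrity = [{ id: "c", timestamp: 200 }, { id: "a", timestamp: 300 }];
console.log(mergeTimeline(cached, celebrity).map((t) => t.id)); // [ 'a', 'c', 'b' ]
```

Because a user typically follows only a handful of celebrity accounts, this read-time merge stays cheap compared with a full fan-out on read.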

Step 7: Search

Example: Elasticsearch Indexer Worker

javascript
import { Client } from "@elastic/elasticsearch";
import { Kafka } from "kafkajs";

const esClient = new Client({ node: "http://localhost:9200" });
const kafka = new Kafka({
  clientId: "search-indexer",
  brokers: ["localhost:9092"],
});
const consumer = kafka.consumer({ groupId: "search-indexing-group" });

async function runIndexer() {
  await consumer.connect();
  await consumer.subscribe({ topic: "new-tweets", fromBeginning: false });

  await consumer.run({
    eachMessage: async ({ message }) => {
      const tweet = JSON.parse(message.value.toString());

      // Index into Elasticsearch
      await esClient.index({
        index: "tweets",
        id: tweet.tweetId,
        document: {
          userId: tweet.userId,
          content: tweet.content,
          timestamp: tweet.timestamp,
        },
      });
      console.log(`Indexed tweet ${tweet.tweetId} for search`);
    },
  });
}
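
Once tweets are indexed, a search request might look like the sketch below. It only builds the request object, so it runs without a cluster. `buildTweetSearch` is a hypothetical helper; the index name and the `content` and `timestamp` fields match the indexer above, and sorting assumes `timestamp` is mapped as a numeric or date field.

```javascript
// Build a full-text search request over tweet content, newest first.
function buildTweetSearch(text, page = 0, pageSize = 20) {
  return {
    index: "tweets",
    from: page * pageSize, // offset-based pagination
    size: pageSize,
    query: { match: { content: text } },
    sort: [{ timestamp: { order: "desc" } }],
  };
}

console.log(JSON.stringify(buildTweetSearch("system design", 1), null, 2));
```

With the client from the indexer, the object could be passed as `esClient.search(buildTweetSearch("system design"))`. Offset pagination is the simplest option; deep result pages would need a different strategy.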

Step 8: Notifications

Example: WebSocket Notification Service

javascript
import { WebSocketServer } from "ws";
import { Kafka } from "kafkajs";

const wss = new WebSocketServer({ port: 8080 });
const activeConnections = new Map(); // userId -> WebSocket

wss.on("connection", function connection(ws, req) {
  // Assume userId is authenticated and extracted from req
  const userId = getUserIdFromReq(req);
  activeConnections.set(userId, ws);

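  ws.on("close", () => {
    // Only remove the entry if it still points at this socket, so a stale
    // close event cannot drop a newer connection from the same user
    if (activeConnections.get(userId) === ws) {
      activeConnections.delete(userId);
    }
  });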
});

const kafka = new Kafka({ clientId: "notifier", brokers: ["localhost:9092"] });
const consumer = kafka.consumer({ groupId: "notification-group" });

async function startNotificationConsumer() {
  await consumer.connect();
  await consumer.subscribe({
    topic: "user-notifications",
    fromBeginning: false,
  });

  await consumer.run({
    eachMessage: async ({ message }) => {
      const notification = JSON.parse(message.value.toString());
      const { targetUserId, title, body } = notification;

      // Push in-app notification if user is currently connected
      const ws = activeConnections.get(targetUserId);
      // Skip sockets that are still connecting or already closing
      if (ws && ws.readyState === ws.OPEN) {
        ws.send(JSON.stringify({ type: "NOTIFICATION", title, body }));
      }
    },
  });
}
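
The consumer above silently drops notifications for users who are offline. A possible extension, sketched below under stated assumptions, routes those to a queue for a push service instead. `routeNotification` and `pushQueue` are hypothetical names, and the plain array stands in for whatever queue or topic feeds the push provider.

```javascript
const WS_OPEN = 1; // standard readyState value for an open WebSocket

// Deliver over the live socket when possible, otherwise hand the
// notification to a queue that a push service would drain.
function routeNotification(connections, pushQueue, notification) {
  const { targetUserId, title, body } = notification;
  const socket = connections.get(targetUserId);
  if (socket && socket.readyState === WS_OPEN) {
    socket.send(JSON.stringify({ type: "NOTIFICATION", title, body }));
    return "websocket";
  }
  pushQueue.push(notification);
  return "push";
}

// Demo with a fake socket object in place of a real connection
const demoConnections = new Map([
  ["u1", { readyState: WS_OPEN, send: (msg) => console.log("sent:", msg) }],
]);
const demoQueue = [];
console.log(routeNotification(demoConnections, demoQueue, { targetUserId: "u1", title: "Hi", body: "New like" }));
console.log(routeNotification(demoConnections, demoQueue, { targetUserId: "u2", title: "Hi", body: "New follower" }));
```

Returning the chosen channel makes the routing decision easy to log and to test.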

📊 Summary

Component       Technology
API Gateway     Nginx / Kong
Tweet Storage   Cassandra (write-heavy)
User Data       MySQL (relational)
Feed Cache      Redis Sorted Sets
Fan-out         Kafka + Worker Pool
Search          Elasticsearch
Media           S3 + CDN (CloudFront)
Notifications   Kafka + Push services

Key insight: Twitter's hardest problem is the fan-out. The hybrid model for celebrity accounts is the critical design decision.

Released under the ISC License.