
Instagram System Design: Scaling Media and Feeds

Instagram is a massive-scale social media platform built around photos and videos. Designing a system like it means solving two major challenges: high-volume media storage and efficient feed generation for more than a billion users.


1. Requirements

Functional

  • Post Photos/Videos: Users can upload media with captions.
  • Follow/Unfollow: Users can follow other users.
  • News Feed: Users see a timeline of posts from people they follow.
  • Likes & Comments: Users can interact with posts.
  • Stories: Ephemeral 24-hour posts.

Non-Functional

  • High Availability: The system must stay accessible even when individual components fail.
  • Low Latency: Feed generation and photo viewing must be fast (< 200ms).
  • Scalability: Handle 100k+ uploads per second during peak times.
  • Durability: Photos must never be lost.

2. High-Level Architecture

This design uses a microservices architecture to decouple media processing from social graph management.


3. Technical Deep Dives

A. Media Upload Pipeline

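At Instagram's scale, images are not processed synchronously: the upload request returns as soon as the original is stored, and the expensive work happens in the background.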

  1. Upload: The client sends the photo to the Photo Service, which uploads the original file to S3.
  2. Asynchronous Processing: An S3 event notification enqueues a job (e.g., on SQS or Kafka) that a background worker picks up.
  3. Image Processor:
    • Resizes the image into multiple resolutions (thumbnail, 720p, 1080p).
    • Applies filters if selected.
    • Stores processed versions back in S3.
  4. Metadata Update: Once processed, the worker updates the Metadata DB with the new photo URLs.
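The processing step above can be sketched as a pure planning function that a worker might run before resizing. This is a minimal illustration, not the actual pipeline: the rendition names, the "longest edge" interpretation of 720p/1080p, the 150-pixel thumbnail, and the `processed/<photoId>/...` key layout are all assumptions made for the example.

```typescript
// Hypothetical rendition targets: maxEdge is the longest allowed side in pixels.
interface RenditionSpec {
  name: string;
  maxEdge: number;
}

interface Rendition {
  name: string;
  width: number;
  height: number;
  key: string; // object key the worker would write back to storage
}

const RENDITIONS: RenditionSpec[] = [
  { name: "thumbnail", maxEdge: 150 },
  { name: "720p", maxEdge: 720 },
  { name: "1080p", maxEdge: 1080 },
];

// Scale so the longest edge fits maxEdge, keeping the aspect ratio.
// Images that are already small enough are never upscaled.
function fitWithin(width: number, height: number, maxEdge: number): { width: number; height: number } {
  const longest = Math.max(width, height);
  if (longest <= maxEdge) {
    return { width, height };
  }
  return {
    width: Math.round((width * maxEdge) / longest),
    height: Math.round((height * maxEdge) / longest),
  };
}

// Produce the list of outputs for one uploaded photo. The returned keys are
// what the worker would later record in the Metadata DB.
function planRenditions(photoId: string, width: number, height: number): Rendition[] {
  return RENDITIONS.map((spec) => {
    const size = fitWithin(width, height, spec.maxEdge);
    return {
      name: spec.name,
      width: size.width,
      height: size.height,
      key: `processed/${photoId}/${spec.name}.jpg`,
    };
  });
}
```

Keeping the planning separate from the actual pixel work makes the worker easy to retry: if a job is delivered twice, it computes the same keys and simply overwrites the same objects.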

B. News Feed: The Hybrid Fan-out Model

Generating feeds for more than a billion users is the hardest problem in this design. We use two models, chosen by how many followers the author of a post has:

  1. Push Model (Fan-out on Write):

    • Used for "Regular" users.
    • When you post, we push your post ID into the pre-computed feed caches (Redis) of all your followers.
    • Pro: Viewing the feed is extremely fast (just a Redis read).
    • Con: Inefficient for celebrities with 50M+ followers.
  2. Pull Model (Fan-out on Load):

    • Used for "Celebrities" (e.g., Cristiano Ronaldo).
    • We do not push a celebrity's posts into hundreds of millions of feed caches on every upload.
    • Instead, when a follower loads their feed, we "pull" the latest posts from any celebrities they follow and merge them into the feed.
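The write side of the hybrid model can be sketched as follows. This is an illustrative in-memory version under stated assumptions: the `FanoutService` class, its method names, and the use of plain `Map`s in place of Redis feed caches are hypothetical, and the celebrity threshold is passed in by the caller rather than taken from any real configuration.

```typescript
type UserId = string;
type PostId = string;

class FanoutService {
  // In-memory stand-ins for the per-follower feed caches and per-author post lists.
  private feeds = new Map<UserId, PostId[]>();
  private authorPosts = new Map<UserId, PostId[]>();
  private followers: Map<UserId, UserId[]>;
  private celebrityThreshold: number;
  private maxFeedLength: number;

  constructor(followers: Map<UserId, UserId[]>, celebrityThreshold: number, maxFeedLength = 500) {
    this.followers = followers;
    this.celebrityThreshold = celebrityThreshold;
    this.maxFeedLength = maxFeedLength;
  }

  isCelebrity(author: UserId): boolean {
    return (this.followers.get(author) ?? []).length >= this.celebrityThreshold;
  }

  // Records the post, then fans it out only if the author is a regular user.
  // Returns which model handled the post.
  publish(author: UserId, postId: PostId): "push" | "pull" {
    const posts = this.authorPosts.get(author) ?? [];
    posts.unshift(postId); // newest first
    this.authorPosts.set(author, posts);

    if (this.isCelebrity(author)) {
      return "pull"; // followers fetch these posts when they load their feed
    }
    for (const follower of this.followers.get(author) ?? []) {
      const feed = this.feeds.get(follower) ?? [];
      feed.unshift(postId);
      if (feed.length > this.maxFeedLength) {
        feed.length = this.maxFeedLength; // cap the cached feed
      }
      this.feeds.set(follower, feed);
    }
    return "push";
  }

  // Read path, part 1: the pre-computed feed.
  cachedFeed(user: UserId): PostId[] {
    return this.feeds.get(user) ?? [];
  }

  // Read path, part 2: what a follower pulls for each celebrity they follow.
  latestPosts(author: UserId, limit: number): PostId[] {
    return (this.authorPosts.get(author) ?? []).slice(0, limit);
  }
}
```

The cost of `publish` for a regular user grows with the follower count, which is exactly why the threshold exists: above it, one cheap write replaces millions of cache updates, and the work moves to the read path.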

C. Sharding Strategy: Scalable Metadata

Instagram uses PostgreSQL but scales it through custom sharding.

  • Data is sharded by user_id.
  • Every physical database server hosts many "logical shards". As volume grows, a logical shard can be moved to a new server without re-bucketing any data.
  • ID Generation: A custom Snowflake-like ID generator ensures unique IDs across shards without a central bottleneck.
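A rough sketch of how logical shards and Snowflake-style IDs might fit together is shown below. The constants are assumptions for illustration: 4096 logical shards, a custom epoch of 1 January 2011, and a 64-bit layout of 41 bits of milliseconds, 13 bits of logical shard ID, and 10 bits of per-shard sequence. The function names are hypothetical.

```typescript
const LOGICAL_SHARDS = 4096; // assumed count; must fit in 13 bits (at most 8192)
const CUSTOM_EPOCH_MS = Date.UTC(2011, 0, 1); // assumed custom epoch

// A user's data always lives in the same logical shard.
function logicalShardFor(userId: number): number {
  return userId % LOGICAL_SHARDS;
}

// Lookup table from logical shard to physical server. Migrating a logical
// shard means copying its data and updating one entry in this table.
function buildAssignment(servers: string[]): string[] {
  return Array.from({ length: LOGICAL_SHARDS }, (_, shard) => servers[shard % servers.length]);
}

// Layout (high to low bits): 41 bits timestamp | 13 bits shard | 10 bits sequence.
// BigInt is used because the result does not fit in a JavaScript number.
function makeId(nowMs: number, logicalShard: number, sequence: number): bigint {
  const timestamp = BigInt(nowMs - CUSTOM_EPOCH_MS);
  return (
    (timestamp << BigInt(23)) |
    (BigInt(logicalShard) << BigInt(10)) |
    BigInt(sequence % 1024)
  );
}

// The shard can be recovered from the ID alone, so a lookup by post ID
// needs no separate directory service.
function shardOfId(id: bigint): number {
  return Number((id >> BigInt(10)) & BigInt(8191));
}
```

Because the timestamp occupies the high bits, IDs sort roughly by creation time, and because each logical shard keeps its own sequence, no central coordinator is needed to hand out IDs.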

4. Implementation Example: Feed Service

This example demonstrates how a Feed Service might merge the pre-computed feed with celebrity posts pulled at read time.

typescript
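import { Redis } from "ioredis";

interface Post {
  id: string;
  userId: string;
  mediaUrl: string;
  timestamp: number;
}

// Collaborators used by the service. Their shapes are assumptions made
// for this example; real services would expose richer interfaces.
interface FollowService {
  getFollowedCelebrities(userId: string): Promise<string[]>;
}

interface PostService {
  getLatestPosts(userId: string, limit: number): Promise<Post[]>;
}

interface PostCache {
  getMany(ids: string[]): Promise<Post[]>;
}

class FeedService {
  // Dependencies are injected so each one can be replaced in tests.
  constructor(
    private readonly redis: Redis,
    private readonly followService: FollowService,
    private readonly postService: PostService,
    private readonly postCache: PostCache,
  ) {}

  /**
   * Retrieves the combined feed for a user
   */
  async getFeed(userId: string): Promise<Post[]> {
    // 1. Fetch pre-computed feed IDs from Redis (Regular users' posts)
    const regularPostIds = await this.redis.lrange(`feed:${userId}`, 0, 49);

    // 2. Identify Celebrities the user follows
    const celebrities = await this.followService.getFollowedCelebrities(userId);

    // 3. Pull latest posts from Celebrities (Fan-out on Load)
    const celebrityPosts = await Promise.all(
      celebrities.map((celebId) => this.postService.getLatestPosts(celebId, 5))
    );

    // 4. Merge and sort by timestamp, newest first
    const allPosts = [
      ...(await this.getPostsByIds(regularPostIds)),
      ...celebrityPosts.flat(),
    ];
    return allPosts.sort((a, b) => b.timestamp - a.timestamp);
  }

  private async getPostsByIds(ids: string[]): Promise<Post[]> {
    // Fetch full post objects from cache or DB
    return this.postCache.getMany(ids);
  }
}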

5. Summary: Key Trade-offs

Feature | Design Choice | Trade-off
Feed | Hybrid Push/Pull | Pulling celebrity posts at read time avoids fanning each post out to millions of feed caches, but adds work and complexity to feed retrieval.
Consistency | Eventual Consistency | Likes/Comments might be out of sync for a few seconds; this is accepted in exchange for high availability.
Storage | S3 + CDN | Increases infrastructure cost but ensures low-latency viewing worldwide.

Released under the ISC License.