
🎬 Netflix System Design & Architecture

A deep-dive into how Netflix serves 260M+ subscribers across 190 countries, streaming ~15% of global internet traffic at peak hours.


📐 Table of Contents

  1. High-Level Architecture Overview
  2. Client Layer
  3. API Gateway & Load Balancing
  4. Microservices Architecture
  5. Video Encoding Pipeline
  6. Content Delivery Network (Open Connect)
  7. Data Storage Strategy
  8. Recommendation Engine
  9. Real-Time Streaming Data Pipeline
  10. Fault Tolerance & Resilience
  11. Security Architecture
  12. Capacity & Scale Estimates
  13. Complete System Flow
  14. Technology Stack Summary

1. High-Level Architecture Overview

Netflix is a cloud-native platform whose backend runs on AWS as a set of microservices, while the video itself is delivered from Netflix's own CDN. The system is split into three major zones:

  • Client Zone — Apps (TV, mobile, web, game consoles)
  • Backend Zone — AWS-hosted microservices + data stores
  • CDN Zone — Open Connect Appliances (OCA) deployed at ISPs globally

2. Client Layer

Netflix supports 2000+ device types. Each client handles adaptive streaming differently but follows a common protocol.

Key Responsibilities

Component    | Role
-------------+-----------------------------------------------------------
Player SDK   | Adaptive bitrate (ABR) switching using custom algorithms
DASH / HLS   | Streaming protocols for video delivery
DRM Client   | Widevine (Android), FairPlay (Apple), PlayReady (Microsoft)
Pre-fetching | Downloads next episode proactively on Wi-Fi

Example: Adaptive Bitrate Logic

User starts stream at 4K (15 Mbps)
  └─> Network drops to 5 Mbps
        └─> Player detects buffer under-run
              └─> Switches to 720p (2.8 Mbps) segment, the highest rung that fits in 5 Mbps
                    └─> Network recovers
                          └─> Gradually steps back up to 4K
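
The step-down and gradual step-up behaviour above can be sketched as a throughput-based rung selector. This is a minimal illustration, not Netflix's actual ABR algorithm: the `Rung` type, the `pick_rung` function, the 0.8 safety factor, and the one-rung-at-a-time step-up rule are all assumptions made for the example. The bitrates are the ones listed in the example ladder in section 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rung:
    label: str
    mbps: float


# Bitrates copied from the example encoding ladder in section 5.
LADDER = [
    Rung("240p", 0.235),
    Rung("360p", 0.560),
    Rung("480p", 1.050),
    Rung("720p", 2.800),
    Rung("1080p", 5.800),
    Rung("4K", 15.600),
]


def pick_rung(throughput_mbps: float, current: int, safety: float = 0.8) -> int:
    """Return the ladder index to use for the next segment.

    `safety` (an assumed value) keeps headroom below the measured throughput.
    """
    budget = throughput_mbps * safety
    best = 0  # the lowest rung acts as a floor
    for i, rung in enumerate(LADDER):
        if rung.mbps <= budget:
            best = i
    if best < current:
        return best          # step down at once so the buffer stops draining
    if best > current:
        return current + 1   # step up one rung at a time
    return current


idx = 5  # start at 4K
for measured in (20.0, 5.0, 25.0, 25.0):
    idx = pick_rung(measured, idx)
    print(f"{measured:>5} Mbps -> {LADDER[idx].label}")
```

Stepping down immediately but up only one rung per segment is one common way to avoid oscillating between qualities when throughput is noisy.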

3. API Gateway & Load Balancing

Netflix open-sourced much of its infrastructure stack.

Netflix OSS Stack

Tool     | Function
---------+--------------------------------------------
Zuul     | Edge proxy, routing, auth, rate limiting
Eureka   | Service discovery registry
Ribbon   | Client-side load balancing
Hystrix  | Circuit breaker pattern
Archaius | Dynamic property management
Feign    | Declarative HTTP client

Example Flow — User Clicks "Play"

1. Client sends GET /playback?contentId=tt1234 to Zuul
2. Zuul validates JWT token with Auth Service
3. Zuul looks up Playback Service instances in Eureka
4. Ribbon selects healthy instance (least connections)
5. Playback Service fetches manifest from Content Metadata
6. Returns CDN URLs for video chunks → Client starts streaming
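
Steps 3 and 4 (registry lookup, then client-side selection of a healthy instance) can be sketched as follows. This is an illustration only and does not use the Eureka or Ribbon APIs: `Instance`, `choose_instance`, and the registry contents are hypothetical names and data.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    host: str
    healthy: bool = True
    active_connections: int = 0


def choose_instance(registry, service):
    """Pick the healthy instance of `service` with the fewest open connections."""
    candidates = [i for i in registry.get(service, []) if i.healthy]
    if not candidates:
        raise LookupError(f"no healthy instance registered for {service!r}")
    return min(candidates, key=lambda i: i.active_connections)


# Hypothetical registry contents; a real client would fetch these from
# the discovery service rather than hardcode them.
registry = {
    "playback-service": [
        Instance("10.0.0.1", healthy=True, active_connections=12),
        Instance("10.0.0.2", healthy=False, active_connections=0),
        Instance("10.0.0.3", healthy=True, active_connections=4),
    ]
}

print(choose_instance(registry, "playback-service").host)  # prints 10.0.0.3
```

The unhealthy instance is skipped even though it has the fewest connections, which is why health filtering comes before the load comparison.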

4. Microservices Architecture

Each Netflix service owns its data and is independently deployable.

Decomposition Strategy

Netflix uses domain-driven design to split services:

netflix-services/
├── user-service/         # Signup, login, profile CRUD
├── billing-service/      # Subscriptions, payments (Stripe)
├── content-service/      # Titles, metadata, thumbnails
├── recommendation-svc/   # ML-based recommendations
├── search-service/       # Elasticsearch-backed search
├── playback-service/     # Streaming manifest, DRM
├── encoding-service/     # Video transcoding pipeline
└── notification-service/ # Email, push, in-app alerts

5. Video Encoding Pipeline

Netflix encodes every title into roughly 1,200 variants to cover different devices, resolutions, and codecs.

Encoding Details

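Codec        | Use Case                          | Compression
-------------+-----------------------------------+--------------------------------
H.264 (AVC)  | Legacy devices, wide compatibility | Baseline (reference)
H.265 (HEVC) | 4K HDR content                    | ~40% lower bitrate than H.264
VP9          | Android, Chrome                   | ~35% lower bitrate than H.264
AV1          | New smart TVs, future-proof       | ~50% lower bitrate than H.264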

Per-Encoding Profiles (example for 1 title)

Resolution   Bitrate       Codec    File Size (2 hr)
----------------------------------------------------
240p         0.235 Mbps    H.264    ~210 MB
360p         0.560 Mbps    H.264    ~500 MB
480p         1.050 Mbps    H.264    ~945 MB
720p         2.800 Mbps    H.264    ~2.5 GB
1080p        5.800 Mbps    H.265    ~5.2 GB
4K HDR      15.600 Mbps    AV1      ~14.0 GB
----------------------------------------------------
File size = bitrate × duration ÷ 8. HEVC and AV1 savings appear as a lower
bitrate for the same quality, so they are already reflected in the Bitrate column.
Total for these six profiles of a 2hr movie: ~23.4 GB; the full set of variants per title is larger.
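
As a cross-check on the sizes, a file's size is approximately its bitrate multiplied by its duration, divided by 8 to convert bits to bytes. The sketch below applies that to the bitrates listed above for a 2-hour title. The function name and the use of decimal units (1 GB = 1,000 MB) are choices made for this example, and container and audio overhead are ignored.

```python
def file_size_gb(bitrate_mbps: float, duration_s: float) -> float:
    """Approximate file size in decimal GB: megabits -> megabytes -> gigabytes."""
    return bitrate_mbps * duration_s / 8 / 1000


# Bitrates (Mbps) from the example profile table.
PROFILES = {
    "240p": 0.235,
    "360p": 0.560,
    "480p": 1.050,
    "720p": 2.800,
    "1080p": 5.800,
    "4K HDR": 15.600,
}

TWO_HOURS_S = 2 * 60 * 60

total = 0.0
for label, mbps in PROFILES.items():
    size = file_size_gb(mbps, TWO_HOURS_S)
    total += size
    print(f"{label:<7} {size:6.2f} GB")
print(f"total   {total:6.2f} GB")
```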

6. Content Delivery Network (Open Connect)

Netflix built its own CDN, Open Connect, instead of relying on commercial CDNs such as Akamai.

How Steering Works

User clicks Play in New York:
1. DNS query → Netflix Steering Service
2. Netflix checks: which OCA is closest to user's IP?
3. Also checks: what's the OCA's load and cache hit rate?
4. Returns CDN URL pointing to nearest healthy OCA
5. Client fetches video chunks from OCA directly

If OCA Cache Miss:
OCA → IXP cluster → Origin (S3) → fill OCA cache → serve user
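
The selection in steps 2 to 4 can be sketched as a ranking over candidate OCAs. This is a simplified illustration under stated assumptions: the `Oca` fields, the `steer` function, the 0.9 load cutoff, and the use of a plain distance figure as a proxy for network proximity are all hypothetical, and the real steering logic is not described in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oca:
    name: str
    distance_km: float   # assumed proxy for network proximity
    load: float          # 0.0 (idle) to 1.0 (saturated)
    healthy: bool
    has_title: bool      # True if the requested title is already cached


def steer(ocas, max_load=0.9):
    """Return the preferred OCA, or None if no appliance is usable."""
    usable = [o for o in ocas if o.healthy and o.load < max_load]
    if not usable:
        return None
    # Prefer a cache hit, then the closest appliance, then the least loaded.
    return min(usable, key=lambda o: (not o.has_title, o.distance_km, o.load))


candidates = [
    Oca("isp-nyc", distance_km=5, load=0.95, healthy=True, has_title=True),
    Oca("isp-nj", distance_km=20, load=0.40, healthy=True, has_title=True),
    Oca("ixp-nyc", distance_km=8, load=0.30, healthy=True, has_title=False),
]

print(steer(candidates).name)  # prints isp-nj
```

Here the nearest appliance is skipped because it is overloaded, and the IXP appliance ranks last because serving from it would require a cache fill first.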

Open Connect Appliances (OCAs)

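Tier  | Storage hardware | Cache role            | Location
------+------------------+-----------------------+-----------------
Small | 36TB HDD         | Cache popular content | ISP co-location
Large | 100TB HDD + SSD  | Cache entire catalog  | IXP data centers
Flash | All-NVMe SSD     | Ultra-low latency     | Top-tier ISPs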

7. Data Storage Strategy

Netflix uses polyglot persistence: each workload gets the data store that fits its access pattern and consistency needs.

Database Selection Rationale

Database      | Use Case                     | Why?
--------------+------------------------------+----------------------------------------------------------
MySQL         | Billing, user accounts       | ACID, strong consistency needed
Cassandra     | Viewing history, ratings     | Write-heavy, globally distributed, eventual consistency OK
Redis         | Sessions, rate limit counters | Sub-millisecond reads, TTL support
Elasticsearch | Title search, autocomplete   | Full-text search, faceting
S3            | Video files, thumbnails      | Virtually unlimited storage, cheap, durable
Kafka         | Event streaming              | High-throughput, durable log

Cassandra Data Model Example

Table: viewing_history
Partition Key: user_id
Clustering Key: watched_at (DESC)

user_id     | watched_at           | content_id | progress_pct | device
------------+----------------------+------------+--------------+--------
usr_123     | 2024-01-15 20:30:00  | tt0944947  | 45%          | TV
usr_123     | 2024-01-14 21:00:00  | tt0903747  | 100%         | Mobile
usr_123     | 2024-01-13 19:45:00  | tt1254207  | 23%          | Web
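
The point of this layout is that all rows for one user live in a single partition, already sorted newest-first, so "latest N titles for a user" is a single-partition read. The sketch below mimics that behaviour in memory. It is an analogy only, not Cassandra client code, and the `ViewingHistory` class and its method names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


class ViewingHistory:
    """In-memory analogy: partition key = user_id, clustering = watched_at DESC."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def record(self, user_id, watched_at, content_id, progress_pct, device):
        rows = self._partitions[user_id]
        rows.append({
            "watched_at": watched_at,
            "content_id": content_id,
            "progress_pct": progress_pct,
            "device": device,
        })
        # Keep the partition ordered newest-first, like the clustering key.
        rows.sort(key=lambda r: r["watched_at"], reverse=True)

    def recent(self, user_id, limit=10):
        """Read the first `limit` rows of one partition; no cross-user scan."""
        return self._partitions.get(user_id, [])[:limit]


history = ViewingHistory()
history.record("usr_123", datetime(2024, 1, 13, 19, 45), "tt1254207", 23, "Web")
history.record("usr_123", datetime(2024, 1, 15, 20, 30), "tt0944947", 45, "TV")
history.record("usr_123", datetime(2024, 1, 14, 21, 0), "tt0903747", 100, "Mobile")

for row in history.recent("usr_123", limit=2):
    print(row["watched_at"], row["content_id"])
```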

8. Recommendation Engine

Netflix's recommendation system drives ~80% of content watched.

Recommendation Algorithms

Algorithm                 | Description                  | Example
--------------------------+------------------------------+-----------------------------------------------------------
Collaborative Filtering   | Users with similar taste     | "Users who watched Stranger Things also liked Dark"
Content-Based             | Match content attributes     | Genre, actors, director, tone
Matrix Factorization      | Latent factor decomposition  | Finds hidden preference patterns
Contextual Bandits        | Real-time exploration        | Tests new content on similar users
Thumbnail Personalization | Shows best image per user    | Action fan sees action shot vs. romance fan sees love scene

Thumbnail A/B Test Example

Title: "The Crown"
├── Thumbnail A: Queen Elizabeth II portrait  → CTR: 3.2%
├── Thumbnail B: Scene of drama/conflict      → CTR: 4.8%
└── Thumbnail C: Family sitting together      → CTR: 2.9%

Winner: Thumbnail B → served to all users (until next test)
Personalization: History + period drama fans get Thumbnail A
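
The selection rule in this example (a global winner by click-through rate, with a per-segment override) can be sketched as below. The impression and click counts are made-up numbers chosen to reproduce the CTRs shown above, and the function names and the `period-drama` segment label are hypothetical.

```python
def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate; 0.0 when there are no impressions yet."""
    return clicks / impressions if impressions else 0.0


def global_winner(stats):
    """Variant with the highest CTR across all users."""
    return max(stats, key=lambda variant: ctr(*stats[variant]))


def thumbnail_for(user_segments, stats, overrides):
    """Use a segment-specific thumbnail if one applies, else the global winner."""
    for segment, variant in overrides.items():
        if segment in user_segments:
            return variant
    return global_winner(stats)


# Hypothetical (clicks, impressions) giving CTRs of 3.2%, 4.8%, and 2.9%.
stats = {"A": (320, 10_000), "B": (480, 10_000), "C": (290, 10_000)}
overrides = {"period-drama": "A"}

print(thumbnail_for({"action"}, stats, overrides))        # prints B
print(thumbnail_for({"period-drama"}, stats, overrides))  # prints A
```

A production test would also check statistical significance before declaring a winner; that step is left out here.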

9. Real-Time Streaming Data Pipeline

Netflix processes ~700 billion events per day using a unified streaming platform called Keystone.

Event Types & Volume

Event Type           | Approx Volume/Day | Use Case
---------------------+-------------------+-------------------------
play_start           | 50M               | Playback analytics
buffer_event         | 200M              | CDN quality monitoring
view_segment         | 400M              | Progress tracking
search_query         | 20M               | Search improvement
thumbnail_impression | 5B                | A/B test measurement
payment_event        | 2M                | Billing reconciliation

10. Fault Tolerance & Resilience

Netflix pioneered chaos engineering and built tools such as Chaos Monkey to validate resilience under failure.

Circuit Breaker Example

Recommendation Service calls ML Model Service:

CLOSED state (normal):
  Request → ML Model → Success → Return recommendations

OPEN state (ML Model failing >50% in 10s):
  Request → Circuit OPEN → Return cached/fallback recommendations
  (Fallback: "Popular in your region" list)

HALF-OPEN state (after 30s):
  1 probe request → ML Model →
  Success? → CLOSE circuit
  Fail?    → OPEN again for another 30s
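
The three states above can be sketched as a small single-threaded state machine. This is an illustration of the pattern, not the Hystrix API: the class, its parameter names, the `min_calls` guard, and the injectable `clock` are assumptions added for the example. The 50% threshold, 10 s window, and 30 s open period come from the description above.

```python
import time
from collections import deque


class CircuitBreaker:
    def __init__(self, failure_ratio=0.5, window_s=10.0, open_s=30.0,
                 min_calls=4, clock=time.monotonic):
        self._failure_ratio = failure_ratio
        self._window_s = window_s
        self._open_s = open_s
        self._min_calls = min_calls      # assumed guard against tiny samples
        self._clock = clock
        self._results = deque()          # (timestamp, succeeded) pairs
        self._opened_at = 0.0
        self.state = "CLOSED"

    def call(self, fn, fallback):
        now = self._clock()
        if self.state == "OPEN":
            if now - self._opened_at < self._open_s:
                return fallback()        # short-circuit: do not call fn
            self.state = "HALF_OPEN"     # open period elapsed, allow a probe
        try:
            result = fn()
        except Exception:
            self._on_failure(now)
            return fallback()
        self._on_success(now)
        return result

    def _on_success(self, now):
        if self.state == "HALF_OPEN":
            self.state = "CLOSED"        # probe succeeded
            self._results.clear()
        self._results.append((now, True))

    def _on_failure(self, now):
        if self.state == "HALF_OPEN":
            self._trip(now)              # probe failed, reopen
            return
        self._results.append((now, False))
        # Drop results that fell out of the rolling window.
        while self._results and now - self._results[0][0] > self._window_s:
            self._results.popleft()
        failures = sum(1 for _, ok in self._results if not ok)
        if (len(self._results) >= self._min_calls
                and failures / len(self._results) > self._failure_ratio):
            self._trip(now)

    def _trip(self, now):
        self.state = "OPEN"
        self._opened_at = now
        self._results.clear()
```

A caller would wrap the remote call, for example `breaker.call(fetch_personalized, popular_in_region)`, where both function names are placeholders for the primary call and its fallback.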

Fallback Strategy Hierarchy

1. Primary: Personalized ML recommendations
   ↓ (if fails)
2. Secondary: Pre-computed popular list per genre
   ↓ (if fails)
3. Tertiary: Global trending (cached in Redis, 5 min TTL)
   ↓ (if fails)
4. Static: Hardcoded editorial picks (always works)
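
The hierarchy above amounts to trying each source in order and returning the first one that yields results. A minimal sketch, with hypothetical function and source names:

```python
def recommend(sources):
    """Try (name, fetch) pairs in order; return the first non-empty result."""
    for name, fetch in sources:
        try:
            items = fetch()
        except Exception:
            continue                 # this tier failed, fall through to the next
        if items:
            return name, items
    raise RuntimeError("every source failed, including the static list")


def personalized():
    raise TimeoutError("ML model service unavailable")   # simulated outage


def popular_by_genre():
    return []                                            # simulated empty result


def global_trending():
    return ["title-1", "title-2"]


def editorial_picks():
    return ["editor-pick-1"]                             # hardcoded, cannot fail


tiers = [
    ("primary", personalized),
    ("secondary", popular_by_genre),
    ("tertiary", global_trending),
    ("static", editorial_picks),
]

print(recommend(tiers))  # prints ('tertiary', ['title-1', 'title-2'])
```

Treating an empty list the same as a failure is a design choice in this sketch; it keeps the user from seeing an empty row when a lower tier still has content.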

11. Security Architecture


12. Capacity & Scale Estimates

Traffic Assumptions

Users:           260 million subscribers
Daily active:    ~130 million (50% of subscribers)
Peak concurrent: ~20 million streams at once
Average bitrate: 5 Mbps (mix of 1080p / 4K)

Bandwidth Calculation

Peak bandwidth = 20M streams × 5 Mbps
              = 100,000 Gbps
              = 100 Tbps (at peak)

Netflix = ~15% of global internet traffic at peak
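
The unit conversion in the bandwidth estimate can be checked with a few lines. The function name is hypothetical; the inputs are the traffic assumptions listed above.

```python
def peak_bandwidth_tbps(concurrent_streams: int, avg_bitrate_mbps: float) -> float:
    """Aggregate bandwidth in Tbps (1 Tbps = 1,000,000 Mbps)."""
    return concurrent_streams * avg_bitrate_mbps / 1_000_000


tbps = peak_bandwidth_tbps(20_000_000, 5)
print(f"{tbps:.0f} Tbps = {tbps * 1000:,.0f} Gbps")
```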

Storage Calculation

Catalog size:        ~36,000 titles
Encoding variants:   ~1,200 per title
Average variant:     ~1.5 GB
Total storage:       36,000 × 1,200 × 1.5 GB
                   = ~65 Petabytes (video only)

User data (history, profiles): ~5 PB
Event logs (Kafka + S3):       ~20 PB/year
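
The video storage figure follows directly from the three inputs above. A short check, using decimal units (1 PB = 1,000,000 GB) and a hypothetical function name:

```python
def catalog_storage_pb(titles: int, variants_per_title: int, avg_variant_gb: float) -> float:
    """Total video storage in decimal petabytes."""
    return titles * variants_per_title * avg_variant_gb / 1_000_000


video_pb = catalog_storage_pb(36_000, 1_200, 1.5)
print(f"{video_pb:.1f} PB")
```

The estimate covers a single copy of each file; replication across regions or onto CDN appliances would multiply it.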

Request Rates

API Gateway (Zuul):    ~2 million requests/sec (peak)
Kafka events:          ~8 million events/sec
Redis operations:      ~50 million ops/sec
Cassandra reads:       ~20 million reads/sec
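
The Kafka rate can be cross-checked against the daily event volume quoted in section 9 (~700 billion events per day) by converting it to an average per-second rate. The function name is hypothetical, and the result is a daily average rather than a peak.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def avg_events_per_sec(events_per_day: float) -> float:
    """Average event rate implied by a daily total."""
    return events_per_day / SECONDS_PER_DAY


rate = avg_events_per_sec(700e9)
print(f"{rate / 1e6:.1f} million events/sec on average")
```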

13. Complete System Flow: "User Presses Play"


14. Technology Stack Summary

Category          | Technologies
------------------+------------------------------------------
Cloud             | AWS (EC2, S3, RDS, Lambda, ECS)
API Gateway       | Zuul 2.0
Service Discovery | Eureka
Load Balancing    | Ribbon, AWS ALB
Circuit Breaker   | Hystrix, Resilience4j
Messaging         | Apache Kafka (Keystone)
Stream Processing | Apache Flink, Spark Streaming
Batch Processing  | Apache Spark, Hive
Databases         | Cassandra, MySQL, CockroachDB
Cache             | Redis, Memcached, EVCache
Search            | Elasticsearch
CDN               | Open Connect (proprietary)
Video Codecs      | H.264, H.265, VP9, AV1
DRM               | Widevine, FairPlay, PlayReady
Monitoring        | Atlas, Spectator, Mantis
Chaos Eng         | Simian Army (Chaos Monkey, etc.)
Languages         | Java, Python, JavaScript, Go
ML/AI             | TensorFlow, PyTorch (recommendations)
Container         | Docker, Titus (Netflix's own orchestrator)

Sources: Netflix Tech Blog (netflixtechblog.com), AWS re:Invent sessions, Netflix OSS GitHub repositories.

Released under the ISC License.