Here is the system design for YouTube, a global video sharing platform.
1. Problem Definition and Scope
We are designing a large-scale video streaming platform where users can upload content and watch videos on demand. The system must support massive storage ingestion and low-latency global streaming.
- Main User Groups:
  - Creators: Users who upload raw videos and manage channel metadata.
  - Viewers: Users who search for and watch videos (the vast majority).
- Scope:
  - In Scope: Video Upload (ingestion), Video Processing (transcoding), Streaming (playback), Search, and Metadata (likes, views).
  - Out of Scope: Live streaming, complex recommendation algorithms (Home Feed), monetization/ads, and social graph features (subscriptions/comments).
2. Clarify functional requirements
Must Have:
- Upload Video: Users can upload raw video files. The system processes them to be playable on any device.
- Stream Video: Users can watch videos with minimal buffering.
- Adaptive Quality: Video quality (e.g., 360p, 720p, 4K) automatically adjusts based on the user's internet speed.
- Search: Users can find videos by title or tags.
- Stats: Users can like videos and see view counts.
Nice to Have:
- Resumable Uploads: If an upload drops, it can resume from where it stopped.
- Watch History: Users can resume videos from where they left off.
3. Clarify non-functional requirements
- Scale: Support 1 Billion Daily Active Users (DAU).
- Read vs. Write: Extremely read-heavy. Ratio is approx 1:1000 (1 upload for every 1000 views).
- Latency:
  - Uploads: High latency is acceptable (minutes). It is an async process.
  - Streaming: Low latency is critical. Time To First Byte (TTFB) should be < 200ms.
- Availability: 99.99%. Playback is the core product and must never fail.
- Consistency:
  - Metadata: Strong consistency for the uploader (they see their video immediately).
  - View Counts: Eventual consistency (it is acceptable if the view count lags by a few minutes).
- Data Retention: Videos are stored forever unless deleted by the creator.
4. Back of the envelope estimates
- Storage (Writes):
  - Metric: 500 hours of video uploaded per minute (typical YouTube scale).
  - 500 hours/min × 60 min/hour × 24 hours/day = 720,000 hours of video per day.
  - Assume 1 hour of raw video ≈ 1 GB (compressed raw).
  - Raw Storage: ≈ 720 TB/day.
  - Transcoding: We store multiple versions (360p, 720p, 1080p, 4K). This multiplies storage by ~3x.
  - Total New Storage: ≈ 2 PB (Petabytes) per day.
- Bandwidth (Reads):
  - Metric: 5 Billion views per day.
  - Assume average data per view is 100 MB (mix of mobile/desktop).
  - Total Egress: 5 Billion × 100 MB = 500 PB/day.
  - Throughput: 500 PB / 86,400 seconds ≈ 5.8 TB/s (Terabytes per second).
  - Conclusion: This traffic volume requires a massive, global Content Delivery Network (CDN). We cannot serve it from a single origin data center.
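As a sanity check, the arithmetic above can be scripted. This is a minimal sketch; every input is one of the assumptions stated in this section (500 hours uploaded per minute, ~1 GB per raw hour, ~3x transcoding overhead, 5 billion views at ~100 MB each), not a measured value.

```python
# Back-of-the-envelope check of the numbers above.
HOURS_UPLOADED_PER_MIN = 500
GB_PER_RAW_HOUR = 1                 # assumed: ~1 GB per hour of compressed raw video
TRANSCODE_MULTIPLIER = 3            # assumed: all renditions together ~3x the raw size
VIEWS_PER_DAY = 5_000_000_000
MB_PER_VIEW = 100                   # assumed average across mobile/desktop

raw_hours_per_day = HOURS_UPLOADED_PER_MIN * 60 * 24              # 720,000 hours/day
raw_tb_per_day = raw_hours_per_day * GB_PER_RAW_HOUR / 1_000      # ~720 TB/day
total_pb_per_day = raw_tb_per_day * TRANSCODE_MULTIPLIER / 1_000  # ~2.2 PB/day

egress_pb_per_day = VIEWS_PER_DAY * MB_PER_VIEW / 1_000_000_000   # 500 PB/day
egress_tb_per_sec = egress_pb_per_day * 1_000 / 86_400            # ~5.8 TB/s

print(f"new storage: ~{total_pb_per_day:.1f} PB/day, egress: ~{egress_tb_per_sec:.1f} TB/s")
```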
5. API design
We will use a REST API. Note that video binary data is not sent to the API server directly.
1. Initiate Upload
- Endpoint: POST /videos/init
- Request: {"title": "My Cat", "size": 1048576, "format": "mp4"}
- Response: { "video_id": "vid_123", "upload_url": "https://s3-upload-signed-url..." }
- Note: Returns a Presigned URL so the client can upload the file directly to cloud storage.
2. Stream Video
- Endpoint: GET /videos/{video_id}
- Response: { "title": "My Cat", "stream_url": "https://cdn.youtube.com/vid_123/master.m3u8", "views": 10500, "likes": 200 }
- Note: stream_url points to a manifest file (playlist), not a single video file.
3. Search
- Endpoint: GET /search?q=cat&page=1
- Response: A list of video summaries.
4. Like Video
- Endpoint: POST /videos/{video_id}/like
- Response: 200 OK
6. High-level architecture
We separate the system into the Control Plane (Metadata/API) and the Data Plane (Video Streaming).
Architecture Flow:
- Client (Mobile/Web)
- Load Balancer
- API Gateway (Auth, Rate Limiting)
- Microservices:
  - Video Metadata Service
  - User Service
  - Search Service
- Databases & Cache:
  - SQL DB (Metadata)
  - Redis (Hot Cache)
  - Elasticsearch (Search Index)
- Async Processing:
  - Message Queue (Kafka)
  - Transcoding Workers (GPU-optimized servers)
- Storage:
  - Object Storage (S3 - Source of Truth)
  - CDN (Edge Servers for streaming)
Component Roles:
- CDN: Serves 99% of read traffic (video chunks).
- Object Storage (S3): Stores the massive video files cheaply and reliably.
- Transcoding Workers: A cluster of servers that convert raw video uploads into streamable formats (HLS/DASH).
- Metadata DB: Stores text data (titles, descriptions, tags).
7. Data model
1. Relational Database (MySQL/PostgreSQL)
Used for critical metadata. Sharded by video_id.
- Videos Table: id (PK), uploader_id, title, description, s3_bucket_path, status (processing/active/failed).
- Users Table: id, email, name.
2. NoSQL (Cassandra or DynamoDB)
Used for counters that need massive write throughput.
- VideoStats Table: video_id (PK), view_count, like_count.
- Why? Locking a SQL row for every view on a viral video will crash the DB. NoSQL handles high-concurrency increments better.
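To illustrate the lock-free increment, here is a minimal sketch using boto3 against a hypothetical DynamoDB table named VideoStats (partition key video_id). DynamoDB's ADD action is atomic, so concurrent viewers never contend on a row lock; Cassandra's counter columns serve the same purpose.

```python
import boto3

# Minimal sketch: atomic, lock-free increment of a view counter in DynamoDB.
# Table name and key layout are illustrative assumptions.
dynamodb = boto3.resource("dynamodb")
stats = dynamodb.Table("VideoStats")

def record_view(video_id: str) -> None:
    # ADD is an atomic update: no read-modify-write cycle, so millions of
    # concurrent viewers can increment the same item safely.
    stats.update_item(
        Key={"video_id": video_id},
        UpdateExpression="ADD view_count :inc",
        ExpressionAttributeValues={":inc": 1},
    )

record_view("vid_123")
```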
3. Search Index (Elasticsearch)
- Stores video titles and tags to support fuzzy search (e.g., "cats" matches "cat").
8. Core flows end-to-end
We will describe the two most critical flows: Uploading and Watching.
To make this easy to understand, we will visualize the system as a massive industrial factory (The Upload Flow) and a smart delivery service (The Streaming Flow).
Flow A: The Upload Assembly Line
Goal: Transform a raw, heavy file from a user’s camera into multiple lightweight, streamable versions accessible worldwide.
This process is designed to be asynchronous.
The user should not have to wait for the video to process before they can leave the page.
Step 1: The "Smart" Bypass (Presigned URLs)
The Problem: If a user uploads a 1GB video directly to our Web Server (API), the server gets tied up receiving data for minutes. If 1,000 users do this simultaneously, the web servers crash.
The Solution: The API acts like a traffic controller, not a warehouse.
- Request: The Client sends a request to the API: "I want to upload a file named cat.mp4."
- Auth & Permission: The API checks if the user is logged in and allowed to upload.
- Presigned URL: The API generates a special, temporary "key" to our Object Storage (S3). It tells the Client: "Don't give the file to me. Here is a secure link to put it directly into the storage bin." (A sketch of generating such a URL follows this list.)
- Direct Upload: The Client uploads the raw video directly to S3 (Amazon Simple Storage Service) or GCS (Google Cloud Storage). The web server is now free to handle other users.
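A minimal sketch of how the API side might issue the presigned URL with boto3. The bucket name ("raw-video-uploads"), key layout, and 15-minute expiry are illustrative assumptions, not part of the design above.

```python
import boto3

# Minimal sketch of step 3: the API issues a short-lived presigned PUT URL.
s3 = boto3.client("s3")

def create_upload_url(video_id: str, filename: str) -> str:
    return s3.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": "raw-video-uploads", "Key": f"{video_id}/{filename}"},
        ExpiresIn=900,  # valid for 15 minutes; after that the client must re-init
    )

upload_url = create_upload_url("vid_123", "cat.mp4")
# The client then PUTs the file bytes straight to this URL; no video data
# ever flows through the API servers.
```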
Step 2: The Trigger (Message Queue)
- The Handoff: Once the upload finishes, S3 triggers a notification to the API: "File upload complete."
- The Buffer: The API does not start processing immediately (which might overload the system). Instead, it creates a Job (a to-do note) containing the video ID and file location.
- Kafka/Queue: This Job is placed into a Message Queue (like Apache Kafka or RabbitMQ).
  - Why? This acts as a shock absorber. If a viral event happens and 10,000 videos are uploaded in a minute, the Queue holds them safely. The processing servers pick them up one by one at a manageable pace.
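A minimal sketch of the handoff, assuming the kafka-python client and a hypothetical topic named transcode.jobs; the broker address and job fields are illustrative.

```python
import json
from kafka import KafkaProducer  # kafka-python client

# Minimal sketch of Step 2: the API enqueues a transcoding job.
producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def enqueue_transcode_job(video_id: str, s3_path: str) -> None:
    job = {"video_id": video_id, "s3_path": s3_path}
    # Keying by video_id keeps retries for the same video on one partition.
    producer.send("transcode.jobs", key=video_id.encode(), value=job)
    producer.flush()

enqueue_transcode_job("vid_123", "s3://raw-video-uploads/vid_123/cat.mp4")
```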
Step 3: The Heavy Lifting (Transcoding & Chunking)
A fleet of Worker Servers (optimized with GPUs) constantly watches the Queue.
- Pickup: A Worker grabs the next Job from the Queue.
- Validation: It checks metadata (resolution, aspect ratio, audio codec).
- Transcoding (The Conversion): The raw video is often in a format not suitable for web streaming (like .mov or high-bitrate .mp4). The worker converts this into standard web formats (H.264/H.265).
  - It creates multiple quality buckets: 360p, 480p, 720p, 1080p, and 4K.
- Segmentation (Chunking): This is the most critical step for streaming. The worker chops every quality version into tiny segments (usually 2 to 10 seconds long).
  - Example: A 1-minute video becomes twelve 5-second files (segments).
- Manifest Generation: The worker creates a Manifest File (often .m3u8 for HLS or .mpd for DASH). This is a simple text file acting as an index. It tells the player: "If you want 1080p, look at files A, B, C. If you want 360p, look at files X, Y, Z."
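To make the transcoding and chunking concrete, here is a minimal sketch that shells out to ffmpeg to produce a single 720p HLS rendition (5-second .ts segments plus a per-rendition playlist). Paths, bitrates, and segment length are illustrative assumptions; a real pipeline runs one such job per quality bucket and then writes the master manifest that indexes them all.

```python
import subprocess

# Minimal sketch of one worker task: raw upload -> one 720p HLS rendition.
def transcode_to_hls_720p(raw_path: str, out_dir: str) -> None:
    subprocess.run(
        [
            "ffmpeg", "-i", raw_path,
            "-vf", "scale=-2:720",               # resize to 720p, keep aspect ratio
            "-c:v", "libx264", "-b:v", "2800k",  # H.264 video at ~2.8 Mbps
            "-c:a", "aac", "-b:a", "128k",       # AAC audio
            "-hls_time", "5",                    # ~5-second segments
            "-hls_playlist_type", "vod",
            "-hls_segment_filename", f"{out_dir}/720p_%03d.ts",
            f"{out_dir}/720p.m3u8",              # per-rendition playlist
        ],
        check=True,
    )

transcode_to_hls_720p("/tmp/vid_123/cat.mp4", "/tmp/vid_123/hls")
```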
Step 4: Distribution & Indexing
- Storage: All the generated chunks and manifest files are saved back to Object Storage (S3).
- CDN Push: The system may preemptively push these files to the Content Delivery Network (CDN), i.e., servers located physically close to users (e.g., in ISPs like Comcast or AT&T), to ensure fast loading later.
- Database Update: The video status is updated from "Processing" to "Active."
- Search Indexing: The video title and tags are sent to Elasticsearch so users can find the video immediately.
Flow B: The Streaming Experience
Goal: Play video instantly without buffering, regardless of whether the user is on fast WiFi or slow mobile data.
We use a technique called Adaptive Bitrate Streaming (ABS) over HTTP (HLS or DASH protocols).
Step 1: The Setup (Metadata & Security)
- The Click: A user clicks on a video.
- Metadata Check: The Client asks the API for video details. The API checks:
  - Does the video exist?
  - Is it private/deleted?
  - Does the user have permission to view it (e.g., age restrictions)?
- The Response: The API returns the video metadata and the URL to the Manifest File (stored on the CDN).
Step 2: The Menu (The Manifest)
The Video Player (in the browser or app) downloads the Manifest File first. It does not download the video yet.
- The Manifest is like a restaurant menu. It lists:
  - Option A: 4K Quality (High bandwidth cost).
  - Option B: 1080p Quality (Medium bandwidth cost).
  - Option C: 360p Quality (Low bandwidth cost).
Step 3: The Intelligence (Client-Side Logic)
The "brain" of the streaming process is actually on the user's device, not the server.
- Bandwidth Test: The Player estimates the user's current internet speed.
- Decision:
  - Scenario: The user is on fast home WiFi.
  - Action: The Player selects the 1080p track from the Manifest.
- Fetching Chunks: The Player requests the first 3 chunks (15 seconds of video) from the closest CDN edge server.
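To make the client-side intelligence concrete, here is a minimal sketch (not a real player) that parses the variant streams out of an illustrative HLS master playlist and picks the highest rendition that fits the measured bandwidth, with a safety margin. The manifest text, bitrates, and 80% safety factor are all assumptions for the example.

```python
import re

# Illustrative HLS master playlist; real ones carry more attributes
# (codecs, frame rate, audio groups, etc.).
MASTER_MANIFEST = """\
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p.m3u8
"""

def parse_variants(manifest: str) -> list[tuple[int, str]]:
    """Return (bandwidth_bits_per_sec, playlist_url) pairs from a master playlist."""
    lines = manifest.splitlines()
    variants = []
    for i, line in enumerate(lines):
        match = re.search(r"#EXT-X-STREAM-INF:.*BANDWIDTH=(\d+)", line)
        if match:
            variants.append((int(match.group(1)), lines[i + 1]))
    return sorted(variants)

def choose_rendition(variants, measured_bps: float, safety: float = 0.8):
    """Pick the highest rendition whose bitrate fits within ~80% of measured bandwidth."""
    affordable = [v for v in variants if v[0] <= measured_bps * safety]
    return (affordable or variants[:1])[-1]   # fall back to the lowest rendition

variants = parse_variants(MASTER_MANIFEST)
print(choose_rendition(variants, measured_bps=4_000_000))  # -> (2800000, '720p.m3u8')
```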
Step 4: Adaptive Switching (The Magic)
This is what happens when network conditions change (e.g., walking out of range of WiFi):
- Monitoring: While playing the 1080p chunks, the Player notices the download speed is dropping drastically.
- Calculation: The Player calculates that if it continues downloading 1080p, the buffer will run out in 2 seconds, causing the video to freeze.
- The Switch: For the next chunk request, the Player looks at the Manifest and switches to the 360p URL.
- Seamless Transition:
  - The video quality visibly drops (becomes pixelated).
  - Crucially: The audio continues, and the video does not stop.
  - The user continues watching uninterrupted.
Step 5: Latency & CDN
- Edge Caching: When the user requests a chunk, the request goes to the nearest CDN node (e.g., a server in London).
- Cache Miss: If the London server doesn't have the video yet (nobody in London has watched it), it fetches it from the Origin (S3 in the US), saves a copy, and serves it to the user.
- Cache Hit: The next user in London gets the file instantly from the London server, achieving <200ms latency.
9. Caching and read performance
Caching is the only way to survive 500 PB of daily traffic.
- CDN (Content Delivery Network):
  - We replicate video chunks to thousands of servers worldwide (Edge Locations).
  - When a user in London watches a video, they download chunks from a London server, not the US data center.
  - Eviction: LRU (Least Recently Used). Popular videos stay in cache; obscure ones are fetched from S3 on demand.
- Metadata Cache (Redis):
  - We cache video details (Title, Author, Like Count).
  - Key: video_meta:{id}.
  - Strategy: Cache-aside. App checks Redis -> on miss, read the DB -> write the result back to Redis.
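A minimal sketch of the cache-aside strategy with redis-py. The key format matches the one above; fetch_video_from_db is a stand-in for the real metadata query, and the 5-minute TTL is an assumption.

```python
import json
import redis

cache = redis.Redis(host="localhost", port=6379)
CACHE_TTL_SECONDS = 300  # assumed TTL; tune per access pattern

def fetch_video_from_db(video_id: str) -> dict:
    # Placeholder for a SQL query against the Videos table.
    return {"id": video_id, "title": "My Cat", "likes": 200}

def get_video_meta(video_id: str) -> dict:
    key = f"video_meta:{video_id}"
    cached = cache.get(key)
    if cached is not None:                                  # cache hit
        return json.loads(cached)
    meta = fetch_video_from_db(video_id)                    # cache miss: read the DB
    cache.setex(key, CACHE_TTL_SECONDS, json.dumps(meta))   # populate the cache
    return meta
```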
10. Storage, indexing and media
- Blob Storage (S3):
  - We use tiered storage.
  - Hot: Recent/viral videos stay on standard S3.
  - Cold (Glacier): Original raw files of old videos with 0 views are moved to cheaper, slower storage to save millions of dollars.
- Search Indexing:
  - When a video becomes "Active," a background service replicates the title/tags into Elasticsearch.
  - This allows full-text search (via an inverted index), which SQL cannot do efficiently.
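A minimal sketch of that indexing step using the elasticsearch-py client, assuming a hypothetical index named "videos"; mappings and analyzers (which control whether "cats" actually matches "cat") are omitted.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def index_video(video_id: str, title: str, tags: list[str]) -> None:
    # Runs after the video flips to "Active"; the inverted index powers
    # full-text queries over titles and tags.
    es.index(index="videos", id=video_id, document={"title": title, "tags": tags})

def search_videos(query: str) -> list[dict]:
    result = es.search(index="videos", query={"match": {"title": query}})
    return [hit["_source"] for hit in result["hits"]["hits"]]

index_video("vid_123", "My Cat", ["cat", "funny"])
```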
11. Scaling strategies
- Database Sharding:
  - We shard the Metadata DB by video_id.
  - Formula: shard_id = hash(video_id) % total_shards. This distributes data evenly across machines (see the sketch after this list).
- Parallel Transcoding (Split & Merge):
  - Processing a 4K movie on one server takes hours.
  - Strategy: We split the raw video into small segments (e.g., minute 1, minute 2).
  - Fan-out: We send these segments to 50 different workers to process in parallel.
  - Merge: A final worker stitches the results together. This reduces processing time from hours to minutes.
- CDN Peering: We place our CDN servers directly inside Internet Service Providers (ISPs) to reduce traffic costs and latency.
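The sharding formula above, sketched in Python. A stable hash (CRC32 here) stands in for whatever hash function the shard router actually uses; Python's built-in hash() is avoided because it is randomized per process and would route the same key differently. The shard count of 64 is an assumption.

```python
import zlib

TOTAL_SHARDS = 64  # assumed shard count

def shard_for(video_id: str, total_shards: int = TOTAL_SHARDS) -> int:
    # Stable hash so every service maps the same video_id to the same shard.
    return zlib.crc32(video_id.encode("utf-8")) % total_shards

# All reads and writes for a given video land on the same shard:
print(shard_for("vid_123"))  # deterministic shard id in [0, 63]
```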
12. Reliability, failure handling and backpressure
- Dead Letter Queues (DLQ): If a video causes a worker to crash (e.g., a corrupted file), we retry 3 times. If it still fails, we move it to a DLQ for manual inspection so the queue doesn't get blocked.
- Thundering Herd Protection: If a celebrity uploads a video, millions might request it at once. The CDN uses Request Coalescing: the first request fetches from the origin, and the other 1 million wait for that single response (see the sketch after this list).
- Circuit Breakers: If the "Comments" service fails, the video page still loads without comments. The core "Watch" feature is protected.
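A minimal sketch of request coalescing as a CDN edge might implement it: concurrent requests for the same uncached chunk share a single origin fetch. fetch_from_origin is a stand-in for the S3 read; error handling and cache eviction are omitted.

```python
import threading

_lock = threading.Lock()
_in_flight: dict[str, threading.Event] = {}
_cache: dict[str, bytes] = {}

def fetch_from_origin(chunk_key: str) -> bytes:
    return b"...video bytes..."  # placeholder for the slow origin round trip

def get_chunk(chunk_key: str) -> bytes:
    with _lock:
        if chunk_key in _cache:              # already cached at the edge
            return _cache[chunk_key]
        event = _in_flight.get(chunk_key)
        if event is None:                    # we are the first requester (the leader)
            event = threading.Event()
            _in_flight[chunk_key] = event
            leader = True
        else:
            leader = False
    if leader:
        data = fetch_from_origin(chunk_key)  # exactly one origin fetch
        with _lock:
            _cache[chunk_key] = data
            del _in_flight[chunk_key]
        event.set()                          # wake up the waiters
        return data
    event.wait()                             # followers block until the leader finishes
    with _lock:
        return _cache[chunk_key]
```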
13. Security, privacy and abuse
- Presigned URLs: Users never get write access to S3. They receive a temporary URL scoped to uploading one specific file.
- Digital Rights Management (DRM): For premium content, video chunks are encrypted. The player must request a decryption key to play them.
- Content ID: During transcoding, we generate an "audio fingerprint" and match it against a database of copyrighted music to block illegal uploads.
14. Bottlenecks and next steps
- Bottleneck: Bandwidth Cost.
  - Next Step: Use Per-Title Encoding. Analyze each video's complexity; a simple cartoon needs fewer bits than an action movie. Adjusting bitrate per video saves ~20% bandwidth.
- Bottleneck: Search Freshness.
  - Next Step: Use a "real-time" index for videos uploaded in the last hour, merged later with the main index.
Summary
- The system relies on CDNs and Adaptive Bitrate Streaming for performance.
- Async Queues and Parallel Processing handle the heavy write load.
- Sharding scales the metadata.