How to Architect a Scalable Video Streaming Platform

Want to design the system architecture of a video streaming platform like YouTube? Ever wondered which components are needed and what it takes to build such an application at scale?

Specifications

To design such an application, we will work through the following:

Functional Requirements
Non-functional Requirements
Capacity planning (Back-of-envelope calculations)
High-level design
Deep dive

Functional Requirements

Below are the functional requirements that the system must address:

Creator requirements: Creators (uploaders) should be able to upload any video. The video must be made available in all locations. Some latency for upload, processing, and publishing is acceptable. Once published, the video should be highly available.
Viewer requirements: Viewers should be able to watch videos with high availability. Videos should be compatible with multiple device types and playable at all network speeds. Users should be able to search for videos and should have a home feed (recommendation engine).

Non-functional Requirements

Let's make the following assumptions:

We will assume 100M Daily Active Users (DAUs).
The read:write ratio is 100:1. Each user watches 100 videos per day and, as a worst-case estimate, performs 1 upload per day.
Each video is 500MB in size.
Data is retained for 10 years.
Video loading should have low latency.
The system should scale globally and handle spikes in user traffic.
The system should be highly available: it should not go down for either viewers or creators.

Capacity Planning (Back-of-envelope calculations)

Let's calculate the QPS and storage for this system:

QPS (Queries per second)

QPS = Total number of query requests per day / (Number of seconds in a day)

Assuming 100 million Daily Active Users (DAU), each performing one write operation per day, there are 100 million write requests per day. With a read-to-write ratio of 100:1, the total daily read requests are 100 million × 100 = 10 billion. This gives a read QPS of 10 billion ÷ (24 × 3600) ≈ 115,741 and a write QPS of 100 million ÷ (24 × 3600) ≈ 1,157.
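
The arithmetic above can be reproduced with a few lines of Python. This is only a back-of-envelope sketch; the constant and function names are illustrative, and the inputs are the assumptions stated in this article (100 million DAU, one write per user per day, 100:1 read-to-write ratio).

```python
SECONDS_PER_DAY = 24 * 3600  # 86,400


def qps(requests_per_day: int) -> float:
    """Average queries per second for a given daily request count."""
    return requests_per_day / SECONDS_PER_DAY


# Assumptions taken from the article's requirements.
DAU = 100_000_000
WRITES_PER_USER_PER_DAY = 1
READ_WRITE_RATIO = 100

daily_writes = DAU * WRITES_PER_USER_PER_DAY
daily_reads = daily_writes * READ_WRITE_RATIO

read_qps = qps(daily_reads)
write_qps = qps(daily_writes)

print(f"Read QPS:  {read_qps:,.0f}")   # Read QPS:  115,741
print(f"Write QPS: {write_qps:,.0f}")  # Write QPS: 1,157
```

These are daily averages. Peak traffic is usually higher, so a real estimate would multiply by a peak factor chosen for the expected usage pattern.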

Storage

Storage = Total data volume per day × Number of days retained

Assuming each video is 500MB and there are 100 million write requests per day, the daily data volume is 100 million × 500MB = 50PB. To retain data for 10 years, the total storage needed is: 100 million × 500MB × 365 days × 10 years = 182.5EB, or roughly 183EB.
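
The same estimate as a short Python sketch, assuming decimal units (1MB = 10^6 bytes, 1PB = 10^15 bytes, 1EB = 10^18 bytes). The variable names are illustrative.

```python
# Assumptions taken from the article's requirements.
VIDEO_SIZE_BYTES = 500 * 10**6   # 500 MB per video
UPLOADS_PER_DAY = 100_000_000    # one upload per daily active user
RETENTION_DAYS = 365 * 10        # 10 years, ignoring leap days

daily_bytes = VIDEO_SIZE_BYTES * UPLOADS_PER_DAY
total_bytes = daily_bytes * RETENTION_DAYS

daily_pb = daily_bytes / 10**15
total_eb = total_bytes / 10**18

print(f"Daily volume:  {daily_pb} PB")   # Daily volume:  50.0 PB
print(f"Total storage: {total_eb} EB")   # Total storage: 182.5 EB
```

This figure covers the raw uploads only. Transcoded renditions and replicated copies would add to it, so it is best read as a lower bound.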

Bottlenecks

To identify bottlenecks, it is essential to analyze peak traffic periods and user distribution patterns. The primary bottleneck is likely to occur in video delivery, which can be addressed through the use of Content Delivery Networks (CDNs) and effective load balancing across the infrastructure.

When estimating CDN costs, the primary consideration is content delivery. With a read-to-write ratio of 100:1, the majority of CDN expenses will stem from video streaming. However, implementing aggressive caching strategies and leveraging geographic distribution can significantly optimize these costs.

High-Level Design

For this, we will take a modular approach and adopt a microservices architecture of smaller, interconnected services. This allows independent scaling and deployment, and gives better fault isolation.

When a user views a video: The request flows through the Load Balancer and API Gateway to the Video Playback Service. This service first checks caching layers optimized for fast retrieval before querying the Video Metadata Store to obtain the video URL. Once the URL is fetched, the video is streamed directly from the nearest CDN (Content Delivery Network) node to the user's device, ensuring minimal latency and smooth playback.

The CDN plays a critical role by delivering cached video content from geographically distributed nodes close to the user, significantly improving load times and overall viewing quality.
The Metadata Database manages essential details such as video titles, descriptions, and user interactions (likes, comments, etc.). These databases are designed to handle large volumes of read operations efficiently.
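
The cache-then-store lookup in the playback flow can be sketched as a cache-aside pattern. This is a minimal illustration, not a prescribed implementation: the class name, the in-memory dictionaries standing in for the cache and the Video Metadata Store, and the cdn.example.com URL are all hypothetical.

```python
from typing import Dict, Optional


class VideoPlaybackService:
    """Resolves a video ID to a CDN URL, checking a cache before the store."""

    def __init__(self, metadata_store: Dict[str, str], cdn_base_url: str) -> None:
        # A plain dict stands in for the Video Metadata Store (assumption).
        self._metadata_store = metadata_store
        # A plain dict stands in for a distributed cache (assumption).
        self._cache: Dict[str, str] = {}
        self._cdn_base_url = cdn_base_url
        self.store_reads = 0  # counts how often the store is actually queried

    def get_playback_url(self, video_id: str) -> Optional[str]:
        # 1. Fast path: serve from the cache if the URL is already known.
        cached = self._cache.get(video_id)
        if cached is not None:
            return cached

        # 2. Slow path: query the metadata store.
        self.store_reads += 1
        path = self._metadata_store.get(video_id)
        if path is None:
            return None  # unknown video

        # 3. Populate the cache so the next request skips the store.
        url = f"{self._cdn_base_url}/{path}"
        self._cache[video_id] = url
        return url


service = VideoPlaybackService(
    {"v1": "videos/v1/master.m3u8"}, "https://cdn.example.com"
)
first = service.get_playback_url("v1")
second = service.get_playback_url("v1")  # served from the cache
```

A production cache would also need expiry and invalidation when metadata changes, which this sketch leaves out.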

When the user uploads a video: Video uploads follow a separate flow. The process starts with the Load Balancer and API Gateway routing the upload request to the Video Upload Service.

Signed URL Generation: The Video Upload Service requests a signed URL from the Object Storage Service, enabling secure, time-bound access to object storage platforms like Amazon S3, Google Cloud Storage, or Azure Blob Storage. The signed URL allows the client to upload files without burdening the application servers.
Direct Upload: The client application layer uses the signed URL to upload the video file directly to the object storage, bypassing the main application servers. This approach enhances scalability by reducing server load.
Upload Confirmation & Metadata Submission: Once the upload completes, the client notifies the Video Upload Service and provides relevant metadata, triggering the next set of operations.
Video Processing Pipeline: Uploaded videos undergo processing, including content moderation, transcoding (to support multiple formats and resolutions), compression, and thumbnail generation.
CDN Distribution: Finally, the processed video files are uploaded to CDN nodes, making them readily available to end users from the most optimal locations, ensuring fast and reliable playback.
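
To illustrate the idea behind the signed URL step, the sketch below signs an object key and an expiry time with an HMAC and verifies them later. It is a simplified, hypothetical scheme for explanation only; it is not the signing format used by Amazon S3, Google Cloud Storage, or Azure Blob Storage, and the secret, host name, and function names are assumptions.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical; a real key lives in a secret manager


def _signature(object_key: str, expires_at: int) -> str:
    # Bind the signature to the HTTP method, the object, and the expiry time.
    payload = f"PUT\n{object_key}\n{expires_at}".encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def sign_upload_url(object_key: str, expires_at: int) -> str:
    """Build a time-bound upload URL for one object (illustrative format)."""
    query = urlencode(
        {"expires": expires_at, "signature": _signature(object_key, expires_at)}
    )
    return f"https://storage.example.com/{object_key}?{query}"


def verify_upload_url(url: str, now: int) -> bool:
    """Check that the URL is unexpired and that its signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False  # the time-bound access window has closed
    object_key = parsed.path.lstrip("/")
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(signature, _signature(object_key, expires_at))


url = sign_upload_url("raw/v1.mp4", expires_at=1_000)
```

Because the storage layer can verify the signature on its own, the client uploads directly to object storage and the application servers never handle the video bytes.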

Deep dive - Low-Level Design

Low-Level Design (LLD) delves into the detailed technical implementation of the system. It outlines how the components interact, specifies the data structures and class architecture, and defines the API design.

Service Modules

Video Upload Service: The Video Upload Service collaborates with the Video Storage Service and the Transcoding Service. Once a video is uploaded, it is stored as raw content in an object storage system (e.g., AWS S3). The Transcoding Service then processes the raw video into various formats and resolutions to ensure compatibility across devices. The upload service exposes endpoints such as POST /upload-video to handle incoming video files along with their metadata.
Video Streaming Service: Once the transcoding process is complete, the Video Streaming Service handles delivering the video to end users. It retrieves the video content from the distributed storage system and streams it in formats and resolutions optimized for the user's device and network conditions. It provides an endpoint such as GET /video/{video_id}/stream, which delivers video chunks for seamless streaming playback.
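
One way to picture "formats and resolutions optimized for the user's device and network conditions" is a rendition ladder from which the service (or the player) picks the best fit. The sketch below is an assumption-laden illustration: the ladder, the bitrates, the 0.8 headroom factor, and the function name are all made up for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Rendition:
    name: str
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # approximate bitrate needed to play it smoothly


# Hypothetical output of the Transcoding Service for one video.
LADDER = (
    Rendition("240p", 240, 400),
    Rendition("480p", 480, 1200),
    Rendition("720p", 720, 2800),
    Rendition("1080p", 1080, 5000),
)


def pick_rendition(
    bandwidth_kbps: float,
    max_height: int,
    ladder: Sequence[Rendition] = LADDER,
    headroom: float = 0.8,
) -> Rendition:
    """Pick the highest-bitrate rendition that fits the device and network."""
    budget = bandwidth_kbps * headroom  # leave slack for bandwidth fluctuation
    candidates = [
        r for r in ladder if r.height <= max_height and r.bitrate_kbps <= budget
    ]
    if candidates:
        return max(candidates, key=lambda r: r.bitrate_kbps)
    # Nothing fits: fall back to the lightest rendition rather than failing.
    return min(ladder, key=lambda r: r.bitrate_kbps)


print(pick_rendition(bandwidth_kbps=4000, max_height=1080).name)  # 720p
```

In adaptive streaming this choice is typically re-evaluated per chunk, so quality can move up or down as the measured bandwidth changes.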

Class Structure and Object-Oriented Design

User Class: Represents a user within the system, containing attributes such as user_id, username, email, password_hash, subscriptions, watch_history, and preferences. Key methods include:

upload_video()
like_video()
subscribe_to_channel()
create_playlist()

Video Class: Represents a video entity with attributes like video_id, user_id (indicating the uploader), title, description, tags, views_count, upload_timestamp, and additional metadata. Key methods include:

get_video_info()
increase_views()
add_comment()
transcode_video()
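
The two classes might be sketched with Python dataclasses as follows. This covers only a subset of the attributes and methods listed above; the comments and liked_videos fields are assumptions added so the methods have something to act on, and transcode_video() is left out because in this design the work would be delegated to the Transcoding Service.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Set


@dataclass
class Video:
    video_id: str
    user_id: str  # ID of the uploader
    title: str
    description: str = ""
    tags: List[str] = field(default_factory=list)
    views_count: int = 0
    upload_timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    comments: List[str] = field(default_factory=list)  # assumption

    def get_video_info(self) -> Dict[str, object]:
        return {
            "video_id": self.video_id,
            "title": self.title,
            "uploader": self.user_id,
            "views": self.views_count,
        }

    def increase_views(self) -> None:
        self.views_count += 1

    def add_comment(self, text: str) -> None:
        self.comments.append(text)


@dataclass
class User:
    user_id: str
    username: str
    email: str
    password_hash: str
    subscriptions: Set[str] = field(default_factory=set)
    watch_history: List[str] = field(default_factory=list)
    liked_videos: Set[str] = field(default_factory=set)  # assumption

    def upload_video(self, video_id: str, title: str) -> Video:
        # Only creates the entity; the real upload goes through the services.
        return Video(video_id=video_id, user_id=self.user_id, title=title)

    def like_video(self, video: Video) -> None:
        self.liked_videos.add(video.video_id)

    def subscribe_to_channel(self, channel_id: str) -> None:
        self.subscriptions.add(channel_id)


alice = User("u1", "alice", "alice@example.com", "<hash>")
video = alice.upload_video("v1", "Intro to CDNs")
video.increase_views()
alice.like_video(video)
```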

Database Schema

We can use a relational database for videos and users, and a non-relational (NoSQL) database for video recommendations, comments, and similar data.
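
A minimal relational schema for users and videos could look like the sketch below, which uses Python's built-in sqlite3 module with an in-memory database purely for illustration. The table and column names follow the class attributes described earlier, but the exact types, constraints, and index are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE users (
        user_id       TEXT PRIMARY KEY,
        username      TEXT NOT NULL UNIQUE,
        email         TEXT NOT NULL UNIQUE,
        password_hash TEXT NOT NULL
    );

    CREATE TABLE videos (
        video_id         TEXT PRIMARY KEY,
        user_id          TEXT NOT NULL REFERENCES users(user_id),
        title            TEXT NOT NULL,
        description      TEXT NOT NULL DEFAULT '',
        views_count      INTEGER NOT NULL DEFAULT 0,
        upload_timestamp TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );

    -- Speeds up "all videos by this uploader" lookups.
    CREATE INDEX idx_videos_user ON videos(user_id);
    """
)

conn.execute(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    ("u1", "alice", "alice@example.com", "<hash>"),
)
conn.execute(
    "INSERT INTO videos (video_id, user_id, title) VALUES (?, ?, ?)",
    ("v1", "u1", "Intro to CDNs"),
)

rows = conn.execute(
    "SELECT v.title, u.username FROM videos v JOIN users u ON u.user_id = v.user_id"
).fetchall()
print(rows)  # [('Intro to CDNs', 'alice')]
```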

Scalability and Fault Tolerance

Service Decomposition: Adopting a microservices architecture enables independent scaling of core components such as video uploads, search, and streaming, ensuring efficient resource utilization based on demand.
Distributed Caching: Implementing caching layers (e.g., Redis) helps store frequently accessed data — such as video metadata, trending content, and user preferences — to deliver faster response times and reduce database load.
Database Sharding: To manage large datasets effectively, databases are partitioned into smaller, distributed shards across multiple servers, improving scalability and the system's ability to handle high data volumes.
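
As a small illustration of sharding, the sketch below maps a key such as a video ID to a shard using a stable hash. The function name and shard count are hypothetical. A cryptographic hash is used instead of Python's built-in hash() because the built-in is randomized per process for strings and would route the same key differently across servers.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards) using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo shards.
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 8  # assumption for the example

# The same key always lands on the same shard.
same = shard_for("video-123", NUM_SHARDS) == shard_for("video-123", NUM_SHARDS)

# Keys spread across all shards.
distribution = Counter(shard_for(f"video-{i}", NUM_SHARDS) for i in range(1000))
print(sorted(distribution))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Plain modulo hashing remaps most keys when the number of shards changes. Consistent hashing is a common alternative when shards are added or removed often.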

Security and Authentication

Authentication: Use established standards such as OAuth 2.0 and JSON Web Tokens (JWT) to authenticate users and protect access to the system.
Authorization: Implement Role-Based Access Control (RBAC) to manage user permissions and ensure that only authorized users can perform specific actions.
Data Encryption: Encrypt video content and sensitive user data both at rest (in storage) and in transit (for example, with TLS).
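
Role-Based Access Control can be sketched as a mapping from roles to permission sets, checked before each action. The roles, permissions, and function name below are hypothetical examples rather than a fixed model for this platform.

```python
from enum import Enum, auto
from typing import Dict, FrozenSet


class Permission(Enum):
    WATCH_VIDEO = auto()
    UPLOAD_VIDEO = auto()
    REMOVE_ANY_VIDEO = auto()


# Hypothetical role model: each role is granted a fixed set of permissions.
ROLE_PERMISSIONS: Dict[str, FrozenSet[Permission]] = {
    "viewer": frozenset({Permission.WATCH_VIDEO}),
    "creator": frozenset({Permission.WATCH_VIDEO, Permission.UPLOAD_VIDEO}),
    "moderator": frozenset(Permission),  # every permission
}


def is_allowed(role: str, permission: Permission) -> bool:
    """Return True only if the role is known and grants the permission."""
    return permission in ROLE_PERMISSIONS.get(role, frozenset())


print(is_allowed("creator", Permission.UPLOAD_VIDEO))  # True
print(is_allowed("viewer", Permission.UPLOAD_VIDEO))   # False
```

Unknown roles receive an empty permission set, so the check denies by default.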

These represent just a few of the key design elements that must be taken into account when building the architecture of a video streaming platform.

Author:
Rahul Majumdar
