System Design
Learn System Design
Introduction to System Design
How to Learn System Design?
Functional vs. Non-functional Requirements
What are Back-of-the-Envelope Estimations?
Things to Avoid During System Design Interview
System Design Basics
Load Balancing
Introduction to Load Balancing
Load Balancing Algorithms
Uses of Load Balancing
Load Balancer Types
Stateless vs. Stateful Load Balancing
High Availability and Fault Tolerance
Scalability and Performance
Challenges of Load Balancers
API Gateway
Introduction to API Gateway
Usage of API gateway
Advantages and disadvantages of using API gateway
Key Characteristics of Distributed Systems
Scalability
Availability
Latency and Performance
Concurrency and Coordination
Monitoring and Observability
Resilience and Error Handling
Fault Tolerance vs. High Availability
Network Essentials
HTTP vs. HTTPS
TCP vs. UDP
HTTP: 1.0 vs. 1.1 vs 2.0 vs. 3.0
URL vs. URI vs. URN
Domain Name System (DNS)
Introduction to DNS
DNS Resolution Process
DNS Load Balancing and High Availability
Caching
Introduction to Caching
Why is Caching Important?
Types of Caching
Cache Replacement Policies
Cache Invalidation
Cache Read Strategies
Cache Coherence and Consistency Models
Caching Challenges
Cache Performance Metrics
CDN
What is CDN?
Origin Server vs. Edge Server
CDN Architecture
Push CDN vs. Pull CDN
Data Partitioning
Introduction to Data Partitioning
Partitioning Methods
Data Sharding Techniques
Benefits of Data Partitioning
Common Problems Associated with Data Partitioning
Proxies
What is a Proxy Server?
Uses of Proxies
VPN vs. Proxy Server
Redundancy and Replication
What is Redundancy?
What is Replication?
Replication Methods
Data Backup vs. Disaster Recovery
CAP & PACELC Theorems
Introduction to CAP Theorem
Components of CAP Theorem
Trade-offs in CAP Theorem
Examples of CAP Theorem in Practice
Beyond CAP Theorem
System Design Trade-offs in Interviews
Databases (SQL vs. NoSQL)
Introduction to Databases
SQL Databases
NoSQL Databases
SQL vs. NoSQL
ACID vs BASE Properties
Real-World Examples and Case Studies
SQL Normalization and Denormalization
In-Memory Database vs. On-Disk Database
Data Replication vs. Data Mirroring
Database Federation
Indexes
What are Indexes?
Types of Indexes
Bloom Filters
Introduction to Bloom Filters
Benefits & Limitations of Bloom Filters
Variants and Extensions of Bloom Filters
Applications of Bloom Filters
Long-Polling vs. WebSockets vs. Server-Sent Events
Difference Between Long-Polling, WebSockets, and Server-Sent Events
Quorum
Why Quorum?
What is Quorum?
Heartbeat
What is Heartbeat?
Checksum
What is Checksum?
Uses of Checksum
Leader and Follower
What is Leader and Follower Pattern?
Security
What is Security and Privacy?
What is Authentication?
What is Authorization?
Authentication vs. Authorization
OAuth vs. JWT for Authentication
What is Encryption?
What are DDoS Attacks?
Distributed Messaging System
Introduction to Messaging System
Introduction to Kafka
Messaging patterns
Popular Messaging Queue Systems
RabbitMQ vs. Kafka vs. ActiveMQ
Scalability and Performance
Distributed File Systems
What is a Distributed File System?
Architecture of a Distributed File System
Key Components of a DFS
Misc Concepts
Batch Processing vs. Stream Processing
XML vs. JSON
Synchronous vs. Asynchronous Communication
Push vs. Pull Notification Systems
Microservices vs. Serverless Architecture
Message Queues vs. Service Bus
Stateful vs. Stateless Architecture
Event-Driven vs. Polling Architecture
Quiz - System Design Fundamentals
Quiz
System Design Trade-offs
Importance of Discussing Trade-offs
Strong vs Eventual Consistency
Latency vs Throughput
ACID vs BASE Properties in Databases
Read-Through vs Write-Through Cache
Batch Processing vs Stream Processing
Load Balancer vs. API Gateway
API Gateway vs Direct Service Exposure
Proxy vs. Reverse Proxy
API Gateway vs. Reverse Proxy
SQL vs. NoSQL
Primary-Replica vs Peer-to-Peer Replication
Data Compression vs Data Deduplication
Server-Side Caching vs Client-Side Caching
REST vs RPC
Polling vs. Long-Polling vs. WebSockets vs. Webhooks
CDN Usage vs Direct Server Serving
Serverless Architecture vs Traditional Server-based
Stateful vs Stateless Architecture
Hybrid Cloud Storage vs All-Cloud Storage
Token Bucket vs Leaky Bucket
Read Heavy vs Write Heavy System
Quiz
System Design Master Template
System Design Interviews - A step by step guide
System Design Master Template
Designing a URL Shortening Service like TinyURL
Designing a URL Shortening Service like TinyURL
Quiz - Designing URL Shortner
Designing Pastebin
Designing Pastebin
Quiz - Designing Pastebin
Designing Instagram
Designing Instagram
Quiz - Designing Instagram
Designing Dropbox
Designing Dropbox
Quiz - Designing Dropbox
Designing Facebook Messenger
Designing Facebook Messenger
Quiz - Designing Facebook Messenger
Designing Twitter
Designing Twitter
Quiz - Designing Twitter
Designing Youtube or Netflix
Designing Youtube or Netflix
Quiz - Designing Youtube
Designing Typeahead Suggestion
Designing Typeahead Suggestion
Quiz - Designing Typeahead Suggestion
Designing an API Rate Limiter
Designing an API Rate Limiter
Quiz - Designing an API Rate Limiter
Designing Twitter Search
Designing Twitter Search
Quiz - Designing Twitter Search
Designing a Web Crawler
Designing a Web Crawler
Quiz - Designing a Web Crawler
Designing Yelp or Nearby Friends
Designing Yelp or Nearby Friends
Quiz - Designing Yelp or Nearby Friends
Designing Uber backend
Designing Uber backend
Quiz - Designing Uber backend
Designing Ticketmaster
Designing Ticketmaster
Quiz - Designing Ticketmaster
Dynamo: How to design a key value store?
Dynamo: Introduction
High-Level Architecture
Data Partitioning
Replication
Vector Clocks and Conflicting Data
The Life of Dynamo’s put() & get() Operations
Anti-entropy Through Merkle Trees
Gossip Protocol
Dynamo Characteristics and Criticism
Summary: Dynamo
Quiz: Dynamo
Mock Interview: Dynamo
Designing YouTube Likes Counter (medium)
YouTube Likes Counter
Quiz
Cassandra: How to Design a Wide-column NoSQL Database?
Cassandra: Introduction
High-level Architecture
Replication
Cassandra Consistency Levels
Gossiper
Anatomy of Cassandra's Write Operation
Anatomy of Cassandra's Read Operation
Compaction
Tombstones
Summary: Cassandra
Quiz: Cassandra
Mock Interview: Cassandra
Kafka: How to Design a Distributed Messaging System?
Messaging Systems: Introduction
Kafka: Introduction
High-level Architecture
Kafka: Deep Dive
Consumer Groups
Kafka Workflow
Role of ZooKeeper
Controller Broker
Kafka Delivery Semantics
Kafka Characteristics
Summary: Kafka
Quiz: Kafka
Mock Interview: Kafka
Chubby: How to Design a Distributed Locking Service?
Chubby: Introduction
High-level Architecture
Design Rationale
How Chubby Works
File, Directories, and Handles
Locks, Sequencers, and Lock-delays
Sessions and Events
Master Election and Chubby Events
Caching
Database
Scaling Chubby
Summary: Chubby
Quiz: Chubby
Mock Interview: Chubby
HDFS: How to Design File Storage System?
Hadoop Distributed File System: Introduction
High-level Architecture
Deep Dive
Anatomy of a Read Operation
Anatomy of a Write Operation
Data Integrity & Caching
Fault Tolerance
HDFS High Availability (HA)
HDFS Characteristics
Summary: HDFS
Quiz: HDFS
Mock Interview: HDFS
GFS: How to Design a Distributed File System Storage?
Google File System: Introduction
High-level Architecture
Single Master and Large Chunk Size
Metadata
Master Operations
Anatomy of a Read Operation
Anatomy of a Write Operation
Anatomy of an Append Operation
GFS Consistency Model and Snapshotting
Fault Tolerance, High Availability, and Data Integrity
Garbage Collection
Criticism on GFS
Summary: GFS
Quiz: GFS
Mock Interview: GFS
BigTable: How to Design a Wide Column Storage System?
BigTable: Introduction
BigTable Data Model
System APIs
Partitioning and High-level Architecture
SSTable
GFS and Chubby
Bigtable Components
Working with Tablets
The Life of BigTable's Read & Write Operations
Fault Tolerance and Compaction
BigTable Refinements
BigTable Characteristics
Summary: BigTable
Quiz: BigTable
Mock Interview: BigTable
Designing Reddit (medium)
Design Reddit
Quiz
Designing Notification Service (medium)
Designing a Notification System
Quiz
Design Google Calendar (medium)
Design Google calendar (Medium)
Quiz
Design a Recommendation System (medium)
Design a Recommendation System for Netflix
Quiz
Designing Gmail (medium)
Design Gmail
Quiz
Designing Google News (medium)
Design Google News, a Global News Aggregator System (Medium)
Quiz
Designing Unique ID Generator (medium)
Design Unique ID Generator (Easy)
Quiz
Designing Code Judging System (medium)
Design Code Judging System like LeetCode (Medium)
Quiz
Designing Payment System (hard)
Design Payment System
Quiz
Designing Flash Sale System (hard)
Design a Flash Sale for an E-commerce Site (Hard)
Quiz
Designing Reminder Alert System (hard)
Design a Reminder Alert System
Quiz
System Design Patterns
Introduction: System Design Patterns
1. Bloom Filters
2. Consistent Hashing
3. Quorum
4. Leader and Follower
5. Write-ahead Log
6. Segmented Log
7. High-Water Mark
8. Lease
9. Heartbeat
10. Gossip Protocol
11. Phi Accrual Failure Detection
12. Split Brain
13. Fencing
14. Checksum
15. Vector Clocks
16. CAP Theorem
17. PACELC Theorem
18. Hinted Handoff
19. Read Repair
20. Merkle Trees
Quiz
Designing Facebook’s Newsfeed
news feed system
fan-out
real-time communication
+2
1. What is Facebook’s newsfeed?
A Newsfeed is the constantly updating list of stories in the middle of Facebook’s homepage. It includes status updates, photos, videos, links, app activity, and 'likes' from people, pages, and groups that a user follows on Facebook. In other words, it is a compilation of a complete scrollable version of your friends' and your life story from photos, videos, locations, status updates, and other activities.
For any social media site you design - Twitter, Instagram, or Facebook - you will need some newsfeed system to display updates from friends and followers.
Try it yourself
Before looking at the solution, try designing it:
2. Requirements and Goals of the System
Let’s design a newsfeed for Facebook with the following requirements:
Functional requirements:
- Newsfeed will be generated based on the posts from the people, pages, and groups that a user follows.
- A user may have many friends and follow a large number of pages/groups.
- Feeds may contain images, videos, or just text.
- Our service should support appending new posts as they arrive to the newsfeed for all active users.
Non-functional requirements:
- Our system should be able to generate any user’s newsfeed in real-time - maximum latency seen by the end user would be 2s.
- A post shouldn’t take more than 5s to make it to a user's feed assuming a new newsfeed request comes in.
3. Capacity Estimation and Constraints
Let’s assume on average a user has 300 friends and follows 200 pages.
Traffic estimates: Let’s assume 300M daily active users with each user fetching their timeline an average of five times a day. This will result in 1.5B newsfeed requests per day or approximately 17,500 requests per second.
Storage estimates: On average, let’s assume we need to have around 500 posts in every user’s feed that we want to keep in memory for a quick fetch. Let’s also assume that on average each post would be 1KB in size. This would mean that we need to store roughly 500KB of data per user. To store all this data for all the active users we would need 150TB of memory. If a server can hold 100GB we would need around 1500 machines to keep the top 500 posts in memory for all active users.
4. System APIs
💡 Once we have finalized the requirements, it's always a good idea to define the system APIs. This should explicitly state what is expected from the system.
We can have SOAP or REST APIs to expose the functionality of our service. The following could be the definition of the API for getting the newsfeed:
getUserFeed(api_dev_key, user_id, since_id, count, max_id, exclude_replies)
Parameters:
api_dev_key (string): The API developer key of a registered can be used to, among other things, throttle users based on their allocated quota.
user_id (number): The ID of the user for whom the system will generate the newsfeed.
since_id (number): Optional; returns results with an ID higher than (that is, more recent than) the specified ID.
count (number): Optional; specifies the number of feed items to try and retrieve up to a maximum of 200 per distinct request.
max_id (number): Optional; returns results with an ID less than (that is, older than) or equal to the specified ID.
exclude_replies(boolean): Optional; this parameter will prevent replies from appearing in the returned timeline.
Returns: (JSON) Returns a JSON object containing a list of feed items.
5. Database Design
There are three primary objects: User, Entity (e.g. page, group, etc.), and FeedItem (or Post). Here are some observations about the relationships between these entities:
- A User can follow other entities and can become friends with other users.
- Both users and entities can post FeedItems which can contain text, images, or videos.
- Each FeedItem will have a UserID which will point to the User who created it. For simplicity, let’s assume that only users can create feed items, although, on Facebook Pages can post feed item too.
- Each FeedItem can optionally have an EntityID pointing to the page or the group where that post was created.
If we are using a relational database, we would need to model two relations: User-Entity relation and FeedItem-Media relation. Since each user can be friends with many people and follow a lot of entities, we can store this relation in a separate table. The “Type” column in “UserFollow” identifies if the entity being followed is a User or Entity. Similarly, we can have a table for FeedMedia relation.
6. High Level System Design
At a high level this problem can be divided into two parts:
Feed generation: Newsfeed is generated from the posts (or feed items) of users and entities (pages and groups) that a user follows. So, whenever our system receives a request to generate the feed for a user (say Jane), we will perform the following steps:
- Retrieve IDs of all users and entities that Jane follows.
- Retrieve latest, most popular and relevant posts for those IDs. These are the potential posts that we can show in Jane’s newsfeed.
- Rank these posts based on the relevance to Jane. This represents Jane’s current feed.
- Store this feed in the cache and return top posts (say 20) to be rendered on Jane’s feed.
- On the front-end, when Jane reaches the end of her current feed, she can fetch the next 20 posts from the server and so on.
One thing to notice here is that we generated the feed once and stored it in the cache. What about new incoming posts from people that Jane follows? If Jane is online, we should have a mechanism to rank and add those new posts to her feed. We can periodically (say every five minutes) perform the above steps to rank and add the newer posts to her feed. Jane can then be notified that there are newer items in her feed that she can fetch.
Feed publishing: Whenever Jane loads her newsfeed page, she has to request and pull feed items from the server. When she reaches the end of her current feed, she can pull more data from the server. For newer items either the server can notify Jane and then she can pull, or the server can push, these new posts. We will discuss these options in detail later.
At a high level, we will need following components in our Newsfeed service:
- Web servers: To maintain a connection with the user. This connection will be used to transfer data between the user and the server.
- Application server: To execute the workflows of storing new posts in the database servers. We will also need some application servers to retrieve and to push the newsfeed to the end user.
- Metadata database and cache: To store the metadata about Users, Pages, and Groups.
- Posts database and cache: To store metadata about posts and their contents.
- Video and photo storage, and cache: Blob storage, to store all the media included in the posts.
- Newsfeed generation service: To gather and rank all the relevant posts for a user to generate newsfeed and store in the cache. This service will also receive live updates and will add these newer feed items to any user’s timeline.
- Feed notification service: To notify the user that there are newer items available for their newsfeed.
Following is the high-level architecture diagram of our system. User B and C are following User A.
7. Detailed Component Design
Let’s discuss different components of our system in detail.
a. Feed generation
Let’s take the simple case of the newsfeed generation service fetching most recent posts from all the users and entities that Jane follows; the query would look like this:
(SELECT FeedItemID FROM FeedItem WHERE UserID in ( SELECT EntityOrFriendID FROM UserFollow WHERE UserID = <current_user_id> and type = 0(user)) ) UNION (SELECT FeedItemID FROM FeedItem WHERE EntityID in ( SELECT EntityOrFriendID FROM UserFollow WHERE UserID = <current_user_id> and type = 1(entity)) ) ORDER BY CreationDate DESC LIMIT 100
Here are issues with this design for the feed generation service:
- Crazy slow for users with a lot of friends/follows as we have to perform sorting/merging/ranking of a huge number of posts.
- We generate the timeline when a user loads their page. This would be quite slow and have a high latency.
- For live updates, each status update will result in feed updates for all followers. This could result in high backlogs in our Newsfeed Generation Service.
- For live updates, the server pushing (or notifying about) newer posts to users could lead to very heavy loads, especially for people or pages that have a lot of followers. To improve the efficiency, we can pre-generate the timeline and store it in a memory.
Offline generation for newsfeed: We can have dedicated servers that are continuously generating users’ newsfeed and storing them in memory. So, whenever a user requests for the new posts for their feed, we can simply serve it from the pre-generated, stored location. Using this scheme, user's newsfeed is not compiled on load, but rather on a regular basis and returned to users whenever they request for it.
Whenever these servers need to generate the feed for a user, they will first query to see what was the last time the feed was generated for that user. Then, new feed data would be generated from that time onwards. We can store this data in a hash table where the “key” would be UserID and “value” would be a STRUCT like this:
Struct { LinkedHashMap<FeedItemID, FeedItem> feedItems; DateTime lastGenerated; }
We can store FeedItemIDs in a data structure similar to Linked HashMap or TreeMap, which can allow us to not only jump to any feed item but also iterate through the map easily. Whenever users want to fetch more feed items, they can send the last FeedItemID they currently see in their newsfeed, we can then jump to that FeedItemID in our hash-map and return next batch/page of feed items from there.
How many feed items should we store in memory for a user’s feed? Initially, we can decide to store 500 feed items per user, but this number can be adjusted later based on the usage pattern. For example, if we assume that one page of a user’s feed has 20 posts and most of the users never browse more than ten pages of their feed, we can decide to store only 200 posts per user. For any user who wants to see more posts (more than what is stored in memory), we can always query backend servers.
Should we generate (and keep in memory) newsfeeds for all users? There will be a lot of users that don’t log-in frequently. Here are a few things we can do to handle this; 1) a more straightforward approach could be, to use an LRU based cache that can remove users from memory that haven’t accessed their newsfeed for a long time 2) a smarter solution can figure out the login pattern of users to pre-generate their newsfeed, e.g., at what time of the day a user is active and which days of the week does a user access their newsfeed? etc.
Let’s now discuss some solutions to our “live updates” problems in the following section.
b. Feed publishing
The process of pushing a post to all the followers is called fanout. By analogy, the push approach is called fanout-on-write, while the pull approach is called fanout-on-load.
Let’s discuss different options for publishing feed data to users.
-
"Pull" model or Fan-out-on-load: This method involves keeping all the recent feed data in memory so that users can pull it from the server whenever they need it. Clients can pull the feed data on a regular basis or manually whenever they need it. Possible problems with this approach are a) New data might not be shown to the users until they issue a pull request, b) It’s hard to find the right pull cadence, as most of the time pull requests will result in an empty response if there is no new data, causing waste of resources.
-
"Push" model or Fan-out-on-write: For a push system, once a user has published a post, we can immediately push this post to all the followers. The advantage is that when fetching feed you don’t need to go through your friend's list and get feeds for each of them. It significantly reduces read operations. To efficiently handle this, users have to maintain a Long Poll request with the server for receiving the updates. A possible problem with this approach is that when a user has millions of followers (a celebrity-user) the server has to push updates to a lot of people.
-
Hybrid: An alternate method to handle feed data could be to use a hybrid approach, i.e., to do a combination of fan-out-on-write and fan-out-on-load. Specifically, we can stop pushing posts from users with a high number of followers (a celebrity user) and only push data for those users who have a few hundred (or thousand) followers. For celebrity users, we can let the followers pull the updates. Since the push operation can be extremely costly for users who have a lot of friends or followers, by disabling fanout for them, we can save a huge number of resources. Another alternate approach could be that, once a user publishes a post, we can limit the fanout to only her online friends. Also, to get benefits from both the approaches, a combination of 'push to notify' and 'pull for serving' end-users is a great way to go. Purely a push or pull model is less versatile.
How many feed items can we return to the client in each request? We should have a maximum limit for the number of items a user can fetch in one request (say 20). But, we should let the client specify how many feed items they want with each request as the user may like to fetch a different number of posts depending on the device (mobile vs. desktop).
Should we always notify users if there are new posts available for their newsfeed? It could be useful for users to get notified whenever new data is available. However, on mobile devices, where data usage is relatively expensive, it can consume unnecessary bandwidth. Hence, at least for mobile devices, we can choose not to push data, instead, let users “Pull to Refresh” to get new posts.
8. Feed Ranking
The most straightforward way to rank posts in a newsfeed is by the creation time of the posts, but today’s ranking algorithms are doing a lot more than that to ensure “important” posts are ranked higher. The high-level idea of ranking is first to select key “signals” that make a post important and then to find out how to combine them to calculate a final ranking score.
More specifically, we can select features that are relevant to the importance of any feed item, e.g., number of likes, comments, shares, time of the update, whether the post has images/videos, etc., and then, a score can be calculated using these features. This is generally enough for a simple ranking system. A better ranking system can significantly improve itself by constantly evaluating if we are making progress in user stickiness, retention, ads revenue, etc.
9. Data Partitioning
a. Sharding posts and metadata
Since we have a huge number of new posts every day and our read load is extremely high too, we need to distribute our data onto multiple machines such that we can read/write it efficiently. For sharding our databases that are storing posts and their metadata, we can have a similar design as discussed under 'Designing Twitter'.
b. Sharding feed data
For feed data, which is being stored in memory, we can partition it based on UserID. We can try storing all the data of a user on one server. When storing, we can pass the UserID to our hash function that will map the user to a cache server where we will store the user’s feed objects. Also, for any given user, since we don’t expect to store more than 500 FeedItemIDs, we will not run into a scenario where feed data for a user doesn’t fit on a single server. To get the feed of a user, we would always have to query only one server. For future growth and replication, we must use 'Consistent Hashing'.
Discussion
On This Page