System Design
Learn System Design
Introduction to System Design
How to Learn System Design?
Functional vs. Non-functional Requirements
What are Back-of-the-Envelope Estimations?
Things to Avoid During System Design Interview
System Design Basics
API Gateway
Introduction to API Gateway
Usage of API gateway
Advantages and disadvantages of using API gateway
Key Characteristics of Distributed Systems
Scalability
Availability
Latency and Performance
Concurrency and Coordination
Monitoring and Observability
Resilience and Error Handling
Fault Tolerance vs. High Availability
Network Essentials
HTTP vs. HTTPS
TCP vs. UDP
HTTP: 1.0 vs. 1.1 vs 2.0 vs. 3.0
URL vs. URI vs. URN
Domain Name System (DNS)
Introduction to DNS
DNS Resolution Process
DNS Load Balancing and High Availability
Caching
Introduction to Caching
Why is Caching Important?
Types of Caching
Cache Replacement Policies
Cache Invalidation
Cache Read Strategies
Cache Coherence and Consistency Models
Caching Challenges
Cache Performance Metrics
CDN
What is CDN?
Origin Server vs. Edge Server
CDN Architecture
Push CDN vs. Pull CDN
Data Partitioning
Introduction to Data Partitioning
Partitioning Methods
Data Sharding Techniques
Benefits of Data Partitioning
Common Problems Associated with Data Partitioning
Proxies
What is a Proxy Server?
Uses of Proxies
VPN vs. Proxy Server
Redundancy and Replication
What is Redundancy?
What is Replication?
Replication Methods
Data Backup vs. Disaster Recovery
CAP & PACELC Theorems
Introduction to CAP Theorem
Components of CAP Theorem
Trade-offs in CAP Theorem
Examples of CAP Theorem in Practice
Beyond CAP Theorem
System Design Trade-offs in Interviews
Databases (SQL vs. NoSQL)
Introduction to Databases
SQL Databases
NoSQL Databases
SQL vs. NoSQL
ACID vs BASE Properties
Real-World Examples and Case Studies
SQL Normalization and Denormalization
In-Memory Database vs. On-Disk Database
Data Replication vs. Data Mirroring
Database Federation
Indexes
What are Indexes?
Types of Indexes
Bloom Filters
Introduction to Bloom Filters
Benefits & Limitations of Bloom Filters
Variants and Extensions of Bloom Filters
Applications of Bloom Filters
Long-Polling vs. WebSockets vs. Server-Sent Events
Difference Between Long-Polling, WebSockets, and Server-Sent Events
Quorum
Why Quorum?
What is Quorum?
Heartbeat
What is Heartbeat?
Checksum
What is Checksum?
Uses of Checksum
Leader and Follower
What is Leader and Follower Pattern?
Security
What is Security and Privacy?
What is Authentication?
What is Authorization?
Authentication vs. Authorization
OAuth vs. JWT for Authentication
What is Encryption?
What are DDoS Attacks?
Distributed Messaging System
Introduction to Messaging System
Introduction to Kafka
Messaging patterns
Popular Messaging Queue Systems
RabbitMQ vs. Kafka vs. ActiveMQ
Scalability and Performance
Distributed File Systems
What is a Distributed File System?
Architecture of a Distributed File System
Key Components of a DFS
Misc Concepts
Batch Processing vs. Stream Processing
XML vs. JSON
Synchronous vs. Asynchronous Communication
Push vs. Pull Notification Systems
Microservices vs. Serverless Architecture
Message Queues vs. Service Bus
Stateful vs. Stateless Architecture
Event-Driven vs. Polling Architecture
Quiz - System Design Fundamentals
Quiz
System Design Trade-offs
Importance of Discussing Trade-offs
Strong vs Eventual Consistency
Latency vs Throughput
ACID vs BASE Properties in Databases
Read-Through vs Write-Through Cache
Batch Processing vs Stream Processing
Load Balancer vs. API Gateway
API Gateway vs Direct Service Exposure
Proxy vs. Reverse Proxy
API Gateway vs. Reverse Proxy
SQL vs. NoSQL
Primary-Replica vs Peer-to-Peer Replication
Data Compression vs Data Deduplication
Server-Side Caching vs Client-Side Caching
REST vs RPC
Polling vs. Long-Polling vs. WebSockets vs. Webhooks
CDN Usage vs Direct Server Serving
Serverless Architecture vs Traditional Server-based
Stateful vs Stateless Architecture
Hybrid Cloud Storage vs All-Cloud Storage
Token Bucket vs Leaky Bucket
Read Heavy vs Write Heavy System
Quiz
System Design Master Template
System Design Interviews - A step by step guide
System Design Master Template
Designing a URL Shortening Service like TinyURL
Designing a URL Shortening Service like TinyURL
Quiz - Designing URL Shortner
Designing Pastebin
Designing Pastebin
Quiz - Designing Pastebin
Designing Instagram
Designing Instagram
Quiz - Designing Instagram
Designing Dropbox
Designing Dropbox
Quiz - Designing Dropbox
Designing Facebook Messenger
Designing Facebook Messenger
Quiz - Designing Facebook Messenger
Designing Twitter
Designing Twitter
Quiz - Designing Twitter
Designing Youtube or Netflix
Designing Youtube or Netflix
Quiz - Designing Youtube
Designing Typeahead Suggestion
Designing Typeahead Suggestion
Quiz - Designing Typeahead Suggestion
Designing an API Rate Limiter
Designing an API Rate Limiter
Quiz - Designing an API Rate Limiter
Designing Twitter Search
Designing Twitter Search
Quiz - Designing Twitter Search
Designing a Web Crawler
Designing a Web Crawler
Quiz - Designing a Web Crawler
Designing Facebook’s Newsfeed
Designing Facebook’s Newsfeed
Quiz - Designing Facebook’s Newsfeed
Designing Yelp or Nearby Friends
Designing Yelp or Nearby Friends
Quiz - Designing Yelp or Nearby Friends
Designing Uber backend
Designing Uber backend
Quiz - Designing Uber backend
Designing Ticketmaster
Designing Ticketmaster
Quiz - Designing Ticketmaster
Dynamo: How to design a key value store?
Dynamo: Introduction
High-Level Architecture
Data Partitioning
Replication
Vector Clocks and Conflicting Data
The Life of Dynamo’s put() & get() Operations
Anti-entropy Through Merkle Trees
Gossip Protocol
Dynamo Characteristics and Criticism
Summary: Dynamo
Quiz: Dynamo
Mock Interview: Dynamo
Designing YouTube Likes Counter (medium)
YouTube Likes Counter
Quiz
Cassandra: How to Design a Wide-column NoSQL Database?
Cassandra: Introduction
High-level Architecture
Replication
Cassandra Consistency Levels
Gossiper
Anatomy of Cassandra's Write Operation
Anatomy of Cassandra's Read Operation
Compaction
Tombstones
Summary: Cassandra
Quiz: Cassandra
Mock Interview: Cassandra
Kafka: How to Design a Distributed Messaging System?
Messaging Systems: Introduction
Kafka: Introduction
High-level Architecture
Kafka: Deep Dive
Consumer Groups
Kafka Workflow
Role of ZooKeeper
Controller Broker
Kafka Delivery Semantics
Kafka Characteristics
Summary: Kafka
Quiz: Kafka
Mock Interview: Kafka
Chubby: How to Design a Distributed Locking Service?
Chubby: Introduction
High-level Architecture
Design Rationale
How Chubby Works
File, Directories, and Handles
Locks, Sequencers, and Lock-delays
Sessions and Events
Master Election and Chubby Events
Caching
Database
Scaling Chubby
Summary: Chubby
Quiz: Chubby
Mock Interview: Chubby
HDFS: How to Design File Storage System?
Hadoop Distributed File System: Introduction
High-level Architecture
Deep Dive
Anatomy of a Read Operation
Anatomy of a Write Operation
Data Integrity & Caching
Fault Tolerance
HDFS High Availability (HA)
HDFS Characteristics
Summary: HDFS
Quiz: HDFS
Mock Interview: HDFS
GFS: How to Design a Distributed File System Storage?
Google File System: Introduction
High-level Architecture
Single Master and Large Chunk Size
Metadata
Master Operations
Anatomy of a Read Operation
Anatomy of a Write Operation
Anatomy of an Append Operation
GFS Consistency Model and Snapshotting
Fault Tolerance, High Availability, and Data Integrity
Garbage Collection
Criticism on GFS
Summary: GFS
Quiz: GFS
Mock Interview: GFS
BigTable: How to Design a Wide Column Storage System?
BigTable: Introduction
BigTable Data Model
System APIs
Partitioning and High-level Architecture
SSTable
GFS and Chubby
Bigtable Components
Working with Tablets
The Life of BigTable's Read & Write Operations
Fault Tolerance and Compaction
BigTable Refinements
BigTable Characteristics
Summary: BigTable
Quiz: BigTable
Mock Interview: BigTable
Designing Reddit (medium)
Design Reddit
Quiz
Designing Notification Service (medium)
Designing a Notification System
Quiz
Design Google Calendar (medium)
Design Google calendar (Medium)
Quiz
Design a Recommendation System (medium)
Design a Recommendation System for Netflix
Quiz
Designing Gmail (medium)
Design Gmail
Quiz
Designing Google News (medium)
Design Google News, a Global News Aggregator System (Medium)
Quiz
Designing Unique ID Generator (medium)
Design Unique ID Generator (Easy)
Quiz
Designing Code Judging System (medium)
Design Code Judging System like LeetCode (Medium)
Quiz
Designing Payment System (hard)
Design Payment System
Quiz
Designing Flash Sale System (hard)
Design a Flash Sale for an E-commerce Site (Hard)
Quiz
Designing Reminder Alert System (hard)
Design a Reminder Alert System
Quiz
System Design Patterns
Introduction: System Design Patterns
1. Bloom Filters
2. Consistent Hashing
3. Quorum
4. Leader and Follower
5. Write-ahead Log
6. Segmented Log
7. High-Water Mark
8. Lease
9. Heartbeat
10. Gossip Protocol
11. Phi Accrual Failure Detection
12. Split Brain
13. Fencing
14. Checksum
15. Vector Clocks
16. CAP Theorem
17. PACELC Theorem
18. Hinted Handoff
19. Read Repair
20. Merkle Trees
Quiz
Load Balancing Algorithms
load balancing algorithms
round robin
least connections
ip hash
+2
A load balancing algorithm is a method used by a load balancer to distribute incoming traffic and requests among multiple servers or resources. The primary purpose of a load balancing algorithm is to ensure efficient utilization of available resources, improve overall system performance, and maintain high availability and reliability.
Load balancing algorithms help to prevent any single server or resource from becoming overwhelmed, which could lead to performance degradation or failure. By distributing the workload, load balancing algorithms can optimize response times, maximize throughput, and enhance user experience. These algorithms can consider factors such as server capacity, active connections, response times, and server health, among others, to make informed decisions on how to best distribute incoming requests.
Here are the most famous load balancing algorithms:
1. Round Robin
This algorithm distributes incoming requests to servers in a cyclic order. It assigns a request to the first server, then moves to the second, third, and so on, and after reaching the last server, it starts again at the first.
Pros:
- Ensures an equal distribution of requests among the servers, as each server gets a turn in a fixed order.
- Easy to implement and understand.
- Works well when servers have similar capacities.
Cons:
- No Load Awareness: Does not take into account the current load or capacity of each server. All servers are treated equally regardless of their current state.
- No Session Affinity: Subsequent requests from the same client may be directed to different servers, which can be problematic for stateful applications.
- Performance Issues with Different Capacities: May not perform optimally when servers have different capacities or varying workloads.
- Predictable Distribution Pattern: Round Robin is predictable in its request distribution pattern, which could potentially be exploited by attackers who can observe traffic patterns and might find vulnerabilities in specific servers by predicting which server will handle their requests.
Use Cases
- Homogeneous Environments: Suitable for environments where all servers have similar capacity and performance.
- Stateless Applications: Works well for stateless applications where each request can be handled independently.
2. Least Connections
The Least Connections algorithm is a dynamic load balancing technique that assigns incoming requests to the server with the fewest active connections at the time of the request. This method ensures a more balanced distribution of load across servers, especially in environments where traffic patterns are unpredictable and request processing times vary.
Pros:
- Load Awareness: Takes into account the current load on each server by considering the number of active connections, leading to better utilization of server resources.
- Dynamic Distribution: Adapts to changing traffic patterns and server loads, ensuring no single server becomes a bottleneck.
- Efficiency in Heterogeneous Environments: Performs well when servers have varying capacities and workloads, as it dynamically allocates requests to less busy servers.
Cons:
- Higher Complexity: More complex to implement compared to simpler algorithms like Round Robin, as it requires real-time monitoring of active connections.
- State Maintenance: Requires the load balancer to maintain the state of active connections, which can increase overhead.
- Potential for Connection Spikes: In scenarios where connection duration is short, servers can experience rapid spikes in connection counts, leading to frequent rebalancing.
Use Cases
- Heterogeneous Environments: Suitable for environments where servers have different capacities and workloads, and the load needs to be dynamically distributed.
- Variable Traffic Patterns: Works well for applications with unpredictable or highly variable traffic patterns, ensuring that no single server is overwhelmed.
- Stateful Applications: Effective for applications where maintaining session state is important, as it helps distribute active sessions more evenly.
Comparison to Round Robin
- Round Robin: Distributes requests in a fixed, cyclic order without considering the current load on each server.
- Least Connections: Distributes requests based on the current load, directing new requests to the server with the fewest active connections.
3. Weighted Round Robin
Weighted Round Robin (WRR) is an enhanced version of the Round Robin load balancing algorithm. It assigns weights to each server based on their capacity or performance, distributing incoming requests proportionally according to these weights. This ensures that more powerful servers handle a larger share of the load, while less powerful servers handle a smaller share.
Pros
- Load Distribution According to Capacity: Servers with higher capacities handle more requests, leading to better utilization of resources.
- Flexibility: Easily adjustable to accommodate changes in server capacities or additions of new servers.
- Improved Performance: Helps in optimizing overall system performance by preventing overloading of less powerful servers.
Cons
- Complexity in Weight Assignment: Determining appropriate weights for each server can be challenging and requires accurate performance metrics.
- Increased Overhead: Managing and updating weights can introduce additional overhead, especially in dynamic environments where server performance fluctuates.
- Not Ideal for Highly Variable Loads: In environments with highly variable load patterns, WRR may not always provide optimal load balancing as it doesn't consider real-time server load.
Use Cases
- Heterogeneous Server Environments: Ideal for environments where servers have different processing capabilities, ensuring efficient use of resources.
- Scalable Web Applications: Suitable for web applications where different servers may have varying performance characteristics.
- Database Clusters: Useful in database clusters where some nodes have higher processing power and can handle more queries.
4. Weighted Least Connections
Weighted Least Connections is an advanced load balancing algorithm that combines the principles of the Least Connections and Weighted Round Robin algorithms. It takes into account both the current load (number of active connections) on each server and the relative capacity of each server (weight). This approach ensures that more powerful servers handle a proportionally larger share of the load, while also dynamically adjusting to the real-time load on each server.
Pros
- Dynamic Load Balancing: Adjusts to the real-time load on each server, ensuring a more balanced distribution of requests.
- Capacity Awareness: Takes into account the relative capacity of each server, leading to better utilization of resources.
- Flexibility: Can handle environments with heterogeneous servers and variable load patterns effectively.
Cons
- Complexity: More complex to implement compared to simpler algorithms like Round Robin and Least Connections.
- State Maintenance: Requires the load balancer to keep track of both active connections and server weights, increasing overhead.
- Weight Assignment: Determining appropriate weights for each server can be challenging and requires accurate performance metrics.
Use Cases
- Heterogeneous Server Environments: Ideal for environments where servers have different processing capacities and workloads.
- High Traffic Web Applications: Suitable for web applications with variable traffic patterns, ensuring no single server becomes a bottleneck.
- Database Clusters: Useful in database clusters where nodes have varying performance capabilities and query loads.
5. IP Hash
IP Hash load balancing is a technique that assigns client requests to servers based on the client's IP address. The load balancer uses a hash function to convert the client's IP address into a hash value, which is then used to determine which server should handle the request. This method ensures that requests from the same client IP address are consistently routed to the same server, providing session persistence.
Example
Suppose you have three servers (Server A, Server B, and Server C) and a client with the IP address 192.168.1.10. The load balancer applies a hash function to this IP address, resulting in a hash value. If the hash value is 2 and there are three servers, the load balancer routes the request to Server C (2 % 3 = 2).
Pros
- Session Persistence: Ensures that requests from the same client IP address are consistently routed to the same server, which is beneficial for stateful applications.
- Simplicity: Easy to implement and does not require the load balancer to maintain the state of connections.
- Deterministic: Predictable and consistent routing based on the client's IP address.
Cons
- Uneven Distribution: If client IP addresses are not evenly distributed, some servers may receive more requests than others, leading to an uneven load.
- Dynamic Changes: Adding or removing servers can disrupt the hash mapping, causing some clients to be routed to different servers.
- Limited Flexibility: Does not take into account the current load or capacity of servers, which can lead to inefficiencies.
Use Cases
- Stateful Applications: Ideal for applications where maintaining session persistence is important, such as online shopping carts or user sessions.
- Geographically Distributed Clients: Useful when clients are distributed across different regions and consistent routing is required.
6. Least Response Time
Least Response Time load balancing is a dynamic algorithm that assigns incoming requests to the server with the lowest response time, ensuring efficient utilization of server resources and optimal client experience. This approach aims to direct traffic to the server that can handle the request the fastest, based on recent performance metrics.
How Least Response Time Load Balancing Works
- Monitor Response Times: The load balancer continuously monitors the response times of each server. Response time is typically measured from when a request is sent to a server until a response is received.
- Assign Requests: When a new request arrives, the load balancer assigns it to the server with the lowest average response time.
- Dynamic Adjustment: The load balancer dynamically adjusts the assignment of requests based on real-time performance data, ensuring that the fastest server handles the next request.
Pros
- Optimized Performance: Ensures that requests are handled by the fastest available server, leading to reduced latency and improved client experience.
- Dynamic Load Balancing: Continuously adjusts to changing server performance, ensuring optimal distribution of load.
- Effective Resource Utilization: Helps in better utilization of server resources by directing traffic to servers that can respond quickly.
Cons
- Complexity: More complex to implement compared to simpler algorithms like Round Robin, as it requires continuous monitoring of server performance.
- Overhead: Monitoring response times and dynamically adjusting the load can introduce additional overhead.
- Short-Term Variability: Response times can vary in the short term due to network fluctuations or transient server issues, potentially causing frequent rebalancing.
Use Cases
- Real-Time Applications: Ideal for applications where low latency and fast response times are critical, such as online gaming, video streaming, or financial trading platforms.
- Web Services: Useful for web services and APIs that need to provide quick responses to user requests.
- Dynamic Environments: Suitable for environments with fluctuating loads and varying server performance.
7. Random
Random load balancing is a simple algorithm that distributes incoming requests to servers randomly. Instead of following a fixed sequence or using performance metrics, the load balancer selects a server at random to handle each request. This method can be effective in scenarios where the load is relatively uniform and servers have similar capacities.
Suppose you have three servers: Server A, Server B, and Server C. When a new request arrives, the load balancer randomly chooses one of these servers to handle the request. Over time, if the randomness is uniform, each server should receive approximately the same number of requests.
Pros
- Simplicity: Very easy to implement and understand, requiring minimal configuration.
- No State Maintenance: The load balancer does not need to track the state or performance of servers, reducing overhead.
- Uniform Distribution Over Time: If the random selection is uniform, the load will be evenly distributed across servers over a long period.
Cons
- No Load Awareness: Does not consider the current load or capacity of servers, which can lead to uneven distribution if server performance varies.
- Potential for Imbalance: In the short term, random selection can lead to an uneven distribution of requests.
- No Session Affinity: Requests from the same client may be directed to different servers, which can be problematic for stateful applications.
- Security systems that rely on detecting anomalies (e.g., to mitigate DDoS attacks) might find it slightly more challenging to identify malicious patterns if a Random algorithm is used, due to the inherent unpredictability in request distribution. This could potentially dilute the visibility of attack patterns.
Use Cases
- Homogeneous Environments: Suitable for environments where servers have similar capacity and performance.
- Stateless Applications: Works well for stateless applications where each request can be handled independently.
- Simple Deployments: Ideal for simple deployments where the complexity of other load balancing algorithms is not justified.
8. Least Bandwidth
The Least Bandwidth load balancing algorithm distributes incoming requests to servers based on the current bandwidth usage. It routes each new request to the server that is consuming the least amount of bandwidth at the time. This approach helps in balancing the network load more efficiently by ensuring that no single server gets overwhelmed with too much data traffic.
Pros
- Dynamic Load Balancing: Continuously adjusts to the current network load, ensuring optimal distribution of traffic.
- Prevents Overloading: Helps in preventing any single server from being overwhelmed with too much data traffic, leading to better performance and stability.
- Efficient Resource Utilization: Ensures that all servers are utilized more effectively by balancing the bandwidth usage.
Cons
- Complexity: More complex to implement compared to simpler algorithms like Round Robin, as it requires continuous monitoring of bandwidth usage.
- Overhead: Monitoring bandwidth and dynamically adjusting the load can introduce additional overhead.
- Short-Term Variability: Bandwidth usage can fluctuate in the short term, potentially causing frequent rebalancing.
Use Cases
- High Bandwidth Applications: Ideal for applications with high bandwidth usage, such as video streaming, file downloads, and large data transfers.
- Content Delivery Networks (CDNs): Useful for CDNs that need to balance traffic efficiently to deliver content quickly.
- Real-Time Applications: Suitable for real-time applications where maintaining low latency is critical.
9. Custom Load
Custom Load load balancing is a flexible and highly configurable approach that allows you to define your own metrics and rules for distributing incoming traffic across a pool of servers. Unlike standard load balancing algorithms that use predefined criteria such as connection count or response time, Custom Load load balancing enables you to tailor the distribution strategy based on specific requirements and conditions unique to your application or infrastructure.
How Custom Load Load Balancing Works
-
Define Custom Metrics: Determine the metrics that best represent the load or performance characteristics relevant to your application. These metrics can include CPU usage, memory usage, disk I/O, application-specific metrics, or a combination of several metrics.
-
Implement Monitoring: Continuously monitor the defined metrics on each server in the pool. This may involve integrating with monitoring tools or custom scripts that collect and report the necessary data.
-
Create Load Balancing Rules: Establish rules and algorithms that use the monitored metrics to make load balancing decisions. This can be a simple weighted sum of metrics or more complex logic that prioritizes certain metrics over others.
-
Dynamic Adjustment: Use the collected data and rules to dynamically adjust the distribution of incoming requests, ensuring that the traffic is balanced according to the custom load criteria.
Pros
- Flexibility: Allows for highly customized load balancing strategies tailored to the specific needs and performance characteristics of your application.
- Optimized Resource Utilization: Can lead to more efficient use of server resources by considering a comprehensive set of metrics.
- Adaptability: Easily adaptable to changing conditions and requirements, making it suitable for complex and dynamic environments.
Cons
- Complexity: More complex to implement and configure compared to standard load balancing algorithms.
- Monitoring Overhead: Requires continuous monitoring of multiple metrics, which can introduce additional overhead.
- Potential for Misconfiguration: Incorrectly defined metrics or rules can lead to suboptimal load balancing and performance issues.
Use Cases
- Complex Applications: Ideal for applications with complex performance characteristics and varying resource requirements.
- Highly Dynamic Environments: Suitable for environments where workloads and server performance can change rapidly and unpredictably.
- Custom Requirements: Useful when standard load balancing algorithms do not meet the specific needs of the application.
Discussion
On This Page