Usage of API Gateway
In modern software architectures, especially those utilizing microservices, there are often many small, independent services that handle specific tasks. Managing communication between these numerous services and the clients can become complex. An API Gateway simplifies this by providing a single entry point for all client requests.
Key Usages of API Gateways
1. Request Routing
Usage: Directing incoming client requests to the appropriate backend service.
Example: Suppose you have an e-commerce application with separate services for user management, product catalog, and order processing. When a client requests product details, the API Gateway routes this request to the product catalog service. If the client wants to place an order, the gateway directs the request to the order processing service.
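As a rough illustration of the routing step, the sketch below maps path prefixes to backend names and picks the longest matching prefix. The route table, the service names, and the resolve_backend function are assumptions made up for this example, not part of any particular gateway product.

```python
from typing import Optional

# Hypothetical route table: path prefix -> backend service name.
ROUTES = {
    "/users": "user-service",
    "/products": "product-catalog-service",
    "/orders": "order-processing-service",
}


def resolve_backend(path: str) -> Optional[str]:
    """Return the backend for the longest matching path prefix, or None."""
    matches = [
        prefix for prefix in ROUTES
        if path == prefix or path.startswith(prefix + "/")
    ]
    if not matches:
        return None
    return ROUTES[max(matches, key=len)]
```

A request for /products/42 would resolve to product-catalog-service, while an unknown path resolves to None, which a gateway would usually turn into a 404 response.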
2. Aggregation of Multiple Services
Usage: Combining responses from multiple backend services into a single response to the client.
Example: A mobile app needs to display user profile information along with recent orders and recommendations. Instead of the client making separate requests to each service, the API Gateway can fetch data from the user service, order service, and recommendation service, then compile and send a unified response to the client.
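The aggregation described above might look roughly like the following sketch. The three fetch_* functions are hypothetical stand-ins for network calls to the user, order, and recommendation services; a thread pool is one simple way to issue them in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for calls to three backend services.
def fetch_profile(user_id):
    return {"id": user_id, "name": "Ada"}


def fetch_orders(user_id):
    return [{"order_id": 1, "user_id": user_id}]


def fetch_recommendations(user_id):
    return ["sku-7", "sku-9"]


def get_dashboard(user_id):
    """Call the three services concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        profile = pool.submit(fetch_profile, user_id)
        orders = pool.submit(fetch_orders, user_id)
        recs = pool.submit(fetch_recommendations, user_id)
        return {
            "profile": profile.result(),
            "orders": orders.result(),
            "recommendations": recs.result(),
        }
```

The client makes one request and receives one combined payload, and the total latency is close to that of the slowest backend call rather than the sum of all three.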
3. Security Enforcement
Usage: Implementing security measures such as authentication, authorization, and rate limiting.
Example: Before a request reaches any backend service, the API Gateway can verify the user's authentication token to ensure they are logged in. It can also check if the user has the necessary permissions to access certain data and limit the number of requests from a single user to prevent abuse.
4. Load Balancing
Usage: Distributing incoming requests across multiple instances of a backend service so that no single instance becomes a bottleneck.
Example: If your application experiences high traffic, the API Gateway can distribute incoming requests for the product catalog service across several server instances, ensuring efficient use of resources and maintaining performance.
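One of the simplest distribution strategies is round robin, sketched below. The RoundRobinBalancer class and the instance addresses are illustrative assumptions; real gateways often also factor in health checks and current load.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backend instances in a repeating, fixed order."""

    def __init__(self, instances):
        instances = list(instances)
        if not instances:
            raise ValueError("at least one instance is required")
        self._cycle = itertools.cycle(instances)

    def next_instance(self):
        return next(self._cycle)


# Hypothetical instance addresses for the product catalog service.
balancer = RoundRobinBalancer(["catalog-1:8080", "catalog-2:8080", "catalog-3:8080"])
first_four = [balancer.next_instance() for _ in range(4)]
```

After the third request the balancer wraps around, so the fourth request goes back to the first instance.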
5. Caching Responses
Usage: Storing frequently requested data to reduce latency and decrease the load on backend services.
Example: If the product catalog doesn't change frequently, the API Gateway can cache product information. When a client requests product details, the gateway can serve the cached data instead of querying the product catalog service every time, resulting in faster response times.
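A minimal sketch of this idea is a cache with a time-to-live (TTL), shown below. The TTLCache class, its parameters, and the loader callback are assumptions for illustration; the injectable clock is there only to make the behavior easy to test.

```python
import time


class TTLCache:
    """Cache values for a fixed number of seconds, then reload them."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: serve from the cache
        value = loader(key)  # missing or expired: ask the backend
        self._store[key] = (now + self._ttl, value)
        return value
```

Here loader stands in for the call to the product catalog service. A short TTL keeps data fresher at the cost of more backend calls; a longer TTL does the opposite.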
6. Protocol Translation
Usage: Converting requests and responses between different protocols used by clients and backend services.
Example: A client might send JSON requests over HTTP/HTTPS, while some backend services expose gRPC or WebSocket interfaces. The API Gateway can handle the necessary protocol conversions, so clients and services can communicate without either side needing to know which protocol the other uses.
7. Monitoring and Logging
Usage: Tracking and recording request and response data for analysis, debugging, and performance monitoring.
Example: The API Gateway can log all incoming requests, including details like request paths, response times, and error rates. This information is invaluable for identifying performance issues, understanding usage patterns, and troubleshooting problems.
8. Transformation of Requests and Responses
Usage: Modifying the data format or structure of requests and responses to meet the needs of clients or services.
Example: Suppose a client expects data in JSON format, but a backend service provides data in XML. The API Gateway can transform the XML response into JSON before sending it to the client, ensuring compatibility without requiring changes to the backend service.
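The XML-to-JSON case can be sketched with the Python standard library as follows. The xml_to_json function is a hypothetical helper and assumes a flat, one-level document; nested elements and attributes would need extra handling.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_json(xml_text: str) -> str:
    """Convert a flat XML document (one level of children) to JSON."""
    root = ET.fromstring(xml_text)
    payload = {root.tag: {child.tag: child.text for child in root}}
    return json.dumps(payload)


backend_response = "<product><id>42</id><name>Lamp</name></product>"
client_response = xml_to_json(backend_response)
```

Note that every value comes out as a string, because XML text carries no type information; a real transformation would typically apply a schema to restore numbers and booleans.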
9. API Versioning
Usage: Managing different versions of APIs to ensure backward compatibility and smooth transitions when updates are made.
Example: Imagine you have a mobile app that relies on your backend services. When you update the API to add new features or make changes, older versions of the app might still need to interact with the previous API version. The API Gateway can route requests to different backend service versions based on the API version specified in the request, ensuring that both old and new clients operate seamlessly without disruption.
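A simple way to picture version-based routing is the sketch below. It assumes the version appears either as the first path segment (for example /v2/orders) or in a header; the X-API-Version header name, the backend names, and the default version are all illustrative assumptions.

```python
# Hypothetical mapping from API version to backend deployment.
BACKENDS = {"v1": "orders-service-v1", "v2": "orders-service-v2"}
DEFAULT_VERSION = "v1"


def pick_backend(path, headers):
    """Choose a backend from the path prefix, then the header, then the default."""
    first_segment = path.strip("/").split("/", 1)[0]
    if first_segment in BACKENDS:
        version = first_segment
    else:
        version = headers.get("X-API-Version", DEFAULT_VERSION)
    if version not in BACKENDS:
        raise ValueError(f"unsupported API version: {version}")
    return BACKENDS[version]
```

Older app builds that send no version at all keep reaching v1, while newer builds opt in to v2 explicitly.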
10. Rate Limiting and Throttling
Usage: Controlling the number of requests a client can make in a given time frame to protect backend services from being overwhelmed.
Example: Suppose your API is publicly accessible and you want to prevent any single user from making too many requests in a short period, which could degrade performance for others. The API Gateway can enforce rate limits, such as allowing a maximum of 100 requests per minute per user. If a user exceeds this limit, the gateway rejects further requests until the window resets, typically with an HTTP 429 (Too Many Requests) response, ensuring fair usage and maintaining service stability.
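The per-user limit described above can be sketched as a fixed-window counter. The FixedWindowLimiter class is a hypothetical example, and fixed windows are only one of several possible algorithms (token bucket and sliding window are common alternatives).

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per user in each fixed time window."""

    def __init__(self, limit=100, window_seconds=60, clock=time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._state = {}  # user_id -> (window_index, request_count)

    def allow(self, user_id):
        window = int(self._clock() // self._window)
        start, count = self._state.get(user_id, (window, 0))
        if start != window:
            start, count = window, 0  # a new window began: reset the count
        if count >= self._limit:
            return False
        self._state[user_id] = (start, count + 1)
        return True
```

A known weakness of fixed windows is that a client can send a burst at the end of one window and another at the start of the next, briefly exceeding the intended rate.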
11. API Monetization
Usage: Enabling businesses to monetize their APIs by controlling access, usage tiers, and billing.
Example: A company provides a public API for accessing weather data. Using an API Gateway, they can create different subscription tiers (e.g., free, basic, premium) with varying levels of access and usage limits. The gateway can handle authentication, track usage based on subscription plans, and integrate with billing systems to charge users accordingly. This setup allows the company to generate revenue from their API offerings effectively.
12. Service Discovery Integration
Usage: Facilitating dynamic discovery of backend services, especially in environments where services are frequently scaled up or down.
Example: In a microservices environment using Kubernetes, services can scale dynamically based on demand. The API Gateway can integrate with a service discovery tool (like Consul or Eureka) to automatically route requests to the appropriate service instances, even as they change. This ensures that clients always connect to available and healthy service instances without manual configuration.
13. Circuit Breaker Pattern Implementation
Usage: Preventing cascading failures by detecting when a backend service is failing and stopping requests to it temporarily.
Example: If your order processing service is experiencing issues and becomes unresponsive, the API Gateway can detect the failure pattern and activate a circuit breaker. This means the gateway will stop sending new requests to the problematic service for a specified period, allowing it time to recover. During this time, the gateway can return fallback responses to clients, maintaining overall system stability.
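The behavior described above can be sketched as a small state machine. The CircuitBreaker class below is a simplified, single-threaded illustration with assumed parameter names; production implementations usually add thread safety, per-route state, and metrics.

```python
import time


class CircuitBreaker:
    """Stop calling a failing backend for a while and serve a fallback instead."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                return fallback()  # open: skip the backend entirely
            # Timeout elapsed: half-open, let one trial request through.
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            return fallback()
        self._failures = 0
        self._opened_at = None  # success closes the circuit
        return result
```

Here func stands in for the call to the order processing service, and fallback returns whatever degraded response the gateway chooses to send while the circuit is open.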
14. Content-Based Routing
Usage: Routing requests to different backend services based on the content of the request, such as headers, body, or query parameters.
Example: Consider an API that handles different types of media uploads (images, videos, documents). The API Gateway can inspect the Content-Type header of incoming requests and route them to specialized backend services optimized for handling each media type. This ensures that each type of content is processed efficiently by the appropriate service.
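The media-upload case might be sketched like this. The MEDIA_ROUTES table and the service names are assumptions for illustration, and treating every application/* type as a document is a deliberate simplification.

```python
# Hypothetical mapping from the main media type to a backend service.
MEDIA_ROUTES = {
    "image": "image-service",
    "video": "video-service",
    "application": "document-service",
}


def route_upload(headers):
    """Pick a backend from the Content-Type header, or None if unrecognized."""
    content_type = headers.get("Content-Type", "")
    # "video/mp4; codecs=avc1" -> "video/mp4" -> "video"
    media_type = content_type.split(";", 1)[0].strip().lower()
    main_type = media_type.split("/", 1)[0]
    return MEDIA_ROUTES.get(main_type)
```

Parameters after the semicolon are ignored, so image/png and image/png; charset=binary route to the same service.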
15. SSL Termination
Usage: Handling SSL/TLS encryption and decryption at the gateway level to offload this resource-intensive task from backend services.
Example: Instead of each backend service managing its own SSL/TLS certificates and handling encryption, the API Gateway can terminate the TLS connection. Clients communicate securely with the gateway over HTTPS, and the gateway forwards requests to backend services over plain HTTP inside a trusted internal network (or re-encrypts them if internal traffic must also be protected). This simplifies certificate management and reduces the computational load on backend services.
16. Policy Enforcement
Usage: Applying organizational policies consistently across all API traffic, such as data validation, request formatting, and access controls.
Example: Your organization might have policies requiring that all incoming data be validated for specific fields or that certain headers are present in requests. The API Gateway can enforce these policies by validating incoming requests before they reach backend services. If a request doesn't comply, the gateway can reject it with an appropriate error message, ensuring that only well-formed and authorized requests are processed.
17. Multi-Tenancy Support
Usage: Supporting multiple clients or tenants within a single API infrastructure while ensuring data isolation and customized configurations.
Example: A SaaS platform serves multiple businesses, each considered a tenant. The API Gateway can distinguish between tenants based on headers or authentication tokens and route requests to tenant-specific services or databases. It can also apply tenant-specific rate limits, logging, and security policies, ensuring that each tenant operates in a secure and isolated environment.
18. A/B Testing and Canary Releases
Usage: Facilitating controlled testing of new features or services by directing a subset of traffic to different backend versions.
Example: When deploying a new version of the user recommendation service, you might want to test its performance and impact on user experience without affecting all users. The API Gateway can route a small percentage of requests to the new version (canary release) while the majority continue using the stable version. This approach allows you to monitor the new service's behavior and roll it out more broadly once it's proven reliable.
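One way to split traffic for a canary release is to hash a stable identifier into 100 buckets, as in the sketch below. The pick_version function and the use of a user ID as the key are assumptions; hashing keeps each user on the same version across requests, which random selection per request would not.

```python
import hashlib


def pick_version(user_id: str, canary_percent: int) -> str:
    """Deterministically assign a user to the canary or the stable version."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # 0..99
    return "canary" if bucket < canary_percent else "stable"
```

Raising canary_percent step by step (for example 1, 10, 50, 100) widens the rollout, and users already on the canary stay on it because their bucket number does not change.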
19. Localization and Internationalization Support
Usage: Adapting responses based on the client's locale, such as language preferences or regional settings.
Example: If your application serves users in different countries, the API Gateway can detect the user's locale from request headers or parameters and modify responses accordingly. For instance, it can format dates, numbers, or currencies to match the user's regional standards or serve localized content by fetching data from region-specific backend services.
20. Reducing Client Complexity
Usage: Simplifying the client-side logic by handling complex operations on the server side through the gateway.
Example: A client application might need to perform multiple operations to complete a user registration process, such as creating a user account, sending a welcome email, and logging the registration event. Instead of the client making separate API calls for each operation, the API Gateway can expose a single endpoint that orchestrates these actions behind the scenes. This reduces the complexity of the client code and minimizes the number of network requests.
Real-World Example: Netflix
Netflix is a prime example of a company that leverages API Gateways extensively. With a vast array of microservices handling everything from user profiles to streaming content, Netflix uses its API Gateway to manage and streamline interactions between its clients (such as smart TVs and mobile apps) and the multitude of backend services. This setup helps the platform scale, stay reliable, and handle massive amounts of traffic efficiently.
Conclusion
An API Gateway is a powerful component in modern software architectures, especially those utilizing microservices. It simplifies client interactions, enhances security, improves performance, and provides a centralized point for managing various aspects of client-server communication.