What are Back-of-the-Envelope Estimations?

Updated Feb 2025

Back-of-the-envelope estimations in system design interviews are quick, rough calculations, like the ones you might do on a napkin during lunch: not detailed or exact, but good enough for a ballpark figure. They help you quickly assess the feasibility of a proposed solution, estimate its performance, and identify potential bottlenecks.

Purpose

Back-of-the-envelope estimation is a technique used to quickly approximate values and make rough calculations using simple arithmetic and basic assumptions. This method is particularly useful in system design interviews, where interviewers expect candidates to make informed decisions and trade-offs based on rough estimates.

Why is Estimation Important in System Design Interviews?

During a system design interview, you’ll be asked to design a scalable and reliable system based on a set of requirements. Your ability to make quick estimations is essential for several reasons:

  1. Indicate system scalability: Estimates show that you understand how the system can grow or adapt.
  2. Validate proposed solutions: Estimation helps you ensure that your proposed architecture meets the requirements and can handle the expected load.
  3. Identify bottlenecks: Quick calculations help you identify potential performance bottlenecks and make necessary adjustments to your design.
  4. Demonstrate your thought process: Estimation showcases your ability to make informed decisions and trade-offs based on a set of assumptions and constraints.
  5. Communicate effectively: Providing estimates helps you communicate your design choices and their implications to the interviewer.
  6. Make quick decisions: Swift estimations guide your design decisions as the discussion moves along.

Estimation Techniques

1. Rule of thumb

Rules of thumb are general guidelines or principles that can be applied to make quick and reasonably accurate estimations. They are based on experience and observation, and while not always precise, they can provide valuable insights in the absence of detailed information. For example, estimating that a user will generate 1 MB of data per day on a social media platform can serve as a starting point for capacity planning.

2. Approximation

Approximation involves simplifying complex calculations by rounding numbers or using easier-to-compute values. This technique can help derive rough estimates quickly and with minimal effort. For instance, rounding 1,024 down to 1,000 (for example, treating 1 KB as 1,000 bytes) when estimating storage requirements simplifies the arithmetic and still provides a reasonable approximation.
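To see how little precision this kind of rounding costs, the following minimal sketch measures the relative error of using 1,000 in place of 1,024. The helper name `relative_error` is illustrative and not part of the original text.

```python
def relative_error(exact: float, approx: float) -> float:
    """Return the relative error |approx - exact| / exact."""
    return abs(approx - exact) / exact

# Rounding 1,024 down to 1,000 in a single step.
err_one_step = relative_error(1024, 1000)

# The error compounds with each power: 1024**3 bytes vs. 1000**3 bytes.
err_three_steps = relative_error(1024**3, 1000**3)

print(f"{err_one_step:.1%}")     # 2.3%
print(f"{err_three_steps:.1%}")  # 6.9%
```

Even after three rounding steps the error stays under 10%, which is usually well within the tolerance of a back-of-the-envelope estimate.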

3. Breakdown and aggregation

Breaking down a problem into smaller components and estimating each separately can make it easier to derive an overall estimate. This technique involves identifying the key components of a system, estimating their individual requirements, and then aggregating these estimates to determine the total system requirements. For example, estimating the storage needs for user data, multimedia content, and metadata separately can help in determining the overall storage requirements of a social media platform.
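As a rough illustration of breakdown and aggregation, the sketch below estimates each storage component separately and then sums them. Every number in it (the per-user figures and the user count) is a hypothetical placeholder, not a value from the text.

```python
# Hypothetical per-user daily storage in KB for each component.
# These are placeholder figures for illustration, not measurements.
components_kb_per_user_per_day = {
    "user_data": 10,
    "multimedia": 1_000,
    "metadata": 5,
}
users = 1_000_000  # assumed user count

# Aggregate the individual estimates into one per-user figure.
total_kb_per_user = sum(components_kb_per_user_per_day.values())

# Scale to all users and convert KB -> GB (decimal units: 1 GB = 1,000,000 KB).
total_gb_per_day = total_kb_per_user * users / 1_000_000

print(total_kb_per_user)  # 1015
print(total_gb_per_day)   # 1015.0
```

Keeping the components in a dictionary makes it easy to see which one dominates (here, multimedia) and to revise a single assumption without redoing the whole estimate.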

4. Sanity check

A sanity check is a quick evaluation of an estimate to ensure its plausibility and reasonableness. This step helps identify potential errors or oversights in the estimation process and can lead to more accurate and reliable results. For example, comparing the estimated storage requirements for a messaging service with the actual storage used by a similar existing service can help validate the estimate.
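One simple way to express a sanity check is to ask whether an estimate falls within an order of magnitude of a known reference. The sketch below does that; the function name `is_plausible`, the factor of 10, and the sample figures are all assumptions made for illustration.

```python
def is_plausible(estimate: float, reference: float, max_factor: float = 10.0) -> bool:
    """Return True if estimate is within max_factor of reference, in either direction."""
    ratio = estimate / reference
    return 1 / max_factor <= ratio <= max_factor

# Hypothetical figures in TB/day, for illustration only.
estimated_tb_per_day = 5.0
reference_tb_per_day = 3.0  # storage used by a comparable service (assumed)

print(is_plausible(estimated_tb_per_day, reference_tb_per_day))  # True
print(is_plausible(5_000.0, reference_tb_per_day))               # False
```

An estimate that fails this check is not necessarily wrong, but it signals that an assumption or a unit conversion deserves a second look.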

Types of Estimations in System Design Interviews

In system design interviews, there are several types of estimations you may need to make:

  1. Load estimation: Predict the expected number of requests per second, data volume, or user traffic for the system.
  2. Storage estimation: Estimate the amount of storage required to handle the data generated by the system.
  3. Bandwidth estimation: Determine the network bandwidth needed to support the expected traffic and data transfer.
  4. Latency estimation: Predict the response time and latency of the system based on its architecture and components.
  5. Resource estimation: Estimate the number of servers, CPUs, or memory required to handle the load and maintain desired performance levels.

Process

  1. Understand the Scope: Clarify the scale of the problem - how many users, how much data, etc.
  2. Use Simple Math: Utilize basic arithmetic to estimate the scale of data and resources.
  3. Round Numbers for Simplicity: Use round numbers to make calculations easier and faster.
  4. Be Logical and Reasonable: Ensure your estimations make sense given the context of the problem.

Practical Examples

1. Load Estimation

Suppose you’re asked to design a social media platform with 100 million daily active users (DAU) and an average of 10 posts per user per day. To estimate the load, you’d calculate the total number of posts generated daily:

100 million DAU * 10 posts/user = 1 billion posts/day

Then, you can estimate the average request rate per second (peak traffic will typically be higher):

1 billion posts/day / 86,400 seconds/day ≈ 11,574 requests/second
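The same load calculation can be written as a short script. The inputs come from the example above; the peak multiplier of 2 is an added assumption, included only to show how an average rate might be scaled to a peak rate.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

dau = 100_000_000            # daily active users (from the example)
posts_per_user_per_day = 10  # from the example

posts_per_day = dau * posts_per_user_per_day
avg_posts_per_second = posts_per_day / SECONDS_PER_DAY

# Assumption (not from the text): peak traffic is about 2x the daily average.
PEAK_FACTOR = 2
peak_posts_per_second = avg_posts_per_second * PEAK_FACTOR

print(posts_per_day)                 # 1000000000
print(round(avg_posts_per_second))   # 11574
print(round(peak_posts_per_second))  # 23148
```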

2. Storage Estimation

Consider a photo-sharing app with 500 million users and an average of 2 photos uploaded per user per day. Each photo has an average size of 2 MB. To estimate the storage required for one day’s worth of photos, you’d calculate:

500 million users * 2 photos/user * 2 MB/photo = 2,000,000,000 MB/day ≈ 2 PB/day
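A short sketch of the storage calculation, with the result converted into larger units. It uses decimal units (1 TB = 1,000,000 MB), and the yearly projection assumes a constant upload rate, which is a simplification.

```python
users = 500_000_000          # from the example
photos_per_user_per_day = 2  # from the example
photo_size_mb = 2            # from the example

mb_per_day = users * photos_per_user_per_day * photo_size_mb

# Decimal unit conversions: 1 TB = 1,000,000 MB, 1 PB = 1,000 TB.
tb_per_day = mb_per_day / 1_000_000
pb_per_day = tb_per_day / 1_000

# Assumption: the upload rate stays constant over a 365-day year.
pb_per_year = pb_per_day * 365

print(mb_per_day)   # 2000000000
print(tb_per_day)   # 2000.0
print(pb_per_day)   # 2.0
print(pb_per_year)  # 730.0
```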

3. Bandwidth Estimation

For a video streaming service with 10 million users concurrently streaming 1080p video at 4 Mbps each, you can estimate the required bandwidth:

10 million users * 4 Mbps = 40,000,000 Mbps = 40 Tbps
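The bandwidth figure is easier to reason about in larger units. The sketch below repeats the calculation and converts it, assuming all 10 million streams are active at the same time and using decimal units.

```python
concurrent_streams = 10_000_000  # from the example, assumed simultaneous
bitrate_mbps = 4                 # megabits per second per stream

total_mbps = concurrent_streams * bitrate_mbps
total_gbps = total_mbps / 1_000
total_tbps = total_gbps / 1_000

# Bits to bytes: 8 bits per byte, so Tbps / 8 gives terabytes per second.
total_tbytes_per_second = total_tbps / 8

print(total_mbps)               # 40000000
print(total_tbps)               # 40.0
print(total_tbytes_per_second)  # 5.0
```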

4. Latency Estimation

Suppose you’re designing an API that fetches data from three sources, and you know that their average latencies are 50 ms, 100 ms, and 200 ms, respectively. If the data fetching process is sequential, you can estimate the total latency as follows:

50 ms + 100 ms + 200 ms = 350 ms

If the data fetching process is parallel, the total latency would be the maximum latency among the sources:

max(50 ms, 100 ms, 200 ms) = 200 ms
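Both cases reduce to a one-line calculation over the list of source latencies. This sketch ignores any overhead for dispatching or merging the calls, which a real system would add on top.

```python
source_latencies_ms = [50, 100, 200]  # from the example

# Sequential: each call waits for the previous one, so latencies add up.
sequential_ms = sum(source_latencies_ms)

# Parallel: all calls start together, so the slowest source dominates.
parallel_ms = max(source_latencies_ms)

print(sequential_ms)  # 350
print(parallel_ms)    # 200
```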

5. Resource Estimation

Imagine you’re designing a web application that receives 10,000 requests per second, with each request requiring 10 ms of CPU time. To estimate the number of CPU cores needed, you can calculate the total CPU time per second:

10,000 requests/second * 10 ms/request = 100,000 ms/second

Assuming each CPU core provides 1,000 ms of processing time per second (that is, it runs at full utilization), the number of cores required would be:

100,000 ms/second / 1,000 ms/second per core = 100 cores
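The core count can be computed the same way in code. The 70% utilization target below is an added assumption, not part of the example; it shows how headroom changes the answer.

```python
import math

requests_per_second = 10_000     # from the example
cpu_ms_per_request = 10          # from the example
MS_PER_CORE_PER_SECOND = 1_000   # one core supplies 1,000 ms of CPU time per second

cpu_ms_per_second = requests_per_second * cpu_ms_per_request
cores_at_full_utilization = math.ceil(cpu_ms_per_second / MS_PER_CORE_PER_SECOND)

# Assumption (not from the text): run cores at 70% utilization to leave headroom.
TARGET_UTILIZATION = 0.7
cores_with_headroom = math.ceil(cores_at_full_utilization / TARGET_UTILIZATION)

print(cpu_ms_per_second)          # 100000
print(cores_at_full_utilization)  # 100
print(cores_with_headroom)        # 143
```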

System Design Examples

1. Designing a messaging service

Imagine you are tasked with designing a messaging service similar to WhatsApp. To estimate the system’s requirements, you can start by considering the following aspects:

  • Number of users: Estimate the total number of users for the platform. This can be based on market research, competitor analysis, or historical data.
  • Messages per user per day: Estimate the average number of messages sent by each user per day. This can be based on user behavior patterns or industry benchmarks.
  • Message size: Estimate the average size of a message, considering text, images, videos, and other media content.
  • Storage requirements: Calculate the total storage needed to store messages for a specified retention period, taking into account the number of users, messages per user, message size, and data redundancy.
  • Bandwidth requirements: Estimate the bandwidth needed to handle the message traffic between users, considering the number of users, messages per user, and message size.

By breaking down the problem into smaller components and applying estimation techniques, you can derive a rough idea of the messaging service’s requirements, which can guide your design choices and resource allocation.
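The aspects listed above can be combined into a small worked estimate. All input values in this sketch are hypothetical placeholders chosen for round arithmetic; they are not figures for WhatsApp or any real service.

```python
SECONDS_PER_DAY = 86_400

# Hypothetical inputs, for illustration only.
users = 1_000_000_000           # assumed daily active users
messages_per_user_per_day = 50  # assumed
avg_message_bytes = 100         # assumed average size, text only
replication_factor = 3          # assumed data redundancy
retention_days = 365            # assumed retention period

messages_per_day = users * messages_per_user_per_day
messages_per_second = messages_per_day / SECONDS_PER_DAY

# Storage: raw daily volume, then replicated and kept for the retention period.
raw_bytes_per_day = messages_per_day * avg_message_bytes
stored_tb = raw_bytes_per_day * replication_factor * retention_days / 1e12

# Bandwidth: average inbound bit rate, in Mbps.
ingress_mbps = raw_bytes_per_day * 8 / SECONDS_PER_DAY / 1e6

print(round(messages_per_second))  # 578704
print(stored_tb)                   # 5475.0
print(round(ingress_mbps))         # 463
```

Adding media attachments would mostly change `avg_message_bytes`, which is why message size is usually the assumption worth confirming with the interviewer first.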

2. Designing a video streaming platform

Suppose you are designing a video streaming platform similar to Netflix. To estimate the system’s requirements, consider the following aspects:

  • Number of users: Estimate the total number of users for the platform based on market research, competitor analysis, or historical data.
  • Concurrent users: Estimate the number of users who will be streaming videos simultaneously during peak hours.
  • Video size and bitrate: Estimate the average size and bitrate of videos on the platform, considering various resolutions and encoding formats.
  • Storage requirements: Calculate the total storage needed to store the video content, taking into account the number of videos, their sizes, and data redundancy.
  • Bandwidth requirements: Estimate the bandwidth needed to handle the video streaming traffic, considering the number of concurrent users, video bitrates, and user locations.

By applying estimation techniques and aggregating the individual estimates, you can get a ballpark figure of the video streaming platform’s requirements, which can inform your design decisions and resource allocation.
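A comparable sketch for the streaming platform follows. Every input is a hypothetical placeholder rather than a real Netflix figure, and the sketch simplifies by assuming every encoding of a title uses the same average bitrate.

```python
# Hypothetical inputs, for illustration only.
concurrent_viewers = 5_000_000  # assumed peak concurrent streams
avg_bitrate_mbps = 4            # assumed average bitrate per stream
catalog_titles = 10_000         # assumed number of videos
avg_title_hours = 1.5           # assumed average duration
encodings_per_title = 5         # assumed resolution/format variants
replication_factor = 3          # assumed data redundancy

# Bandwidth: peak egress in Tbps (1 Tbps = 1,000,000 Mbps).
peak_egress_tbps = concurrent_viewers * avg_bitrate_mbps / 1e6

# Storage: megabits/second * seconds / 8 gives megabytes per encoding.
mb_per_encoding = avg_bitrate_mbps * avg_title_hours * 3600 / 8
catalog_tb = (
    catalog_titles * encodings_per_title * mb_per_encoding * replication_factor / 1e6
)

print(peak_egress_tbps)  # 20.0
print(mb_per_encoding)   # 2700.0
print(catalog_tb)        # 405.0
```

With these placeholder numbers, egress bandwidth is the harder constraint than catalog storage, which is the kind of observation that helps decide where to focus the design discussion.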

Tips for Successful Estimation in Interviews

Estimation plays a crucial role in system design interviews, as it helps you make informed decisions about your design and demonstrates your understanding of the various factors that impact the performance and scalability of a system. Here are some tips to help you ace the estimation part of your interviews:

1. Break down the problem

When faced with a complex system design problem, break it down into smaller, more manageable components. This will make it easier to estimate each component’s requirements and help you understand how they interact with each other. By identifying the key components and estimating their requirements separately, you can then aggregate your estimates to get a comprehensive view of the system’s needs.

2. Use reasonable assumptions

During an interview, you may not have all the necessary information to make precise estimations. In such cases, make reasonable assumptions based on your knowledge of similar systems, industry standards, or user behavior patterns. Clearly state your assumptions to the interviewer, as this demonstrates your thought process and enables them to provide feedback or correct your assumptions if necessary.

The following commonly cited latency figures can help ground such assumptions:

Operation Name                          Time
L1 cache reference                      0.5 ns
Branch mispredict                       5 ns
L2 cache reference                      7 ns
Mutex lock/unlock                       100 ns
Main memory reference                   100 ns
Compress 1K bytes with Zippy            10,000 ns = 10 μs
Send 2K bytes over 1 Gbps network       20,000 ns = 20 μs
Read 1 MB sequentially from memory      250,000 ns = 250 μs
Round trip within the same datacenter   500,000 ns = 500 μs
Disk seek                               10,000,000 ns = 10 ms
Read 1 MB sequentially from network     10,000,000 ns = 10 ms
Read 1 MB sequentially from disk        30,000,000 ns = 30 ms
Send packet CA→Netherlands→CA           150,000,000 ns = 150 ms
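The latency figures listed above can be plugged directly into an estimate. As a minimal sketch, the code below compares the time to read 100 MB from memory, over the network, and from disk. The 100 MB size and the helper `read_time_ms` are illustrative assumptions, and the model (one fixed cost plus a per-MB cost) is deliberately simple.

```python
# Latency figures in nanoseconds, taken from the list above.
LATENCY_NS = {
    "read_1mb_memory": 250_000,
    "datacenter_round_trip": 500_000,
    "disk_seek": 10_000_000,
    "read_1mb_network": 10_000_000,
    "read_1mb_disk": 30_000_000,
}

def read_time_ms(size_mb: int, per_mb_ns: int, fixed_ns: int = 0) -> float:
    """Estimate read time in ms as one fixed cost plus a sequential per-MB cost."""
    return (fixed_ns + size_mb * per_mb_ns) / 1_000_000

SIZE_MB = 100  # hypothetical payload size

memory_ms = read_time_ms(SIZE_MB, LATENCY_NS["read_1mb_memory"])
network_ms = read_time_ms(
    SIZE_MB, LATENCY_NS["read_1mb_network"], LATENCY_NS["datacenter_round_trip"]
)
disk_ms = read_time_ms(SIZE_MB, LATENCY_NS["read_1mb_disk"], LATENCY_NS["disk_seek"])

print(memory_ms)   # 25.0
print(network_ms)  # 1000.5
print(disk_ms)     # 3010.0
```

The two orders of magnitude between memory and disk are the kind of gap that justifies stating an assumption such as "hot data is served from cache" out loud.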

3. Leverage your experience

Drawing from your past experiences can be beneficial when estimating system requirements. If you have worked on similar systems or have experience with certain technologies, use that knowledge to inform your estimations. This will not only help you make more accurate estimations but also showcase your expertise to the interviewer.

4. Be prepared to adjust your estimations

As you progress through the interview, the interviewer may provide additional information or challenge your assumptions, requiring you to adjust your estimations. Be prepared to adapt and revise your estimations accordingly. This demonstrates your ability to think critically and shows that you can handle changing requirements in a real-world scenario.

5. Don’t forget to ask clarifying questions

Don’t hesitate to ask the interviewer clarifying questions if you’re unsure about a requirement or assumption. This will help you avoid making incorrect estimations and showcase your problem-solving abilities.

6. Communicate your thought process

Throughout the estimation process, communicate your thought process clearly to the interviewer. Explain how you arrived at your estimations and the assumptions you made along the way. This allows the interviewer to understand your reasoning, provide feedback, and assess your problem-solving skills.

Conclusion

Back-of-the-envelope estimations are crucial in system design interviews as they showcase your ability to grasp the scale of a system quickly and assess the feasibility and resource needs of your design. It's a skill that demonstrates both technical knowledge and practical problem-solving ability.
