The System Design Interview Roadmap

Grokking the System Design Interview

Grokking the System Design Interview is the industry’s most trusted System Design Interview course, featuring video lectures, over 30,000 five-star reviews, and more than 140,000 learners.

Beginner

20 h

66 lessons

Introduction to System Design Interview

4 lessons

What is a System Design Interview?

Understand the purpose of system design interviews, focusing on scalability, architecture decisions, trade-offs, and real-world problem-solving skills.

Functional vs. Non-functional Requirements

Master functional vs. non-functional requirements in system design, emphasizing scalability, performance, and trade-offs for building robust, real-world systems.

What are Back-of-the-Envelope Estimations?

Master back-of-the-envelope estimation to assess scalability, performance, and trade-offs using quick calculations in system design interviews.

Things to Avoid During System Design Interview

Learn what to avoid in system design interviews—like skipping trade-offs, ignoring requirements, poor communication, and rigid thinking.

Glossary of System Design Basics

20 lessons

System Design Basics

Get familiar with core system design concepts, key architectural components, and building blocks of scalable, distributed software systems.

Key Characteristics of Distributed Systems

Explore key characteristics of distributed systems, including scalability, reliability, availability, efficiency, fault tolerance, and manageability.

Load Balancing

Learn how load balancing improves scalability and availability by distributing traffic using smart algorithms, health checks, and redundancy techniques.

Load Balancing Algorithms

Caching

Learn caching strategies to improve system performance, including cache types, eviction policies, read/write methods, and cache invalidation techniques.

Data Partitioning

Learn data partitioning techniques—horizontal, vertical, hybrid—and partitioning criteria to scale databases, improve performance, and balance workloads.

Indexes

Understand how database indexes optimize query performance, enable efficient lookups, and impact write operations in large-scale systems.

Proxies

Learn how forward and reverse proxies improve security, caching, and traffic control by mediating communication between clients and servers.

Redundancy and Replication

Learn how redundancy and replication improve system reliability and availability through failover strategies and synchronous, asynchronous, and semi-synchronous replication.

SQL vs. NoSQL

Compare relational and non-relational databases to understand schema flexibility, scalability, ACID compliance, and ideal use cases for each.

CAP Theorem

Understand why distributed systems must trade off between consistency, availability, and partition tolerance—only two of the three can be guaranteed at any time.

PACELC Theorem

Learn how PACELC extends CAP by showing that even without partitions, distributed systems must balance latency and consistency in replicated environments.

Consistent Hashing

Learn the architecture of scalable systems using consistent hashing for efficient data partitioning, replication, and dynamic node management.

Long-Polling vs WebSockets vs Server-Sent Events

Understand the system design of real-time communication protocols for scalable push-based updates between client and server.

Bloom Filters

Explore space-efficient system design using Bloom filters for fast, probabilistic membership checks with minimal memory and no false negatives.

Quorum

Design highly available distributed systems using quorum-based consistency models to coordinate read/write operations across replicated nodes.

Leader and Follower

Understand the architecture of leader-based replication in distributed systems for consistent writes, fault tolerance, and coordinated data synchronization.

Heartbeat

Learn how distributed systems use heartbeat signals to detect server failures and maintain reliable request routing and availability.

Checksum

Explore how distributed systems protect data integrity by using checksums to detect data corruption during transmission.

Quiz

Test your understanding of system design basic concepts.

System Design Trade-offs

22 lessons

Importance of Discussing Trade-offs

Demonstrating trade-offs shows maturity, critical thinking, and real-world design skills crucial for scalable, practical system architecture.

Strong vs Eventual Consistency

Explore consistency models in distributed systems, comparing strong and eventual consistency in terms of latency, availability, accuracy, and scalability trade-offs.

Latency vs Throughput

Understand the difference between latency and throughput, and explore strategies to optimize response time and data processing capacity in systems.

ACID vs BASE Properties in Databases

Understand key differences between ACID and BASE database models, balancing consistency, availability, scalability, and fault tolerance in distributed systems.

Read-Through vs Write-Through Cache

Explore read-through vs write-through caching strategies for improving read performance, data consistency, and integrity in scalable systems.

Batch Processing vs Stream Processing

Compare batch vs stream processing in system design—complexity, resource efficiency, continuous vs scheduled data flow, and real-time responsiveness.

Load Balancer vs. API Gateway

Understand differences between Load Balancer and API Gateway in scalable architectures—traffic distribution, request routing, API management, and availability.

API Gateway vs Direct Service Exposure

Explore API Gateway vs Direct Service Exposure in distributed systems—compare centralized routing and security with direct access and reduced latency.

Proxy vs. Reverse Proxy

Understand how proxies and reverse proxies manage client and server traffic in distributed system design to improve security, caching, and scalability.

API Gateway vs. Reverse Proxy

Learn the difference between API Gateways and Reverse Proxies in distributed systems, focusing on routing, security, load balancing, and orchestration.

SQL vs. NoSQL

Compare SQL vs NoSQL databases based on schema structure, scalability, ACID compliance, flexibility, and best use cases in system design.

Primary-Replica vs Peer-to-Peer Replication

Understand Primary-Replica vs Peer-to-Peer replication models, including data flow, consistency, scalability, fault tolerance, and use cases.

Data Compression vs Data Deduplication

Learn the differences between data compression and deduplication for optimizing storage—focus, scope, efficiency, and best-use scenarios.

Server-Side Caching vs Client-Side Caching

Understand server-side vs client-side caching—how they differ in location, control, performance impact, and best use across web apps.

REST vs RPC

Learn the architectural differences between REST and RPC, including stateless resource handling vs. procedure calls, flexibility, scalability, and performance trade-offs.

Polling vs. Long-Polling vs. WebSockets vs. Webhooks

Compare real-time delivery strategies—Polling, Long-Polling, WebSockets, and Webhooks—for scalable, event-driven system communication.

CDN Usage vs Direct Server Serving

Learn when to use a Content Delivery Network vs direct server hosting based on traffic scale, geographic distribution, and caching efficiency.

Serverless Architecture vs Traditional Server-based

Learn serverless system design benefits like dynamic scaling and reduced ops versus traditional server hosting with full control and overhead.

Stateful vs Stateless Architecture

Explore the architecture of stateful vs stateless systems, comparing session handling, scalability, and design trade-offs for APIs and web services.

Hybrid Cloud Storage vs All-Cloud Storage

Explore hybrid cloud vs all-cloud storage architectures, balancing scalability, compliance, and control for secure, flexible enterprise data strategies.

Token Bucket vs Leaky Bucket

Learn how Token Bucket and Leaky Bucket algorithms manage network traffic shaping, rate limiting, and handling of bursty data.

Read Heavy vs Write Heavy System

Explore how system design varies for read-heavy and write-heavy workloads with techniques like batching, replication, and data partitioning.

System Design Problems

18 lessons

System Design Interviews - A step by step guide

System Design Master Template

Designing a URL Shortening Service like TinyURL

Designing Pastebin

Designing Instagram

Designing Dropbox

Designing Facebook Messenger

Designing Twitter

Designing YouTube or Netflix

Designing Typeahead Suggestion

Designing an API Rate Limiter

Designing Twitter Search

Designing a Web Crawler

Designing Facebook’s Newsfeed

Designing Yelp or Nearby Friends

Designing Uber backend

Designing Ticketmaster

Additional Resources

Appendix

2 lessons

Other courses

Grokking the Advanced System Design Interview

Learn system design through architectural review of real systems.

Advanced

21 h

118 lessons

Introduction

1 lesson

What Is This Course About?

Dynamo: How to design a key value store?

12 lessons

Dynamo: Introduction

High-Level Architecture

Data Partitioning

Replication

Vector Clocks and Conflicting Data

The Life of Dynamo’s put() & get() Operations

Anti-entropy Through Merkle Trees

Gossip Protocol

Dynamo Characteristics and Criticism

Summary: Dynamo

Quiz: Dynamo

Mock Interview: Dynamo

Cassandra: How to Design a Wide-column NoSQL Database?

12 lessons

Cassandra: Introduction

High-level Architecture

Replication

Cassandra Consistency Levels

Gossiper

Anatomy of Cassandra's Write Operation

Anatomy of Cassandra's Read Operation

Mock Interview: Cassandra

Kafka: How to Design a Distributed Messaging System?

13 lessons

Messaging Systems: Introduction

Kafka: Introduction

High-level Architecture

Kafka Delivery Semantics

Kafka Characteristics

Summary: Kafka

Quiz: Kafka

Mock Interview: Kafka

Chubby: How to Design a Distributed Locking Service?

14 lessons

Chubby: Introduction

High-level Architecture

Design Rationale

How Chubby Works

Files, Directories, and Handles

Locks, Sequencers, and Lock-delays

Sessions and Events

Master Election and Chubby Events

Mock Interview: Chubby

GFS: How to Design a Distributed File System Storage?

15 lessons

Google File System: Introduction

High-level Architecture

Single Master and Large Chunk Size

Metadata

Master Operations

Anatomy of a Read Operation

Anatomy of a Write Operation

Anatomy of an Append Operation

GFS Consistency Model and Snapshotting

Fault Tolerance, High Availability, and Data Integrity

HDFS: How to Design File Storage System?

12 lessons

Hadoop Distributed File System: Introduction

High-level Architecture

Deep Dive

Anatomy of a Read Operation

Anatomy of a Write Operation

Data Integrity & Caching

Fault Tolerance

HDFS High Availability (HA)

BigTable: How to Design a Wide Column Storage System?

15 lessons

BigTable: Introduction

BigTable Data Model

System APIs

Partitioning and High-level Architecture

The Life of BigTable's Read & Write Operations

Fault Tolerance and Compaction

BigTable Refinements

BigTable Characteristics

Summary: BigTable

Quiz: BigTable

Mock Interview: BigTable

Final Assessment

2 lessons

Quiz I

Quiz II

Appendix

1 lesson

Grokking Microservices Design Patterns

Master microservices design patterns for designing scalable, resilient, and more manageable systems.

Intermediate

60 h

1 playground

93 lessons

Introduction

2 lessons

Who Should Take This Course?

The Course at a Glance

Strangler Fig Pattern

6 lessons

Introduction

The Problem: Legacy Systems

The Strangler Pattern: A Solution

The Architecture of the Strangler Pattern

Strangler Pattern: A Detailed Example

Key Insights and Implications

API Gateway Pattern

5 lessons

Introduction to the API Gateway Pattern

Advantages of API Gateway Pattern

API Gateway Pattern: An Example

Performance Implications

System Design Example

Backends for Frontends (BFF) Pattern

6 lessons

Introduction to BFF

The Problem: Traditional Backend Models

The Architecture of the BFF Pattern

BFF Pattern: An Example

Performance Implications

System Design Examples

Service Discovery Pattern

9 lessons

What is Service Discovery Pattern?

The Problem: Service Coordination in Distributed Systems

Service Discovery Pattern: A Solution

The Architecture of the Service Discovery Pattern

The Inner Workings of the Service Discovery Pattern

Service Discovery Pattern: An Example

Performance Implications and Special Considerations

System Design Examples

Security Considerations

Circuit Breaker Pattern

7 lessons

Introduction

The Problem: The Struggles of Distributed Systems and Service Failures

The Circuit Breaker Pattern: An Effective Shield Against Cascading Failures

Circuit Breaker Pattern: An Example

Performance Implications and Special Considerations

System Design Examples

Summary

Bulkhead Pattern

9 lessons

Introduction

The Problem: Failure Propagation in Distributed Systems

The Bulkhead Pattern: A Solution

The Architecture

The Inner Workings

Bulkhead Pattern: An Example

Performance Implications and Special Considerations

System Design Examples

Conclusion

Retry Pattern

7 lessons

Introduction

The Retry Pattern: A Solution to Unreliable External Resources

The Architecture of the Retry Pattern

Retry Pattern: An Example

Performance Implications

Use Cases and System Design Examples

Conclusion

Sidecar Pattern

7 lessons

Introduction to the Sidecar Pattern

The Problem: Monolithic Application Management

A Solution to the Monolithic Mayhem

The Architecture of the Sidecar Pattern

Sidecar Pattern: Bringing Theory to Practice with an Example

Performance Implications

System Design Examples: Bringing the Sidecar Pattern to Life

Saga Pattern

9 lessons

Introduction to Saga Pattern

The Problem: Traditional Transaction Models

The Saga Pattern: A Solution

The Architecture of the Saga Pattern

The Inner Workings of the Saga Pattern

Saga Pattern: An Example

Performance Implications

System Design Examples

Conclusion

Event-Driven Architecture Pattern

9 lessons

Introduction

The Problem: Managing Complex Interactions in Distributed Systems

Event-Driven Architecture: A Promising Solution

The Architecture of the Event-Driven Architecture Pattern

The Inner Workings of the Event-Driven Architecture Pattern

Event-Driven Architecture Pattern: An Example

Performance Implications and Special Considerations

Use Cases and System Design Examples

Conclusion

CQRS (Command Query Responsibility Segregation)

8 lessons

Introduction

The Problem: Traditional CRUD Operations

CQRS Pattern: A Solution

The Architecture of the CQRS Pattern

The Inner Workings of the CQRS Pattern

CQRS Pattern: An Example

Issues, Special Considerations, and Performance Implications

System Design Examples

Configuration Externalization Pattern

8 lessons

Introduction

The Problem: Configuration Management in a Microservices Architecture

The Solution: Configuration Externalization Pattern

Unveiling the Architecture: How Does Configuration Externalization Work?

Delving into Code: An Example

Considerations and Implications

Use Cases and Real-world Examples

Conclusion

Course Wrap-up

1 lesson

Embrace the Future of Software Architecture

Grokking Design Patterns for Engineers and Managers

Unlock the power of design patterns: Elevate your coding skills with timeless solutions for top-notch software design.

Beginner

25 h

22 playgrounds

31 lessons
