How to Learn System Design?

easy · 10 min

This is a complete guide to learning system design and preparing for system design interviews, structured for engineers at every level. Whether you're a junior engineer facing your first system design round or a staff candidate at a top-tier company, the path is the same: master the foundations, study how patterns compose, then practice. What changes by level is how deep you go.

For levels: Junior → Mid → Senior → Staff
Time investment: 4-24 weeks (level-dependent)

00. Quick Orientation

Most "learn system design" guides target a single audience and miss everyone else. A junior engineer doesn't need to know how to design a globally distributed payment system. A staff candidate doesn't need to be told what a load balancer is. This guide acknowledges the difference: it walks through what's true at every level, then breaks out level-specific expectations so you can calibrate your prep.

A few things to know before you commit to this:

  • Time investment scales with level. Junior prep can be done in 4-6 weeks of focused study. Senior prep typically takes 12-16 weeks. Staff prep is closer to ongoing development of opinions over months. The path is the same; the depth and volume of practice change.
  • The interview tests reasoning, not memorization. Strong candidates at every level reason about tradeoffs given requirements. The single biggest mistake is trying to memorize "the right design for X". Interviewers detect this immediately and probe until your reasoning falls apart.
  • Practice is what makes the knowledge stick. Reading is necessary but not sufficient. The plan below alternates structured reading with deliberate practice; the practice is what closes the gap between "I've heard of this" and "I can do this on a whiteboard under pressure."
  • Use the level guide. Skip ahead to what each level expects if you're unsure where to calibrate. The bar at L4 is dramatically different from L6; preparing as if they're the same wastes time at one end and underprepares you at the other.

What follows: the reframe of what interviews actually test, the level-by-level expectations from junior to staff, prerequisites, the three-phase learning path, practical advice on how to practice, the most common pitfalls, curated resources, and a concrete next step.

01. What System Design Interviews Actually Test

The most important thing to understand before you start preparing: system design interviews are not testing whether you've memorized 10 architectures. They're testing whether you can reason about tradeoffs given requirements.

If you read enough prep material, you'll get the impression that there's a canonical answer for each question. "For chat, use WebSockets and at-least-once delivery." "For social feeds, use push fan-out with a hybrid for celebrities." "For dispatch, use geohashing with multi-objective scoring." These are real patterns and they show up in real designs, but treating them as the answer is the wrong mental model. The answer depends on the requirements, and interviewers deliberately vary the requirements to see whether you actually understand the tradeoffs or just memorized a canonical solution.

A more useful mental model: a system design interview is a structured conversation in which you decompose a problem into constraints, map constraints to a small set of architectural primitives (caching, sharding, queues, replication, indexing, and so on), and reason about which primitives compose into a defensible design. The primitives are the vocabulary; the requirements are the sentence; the design is the prose you produce in real time.

What interviewers actually score on

Across companies and levels, the signals interviewers consistently weight:

  • Can you clarify requirements before designing? Strong candidates spend the first five minutes asking targeted questions: scope, scale, latency, consistency, edge cases. Weak candidates jump to architecture before anyone has stated the problem.
  • Can you decompose at the right level of abstraction? Strong candidates produce a defensible high-level architecture (5-8 components) before drilling into any one of them. Weak candidates either get stuck at the abstract level or burn early time on a single component.
  • Can you reason about tradeoffs? When asked "why this database?" strong candidates name the alternative and articulate the criterion. Weak candidates default to "because that's what's used for this kind of system."
  • Can you go deep on the hardest 1-2 problems? Strong candidates protect their deep-dive time and use it on what actually matters (the celebrity problem, the consistency boundary, the failure mode). Weak candidates spread time evenly across components, going deep on nothing.
  • Can you recognize and articulate failure modes? Senior and staff candidates name what would break under stress before the interviewer asks. This signals operational thinking.

Notice what's not on this list: knowing every database in existence, having built systems at FAANG scale, or having read every distributed systems paper. Interviewers assume some baseline knowledge and test reasoning on top of it. The baseline scales with level (covered below); the reasoning tests are similar across levels.

The reframe in one sentence

System design interviews test reasoning about tradeoffs under constraints. The foundations are a small set of primitives (caching, sharding, replication, queues, indexing). Patterns are compositions of those primitives. Every design you'll produce is some composition chosen to satisfy the specific requirements. Internalize this and the rest of the prep becomes obvious.

02. What Each Level Expects

The most useful single piece of prep advice: calibrate to your level. The bar at L3 is dramatically lower than L6, and over-preparing or under-preparing wastes weeks. Below, the honest picture of what each level looks like in practice. Use this to set your scope and time investment.

Junior · L3 / IC1

~0-2 years experience · 4-6 weeks prep

System design rounds are often skipped at the junior level. When they're included, they're scoped narrowly and graded gently. The goal is to confirm you've thought about how systems work beyond a single function, not to test deep distributed systems knowledge.

Typical questions: Design a URL shortener (TinyURL). Design a simple parking lot system. Design a basic chat app. Design a todo list backend. Scope: small, often single-server or modest scale.
Depth expected: Client/server basics, request/response, simple database schemas, basic indexes, the idea of caching. Not expected to handle millions of users or complex sharding.
Signal looked for: Can you reason about a system at all? Do you understand what a database is for, what an API is, why we'd add a cache? The bar is "shows engineering judgment beyond writing functions."
Common pitfalls: Over-engineering. Junior candidates who design like they're at staff level get penalized for adding complexity that doesn't fit the scope. Stay simple.

Mid-Level · L4 / IC2

~2-5 years experience · 8-12 weeks prep

System design rounds are standard at L4. Scope expands to apps with thousands to low millions of users. The bar is "you know the foundational primitives and can assemble them into a defensible design under interview time pressure."

Typical questions: Design Twitter (basic version). Design WhatsApp (one-on-one chat). Design Instagram. Design a rate limiter. Design a notification system. Scope: medium, with explicit scale numbers in the requirements.
Depth expected: Foundational primitives: caching strategies, basic sharding, replication, queues, simple indexing. Familiarity with at least one canonical pattern (social feed, chat, rate limiting). Basic failure handling.
Signal looked for: Do you know the building blocks? Can you defend choices ("Postgres because the workload is read-heavy and we need transactions")? Can you handle one or two follow-up probes without falling apart?
Common pitfalls: Memorizing one design for each canonical question without understanding why. Interviewers detect this and tweak the requirements; if you can't adapt, the gap shows immediately.

Senior · L5 / IC3

~5-10 years experience · 12-16 weeks prep

Senior interviews are where system design becomes the dominant signal. Top-tier companies often run multiple system design rounds. Scope expands to systems serving millions or hundreds of millions of users. The bar is "fluent across all canonical patterns, deep in at least one area, articulate on tradeoffs."

Typical questions: Design Twitter (full version, including the celebrity problem). Design Uber. Design Slack or Discord. Design YouTube or Netflix. Design Dropbox. Design Yelp or Foursquare.
Depth expected: All canonical patterns. Deep tradeoff understanding (when push fan-out beats pull, when at-least-once is acceptable, when strong consistency is required). Strong on at least one specialized area: search, geospatial, streaming, real-time, payments, etc.
Signal looked for: Can you handle deliberate ambiguity? Can you go deep when probed without losing the thread? Do you name failure modes before being asked? Can you articulate "why this not that" for every major choice?
Common pitfalls: Going wide instead of deep. Senior candidates who try to cover every component shallowly are weaker than those who pick the hardest 1-2 problems and demonstrate real depth there. Protect your deep-dive time.

Staff · L6+ / IC4+

~10+ years experience · ongoing development

Staff interviews shift the emphasis toward systems thinking and judgment. Pure architecture knowledge is assumed; what's tested is whether you can navigate ambiguity, set technical direction, and articulate the why behind hard calls. Questions often have no canonical answer.

Typical questions: Often product-shaped: design fraud detection for a payments platform, design a recommendation system for a streaming service, design a compliance audit pipeline. Or specialized: design a globally consistent payment system, design a multi-region search index.
Depth expected: Strong opinions on tradeoffs across all dimensions: consistency, availability, durability, scalability, evolvability, operational cost. Depth in multiple specialized areas. Comfort with cross-system concerns: organizational structure, migration paths, deprecation strategies.
Signal looked for: Can you set technical direction? Can you make hard calls and defend them? Can you reason about second-order effects (what happens to the team's velocity, what's the migration cost, how does this evolve over five years)? Can you handle a question with no obvious right answer?
Common pitfalls: Treating staff interviews like senior interviews with bigger numbers. The shift is qualitative, not quantitative. Staff candidates who stay in "architecture" mode without articulating organizational and evolutionary concerns underperform.

A note on level boundaries

Companies don't always agree on what's L4 vs L5 vs L6. The expectations above describe typical bars; specific companies will calibrate slightly differently. The most useful preparation isn't optimizing for a specific company's rubric but building genuine fluency at the level you're targeting. If you're solidly above bar at L5, you'll perform well at most companies' L5 interviews regardless of variance.

03. Prerequisites

You don't need a CS degree to learn system design, but you do need a few things in place. If any of these aren't true yet, address them first; trying to learn distributed systems while shaky on the basics is harder than it needs to be.

  • You can program at a working level. Not "expert," but you can read and write code, debug failures, work with a database. You've used at least one programming language enough to feel comfortable.
  • You understand HTTP at a working level. Requests and responses, status codes, headers, cookies. You don't need to know the spec by heart, but you should know what happens when you type a URL into a browser.
  • You understand databases at the API level. You've used a database (any database). You know what a query is, what an index does, why some queries are fast and others are slow. You don't need to be a DBA.
  • You have intuition about latency vs throughput. "How fast does one request complete" is different from "how many requests can the system handle per second." If this distinction is unfamiliar, spend a few hours with it before continuing.
  • You've used a multi-server system. Even casually. You've deployed something to two machines behind a load balancer, or you've used a cloud service that you know runs across multiple regions. The intuition that "one machine isn't enough" matters.

If any of these aren't yet true, the gap is bridgeable in a week or two of focused study. The cheapest path is usually a small project: deploy a web app to two cloud servers behind a load balancer, with a database. The act of getting it working teaches you what these abstractions actually mean.
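To make the latency-versus-throughput distinction from the list above concrete, here is a minimal sketch based on Little's Law (in-flight requests = throughput × latency). The worker counts and latencies are made-up illustrative numbers, not measurements from any real system.

```python
# Illustrative numbers only; none of these come from a real system.

def throughput_rps(concurrency: int, latency_s: float) -> float:
    """Little's Law rearranged: requests/second = in-flight requests / latency."""
    return concurrency / latency_s

# One worker at 100 ms per request: low latency, low throughput.
single = throughput_rps(concurrency=1, latency_s=0.1)

# Fifty workers at 200 ms per request: each request is slower,
# yet the system as a whole completes far more requests per second.
pooled = throughput_rps(concurrency=50, latency_s=0.2)

print(f"single worker: {single:.0f} req/s")
print(f"50 workers:    {pooled:.0f} req/s")
```

The second configuration has worse latency and much higher throughput, which is why the two numbers need to be stated separately in any requirements discussion.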

What you don't need before starting: the canonical distributed systems theory (consensus, vector clocks, FLP impossibility), formal CAP theorem proofs, or deep networking internals. These are valuable but not prerequisites. They become more interesting once you've built intuition; trying to learn them cold is slow and the returns are weak.

04. The Three-Phase Learning Path

Regardless of your level, the structure of effective prep is the same: foundations first, then composition, then practice. What changes by level is how deep you go in each phase and how much practice volume you need before you're interview-ready.

[Figure: Learn System Design Path]

Three phases, sequential. Foundations builds the vocabulary of primitives. Composition shows how the primitives assemble into recognizable patterns. Practice is where the knowledge becomes interview-ready. Each phase has explicit graduation criteria so you know when to move on. Volume scales with target level.

Phase 1: Build foundations in the primitives

Foundations · Master the architectural primitives · ~2-6 weeks (level-dependent) · ~5-10 hours/week

The goal of phase 1 is deep familiarity with the architectural primitives that every system design composes. There are roughly ten of them. They're the vocabulary you'll use in every later phase.

The primitives, in the order most useful to learn:

  • Data tier first. Caching, then database selection, then sharding, then replication and consistency. The data tier is the load-bearing part of most systems; everything else depends on it.
  • Traffic tier next. Load balancing, then message queues, then rate limiting. These shape how requests flow through the system.
  • Operational tier third. Search and indexing, then observability. Search is its own pattern with specialized data structures; observability is what makes systems operable in production.
  • Modern tier last. Vector databases. Newer pattern, only relevant for AI-augmented systems, but increasingly probed in interviews at top companies.

Level-specific scope: Junior candidates need conversational familiarity with these (you can describe what each does and when you'd reach for it). Mid-level candidates need working understanding (you can defend tradeoff choices for at least the first six). Senior candidates need fluency across all ten with deep tradeoff awareness. Staff candidates need fluency plus opinions about second-order tradeoffs and operational implications.

For each primitive, do this: read a thorough deep-dive, then explain it back to someone (or out loud, or write a one-page summary). The articulation is what transfers it from passive recognition to active fluency. If you can't explain when to use a write-through cache versus a write-behind cache without notes, you don't yet know caching well enough; reread.
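As one way to check your own understanding of the write-through versus write-behind distinction, the sketch below models both with plain dictionaries. The class names and the explicit `flush()` method are assumptions made for illustration; a real write-behind cache would flush asynchronously and handle failures, which this toy version does not.

```python
class WriteThroughCache:
    """Every write reaches the backing store before put() returns."""

    def __init__(self, store: dict):
        self.store = store
        self.cache = {}

    def put(self, key, value):
        self.store[key] = value   # synchronous write to the source of truth
        self.cache[key] = value

    def get(self, key):
        if key not in self.cache:              # miss: load from the store
            self.cache[key] = self.store[key]  # raises KeyError if absent
        return self.cache[key]


class WriteBehindCache:
    """Writes land in the cache first and reach the store only on flush()."""

    def __init__(self, store: dict):
        self.store = store
        self.cache = {}
        self.dirty = set()        # keys written but not yet persisted

    def put(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        for key in self.dirty:
            self.store[key] = self.cache[key]
        self.dirty.clear()


db_a, db_b = {}, {}
write_through = WriteThroughCache(db_a)
write_behind = WriteBehindCache(db_b)

write_through.put("user:1", "Ada")
write_behind.put("user:1", "Ada")

persisted_before_flush = "user:1" in db_b   # False: the write is still pending
write_behind.flush()
persisted_after_flush = "user:1" in db_b    # True once flushed
```

The tradeoff shows up in the gap between `put()` and `flush()`: write-behind returns faster but can lose the pending write if the cache dies first, while write-through pays the store's latency on every write.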

Spend most of your attention on the tradeoff sections, not the implementation details. Knowing that "Cassandra is eventually consistent" matters less than knowing "this workload tolerates staleness, so we trade strict consistency for write throughput." The former is trivia; the latter is engineering.

Graduation signal — you're ready when:

  • You can explain caching strategies (read-through, write-through, write-behind) and when to use each, without notes.
  • You can argue when to choose Postgres versus Cassandra versus Redis for a given workload.
  • You can sketch how sharding works on a whiteboard, including the hot-key problem and its mitigations.
  • You can explain at-least-once delivery, idempotent consumers, and why exactly-once is hard.
  • You can predict the rough behavior of a cache during a traffic spike or a database failover.
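For the at-least-once delivery item above, a minimal sketch of an idempotent consumer may help. The message shape (`id`, `amount`) and the in-memory set of processed IDs are hypothetical; a production consumer would keep that record in durable storage.

```python
class IdempotentConsumer:
    """Applies each message at most once, even if the broker redelivers it."""

    def __init__(self):
        self.processed_ids = set()   # assumption: in-memory; real systems persist this
        self.balance = 0

    def handle(self, message: dict) -> bool:
        """Return True if applied, False if the message was a duplicate."""
        if message["id"] in self.processed_ids:
            return False
        self.balance += message["amount"]
        self.processed_ids.add(message["id"])
        return True


consumer = IdempotentConsumer()
deliveries = [
    {"id": "m1", "amount": 10},
    {"id": "m2", "amount": 5},
    {"id": "m1", "amount": 10},  # redelivered, e.g. after a lost acknowledgement
]
results = [consumer.handle(m) for m in deliveries]
print(results, consumer.balance)
```

The sketch also hints at why exactly-once is hard: applying the effect and recording the message ID must succeed or fail together, which is trivial in one process's memory and much harder across a database and a broker.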

Phase 2: Study how primitives compose into patterns

Composition · Study the canonical patterns · ~2-6 weeks (level-dependent) · ~5-10 hours/week

The goal of phase 2 is to see how the primitives from phase 1 compose into recognizable patterns. Most "design X" interview questions are not unique problems; they're variations on roughly ten canonical patterns. Once you've internalized the patterns, recognizing "this is a chat / messaging pattern" gives you a head start on the design.

The canonical patterns to study, ordered by interview frequency:

  • Social feed (Twitter, Instagram, Facebook). Push fan-out vs pull, the celebrity problem, hybrid strategies.
  • Chat / messaging (WhatsApp, Slack, Discord). Persistent connections, at-least-once delivery, group fan-out.
  • Geospatial dispatch (Uber, DoorDash). Location update pipeline, geohashing, multi-objective matching.
  • Video streaming (YouTube, Netflix). CDN architecture, transcoding pipelines, view counting.
  • Search (Google, e-commerce search). Inverted indexes, query processing, ranking.
  • E-commerce (Amazon, Shopify). Inventory management, payment processing, order pipeline.
  • Collaborative editing (Google Docs, Figma). Conflict resolution, operational transforms, real-time sync.
  • Notification systems (push, email at scale). Fan-out, delivery guarantees, throttling.
  • API platforms (Stripe, Twilio). Rate limiting, webhooks, idempotency, audit trails.
  • AI-augmented apps (RAG chatbots, AI search). Retrieval pipelines, vector databases, LLM cost management.

Level-specific scope: Junior candidates can usually skip most of phase 2; one or two patterns is enough for the questions you'll likely face. Mid-level candidates should know the first three to five patterns well. Senior candidates need fluency across all ten. Staff candidates need fluency plus the ability to recognize hybrid patterns and reason about novel compositions.

For each pattern, do this: read the framing and clarification questions, then pause and try to predict the architecture before reading the decomposition. Then check your prediction. Then read the deep-dives. After finishing, list the primitives the pattern composed and note any sub-decisions you didn't expect.

The goal isn't memorization; it's pattern recognition. By the end of phase 2, you should be able to look at a new "design X" question and immediately think: "this looks like a chat pattern with elements of social feed at the high-membership end." That recognition is what gives senior candidates their composure under interview pressure.

Graduation signal — you're ready when:

  • You can name the dominant pattern of a system from a one-sentence description.
  • You can identify which primitives compose into a given pattern's solution.
  • You can predict where a design would break before reading the answer.
  • You can explain why one pattern's decisions wouldn't fit another (push fan-out works for typical social feeds, breaks for very high follower counts; persistent connections work for chat, are overkill for low-traffic CRUD).
  • You can sketch a basic architecture for any of the canonical patterns from memory.
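The push-versus-pull point above (push fan-out works for typical accounts and breaks at very high follower counts) can be sketched as a hybrid feed. Everything here is a toy assumption: the threshold of 3 followers, the in-memory dictionaries, and the use of an incrementing post ID as a stand-in for time ordering.

```python
import itertools
from collections import defaultdict

CELEBRITY_THRESHOLD = 3            # hypothetical cutoff; real systems tune this
_post_ids = itertools.count(1)     # incrementing ID doubles as a timestamp here

followers = defaultdict(set)       # author -> users following them
following = defaultdict(set)       # user -> authors they follow
inbox = defaultdict(list)          # user -> posts pushed at write time
posts_by_author = defaultdict(list)


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def publish(author, text):
    post = (next(_post_ids), author, text)
    posts_by_author[author].append(post)
    if not is_celebrity(author):
        # Push path: write cost grows with the follower count.
        for user in followers[author]:
            inbox[user].append(post)


def read_feed(user):
    # Start from pushed posts, then pull from followed celebrities at read time.
    merged = {post[0]: post for post in inbox[user]}
    for author in following[user]:
        if is_celebrity(author):
            for post in posts_by_author[author]:
                merged[post[0]] = post   # keyed by ID, so no duplicates
    return [merged[post_id] for post_id in sorted(merged)]


for fan in ("ann", "bob", "cy"):
    follow(fan, "star")            # star reaches the threshold: pull path
follow("ann", "dee")               # dee has one follower: push path

publish("dee", "hello")            # pushed into ann's inbox
publish("star", "big news")        # stored once, not fanned out

feed = read_feed("ann")
print(feed)
```

The design choice to notice is where the cost lands: push pays at write time per follower, pull pays at read time per followed celebrity, and the hybrid picks per author.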

Phase 3: Practice by designing real systems

Practice · Mock interviews and timed drills · Until interview-ready · ~2-4 hours per design exercise

Phase 3 is where the knowledge becomes interview-ready. Reading is necessary but not sufficient; designing systems yourself under time pressure is what turns "I've read about this" into "I can do this with a clock running and an interviewer probing." The goal of phase 3 is to internalize the four-step framework (clarify, decompose, deep-dive, evaluate) until it feels natural.

The cycle:

  1. Pick a system. Start with apps you use daily. Your favorite messaging app, the food delivery service you ordered from this week, the streaming service you watched yesterday. Picking systems you understand intuitively from the user side gives you better intuition for what the requirements actually are.
  2. Design from scratch under a timer. On paper or whiteboard. Walk through the four-step framework. Time-box yourself: 50 minutes total, divided roughly 10/20/60/10 percent across clarify, decompose, deep-dive, evaluate (about 5, 10, 30, and 5 minutes). The time pressure matters; it forces decisions instead of endless deliberation.
  3. Defend the design to a critical reader. A peer, a study partner, or in writing. Walk them through the architecture. They should ask the kinds of questions an interviewer would: "Why this database?" "What if the cache fails?" "What happens at 10x scale?" If your answers are confident and grounded in tradeoffs, the design holds. If you find yourself making things up, note where and revisit.
  4. Compare to reality. Look up how the actual system is built. Engineering blogs (Uber Engineering, Discord, Cloudflare, Stripe) often publish technical deep-dives. Compare your design to theirs. Where you differ, ask why; sometimes you were wrong, sometimes you were right and they made different tradeoffs based on requirements you didn't know.
  5. Iterate or pick a new system. Either redesign the same system with different constraints (smaller scale, different consistency requirements, etc.) or pick a new system. Variety builds breadth; depth on individual systems builds intuition.
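The time-box in step 2 can be turned into concrete minutes. The sketch below treats the 10/20/60/10 figures as relative weights, which is an interpretation (the four numbers sum to 100, not 50); the function name is hypothetical.

```python
def split_minutes(total_minutes: float, weights: dict) -> dict:
    """Divide a session across steps in proportion to the given weights."""
    scale = total_minutes / sum(weights.values())
    return {step: weight * scale for step, weight in weights.items()}


budget = split_minutes(
    50,
    {"clarify": 10, "decompose": 20, "deep-dive": 60, "evaluate": 10},
)
for step, minutes in budget.items():
    print(f"{step:10s} {minutes:.0f} min")
```

The same function can be reused for a 45- or 60-minute session by changing the first argument.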

Level-specific volume: Junior candidates are typically interview-ready after 3-5 designs. Mid-level candidates should aim for 8-12 designs across the canonical patterns. Senior candidates should target 15-20 designs with at least 3-5 mock interviews against a real interviewer. Staff candidates should target 20+ designs and several mock interviews, with focus on the product-shaped and specialized questions that show up at staff level.

Mock interviews against an experienced interviewer (rather than self-practice) are the highest-leverage activity in phase 3. The feedback from a stranger who's seen many candidates is materially different from your own self-assessment. Budget for several mock interviews as you approach phase 3 graduation.

Graduation signal — you're interview-ready when:

  • You've designed enough systems end-to-end on paper that the framework feels automatic.
  • You can do a full 50-minute mock interview without panicking when an unfamiliar question comes up.
  • You can absorb a probing follow-up question, take 15 seconds to think, and respond with reasoning rather than memorized answers.
  • You can read an engineering blog post about a real system and predict the major design decisions before the post explains them.
  • You feel confident articulating the tradeoffs of any design choice without needing to look them up.

05. How to Practice Effectively

The biggest mistake at this stage is conflating reading with practicing. They're not the same. Active practice has specific techniques that turn passive knowledge into interview-ready fluency.

Active recall, not passive rereading

Reading the same primitive page three times feels productive but isn't. The thing that transfers knowledge from passive recognition to active fluency is recall: closing the page and explaining the concept in your own words. If you can articulate the tradeoffs of replication consistency without looking, you know it. If you can only nod along while reading, you don't.

The cheapest active-recall technique: after reading a primitive, write a 200-word summary in your own words. No looking back at the source. If you struggle, that's the signal you need to reread. If you can do it cleanly, you've internalized the concept.

Whiteboard or paper, not text editors

System design interviews happen on whiteboards (real or virtual). Practicing in a text editor builds different muscle memory than sketching boxes and arrows. The act of physically drawing components, thinking about where to place them, and connecting them with arrows is part of the skill being learned.

If you're remote, use a virtual whiteboard tool: Excalidraw, tldraw, or your interview platform's built-in tool. They're awkward at first; that awkwardness is something to work through well before the interview, not the night before.

Time-box your practice

Real interviews are 45-60 minutes. Practicing without a time limit is fine for early phase 1 work but counterproductive once you're in phase 3. The pressure of a clock is part of the skill; ignoring it during practice means the first time you experience it is during the interview itself.

A practical drill: set a 50-minute timer, pick a system, walk through the four-step framework. Don't pause the timer. When it goes off, stop wherever you are and review what you got done. This drill alone, repeated weekly, materially improves time discipline.

Defend designs to critical readers

The strongest practice is having someone else challenge your design. Find a study partner (other people preparing for similar interviews are easy to find online), trade designs, and ask each other hard questions. The critical reader doesn't need to be an expert; they just need to ask "why" repeatedly and refuse to accept handwaves.

If you can't find a study partner, mock interview platforms with experienced interviewers are a strong alternative. Feedback from someone who's seen many candidates is materially tougher than self-review and reveals gaps you'd otherwise miss.

Mix breadth and depth

Some practice should be breadth: many different systems, designed quickly, to build pattern-recognition muscle. Some practice should be depth: one system designed multiple times under different constraints (small scale, large scale, different consistency requirements, with and without specific features). Both kinds of practice teach different things; alternate them.

The goal isn't to have read about every system. It's to have practiced designing enough systems that any new design feels like a familiar variation on patterns you already know.

06. Common Pitfalls

These are the mistakes that slow down most candidates. Each one is recoverable, but recognizing them early saves weeks.

Pitfall 1: Memorizing specific architectures instead of understanding tradeoffs

The temptation is to memorize "for chat, use WebSockets and Cassandra; for feed, use Kafka and Redis." This works in narrow cases but breaks the moment the interviewer changes the requirements. The signal of memorization-based knowledge: when asked "why this database?" the answer becomes "because that's what's used for this kind of system." A grounded answer would name the workload characteristics and how the chosen database serves them.

Pitfall 2: Skipping the requirements clarification step

"Design Twitter" can mean ten different systems depending on scope, scale, and edge cases. Strong candidates spend the first five minutes clarifying these. Weak candidates jump to architecture immediately. The clarify step isn't preliminaries before the real interview; it's a graded part of the interview where interviewers explicitly evaluate whether you can scope a problem before solving it.

Pitfall 3: Treating one tool as the answer to every problem

"Always use Postgres." "Kafka for everything." "Redis solves all caching." Each tool has a fit, and reaching for the same tool regardless of fit is a signal of shallow knowledge. The cure: when you find yourself defaulting to a familiar tool, force yourself to articulate the alternative and the criterion for choosing. If the criterion is "I know this one best," you're rationalizing, not designing.

Pitfall 4: Reading too much before practicing

It feels safer to "just read one more book" before designing systems yourself. Resist this. The point at which active practice becomes most valuable is when you feel slightly underprepared for it, not after you've absorbed every available resource. The discomfort of designing a system you don't fully understand is what produces the questions that drive the next round of focused reading. Forcing yourself to practice early gives reading direction.

Pitfall 5: Calibrating wrong for your level

Junior candidates over-engineer and get penalized for unnecessary complexity. Staff candidates stay in "architecture" mode without articulating organizational and evolutionary concerns. Both fail because they prepared at the wrong level. Use the level expectations section above to set scope; ask yourself "what would a strong [my level] candidate emphasize here?" before each practice session.

Pitfall 6: Going wide instead of deep at senior+ levels

Senior and staff candidates often try to cover every component of the system shallowly, hoping breadth will impress. It doesn't. Strong senior candidates pick the hardest 1-2 problems in the design and demonstrate real depth there. Going wide is a junior pattern; depth is the senior signal. Protect your deep-dive time.

Pitfall 7: Passive consumption of "design X" videos

YouTube has hundreds of "design Twitter" videos. Watching them feels productive but is one of the lowest-value forms of practice. The presenter has already done all the thinking; you're absorbing their conclusions without exercising your own reasoning. If you watch these videos, do it after you've designed the same system yourself, as a comparison. Watching first and then trying to design produces designs that are pale imitations of what you watched, not your own reasoning.

07. FAQ

I'm a junior engineer; do I really need to study system design?

It depends on the company and the specific role. Many junior interviews skip system design entirely. Where it does come up at L3, the bar is lower and the scope is narrower (URL shortener, basic chat, simple CRUD). Spending 4-6 weeks on the basics is usually sufficient. Don't over-prepare; junior candidates who design like senior engineers often get penalized for unnecessary complexity. Focus on demonstrating that you reason about systems, not that you've memorized advanced patterns you haven't used.

I'm at mid-level interviewing for senior; how should I adjust my prep?

The biggest gap between L4 and L5 is depth, not breadth. Mid-level candidates often know the canonical patterns at a surface level; senior candidates can go deep on any of them when probed. Spend extra time on phase 3 (practice) with explicit deep-dive drills: pick a pattern, then drill into one specific sub-problem (the celebrity problem in social feeds, the consistency boundary in chat, the cross-region split in dispatch) until you can talk about it for 10 minutes without notes. The senior bar is "can go three layers deeper than the surface answer."

What's the difference between senior and staff interviews?

The difference is qualitative, not quantitative. Senior interviews ask "design X at scale" and grade architectural fluency. Staff interviews ask "design X" with deliberate ambiguity and grade systems thinking, judgment, and the ability to articulate tradeoffs across organizational and evolutionary dimensions. Staff candidates are expected to reason about second-order effects: how does this affect the team's velocity? What's the migration cost? How does this evolve over five years? Pure architecture knowledge is assumed; the test is what you do with it.

How long does prep really take at my level?

Honest ranges. Junior: 4-6 weeks of focused study. Mid-level: 8-12 weeks. Senior: 12-16 weeks, often longer if you're working full-time. Staff: ongoing, with intensified prep 3-6 weeks before specific interview cycles. Most candidates underestimate by 2-3x; the "I'll cram in two weeks" approach rarely works above junior level. The good news: progress is real even when it doesn't feel like it. By week 6 you'll notice you can read engineering blog posts at a different level.

Can I skip phase 1 if I've been working in industry?

Sometimes. Working engineers often have implicit knowledge of several primitives (you've used Postgres, you've used Redis, you've shipped to production with multiple servers) without the formal vocabulary or tradeoff awareness. The test: can you explain when to use a write-through cache versus a write-behind cache, with the exact failure modes of each? Can you describe the read-vs-write tradeoffs of replication strategies? If yes, skim phase 1 quickly to fill specific gaps. If no, do it in order.
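
As a quick self-check on the cache question above, here is a minimal Python sketch of the two write policies. Everything in it is an illustrative assumption rather than any real cache library's API: the class names, the `simulate_crash` helper, and the plain dict standing in for the database. It shows only one failure mode (losing unflushed writes when the process crashes), not every way either policy can fail.

```python
from collections import deque


class WriteThroughCache:
    """Hypothetical cache: every write reaches the store before it returns."""

    def __init__(self, store):
        self.store = store  # a plain dict stands in for the database
        self.cache = {}

    def put(self, key, value):
        self.store[key] = value  # synchronous store write: slower, but durable
        self.cache[key] = value

    def get(self, key):
        if key not in self.cache:
            self.cache[key] = self.store[key]  # cache miss: load from store
        return self.cache[key]


class WriteBehindCache:
    """Hypothetical cache: writes return immediately and are flushed later."""

    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.pending = deque()  # writes accepted but not yet persisted

    def put(self, key, value):
        self.cache[key] = value
        self.pending.append((key, value))  # fast, but the store is now stale

    def flush(self):
        # Drain queued writes into the store in arrival order.
        while self.pending:
            key, value = self.pending.popleft()
            self.store[key] = value

    def get(self, key):
        if key not in self.cache:
            self.cache[key] = self.store[key]
        return self.cache[key]


def simulate_crash(cache):
    """Drop all in-memory state, as a process crash would."""
    cache.cache.clear()
    if hasattr(cache, "pending"):
        cache.pending.clear()


if __name__ == "__main__":
    db_a, db_b = {}, {}
    wt, wb = WriteThroughCache(db_a), WriteBehindCache(db_b)
    wt.put("user:1", "alice")
    wb.put("user:1", "alice")
    simulate_crash(wt)
    simulate_crash(wb)
    print(db_a)  # {'user:1': 'alice'}  the write survived the crash
    print(db_b)  # {}                   the unflushed write was lost
```

If you can explain why the second dict is empty, and what a write-through cache pays on every write to avoid that, you have the core of the answer; real systems add batching, retries, and ordering concerns that this sketch leaves out.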

How do I know I'm progressing if I'm self-studying?

The graduation signals at the end of each phase are designed for this. Don't move forward until you can demonstrably do everything in the graduation list. Concretely: if the phase 1 list says "you can argue when to choose Postgres versus Cassandra," try doing it out loud for 60 seconds. If you struggle, you're not done with phase 1 yet, regardless of how many primitive pages you've read. Mock interviews are the strongest signal at phase 3; if you can do 3-5 mocks at your target level with passing feedback, you're ready.

What if I'm preparing for a specific company's interview?

Companies have different interview emphases (some lean into low-level systems, some into product-shaped designs, some into specific patterns). Once you've completed phases 1 and 2, look up interview guides specific to the company you're targeting. The underlying content is broadly the same, but knowing whether your interviewer will focus on consistency models, distributed transactions, or product tradeoffs lets you weight your phase 3 practice accordingly.

How important are mock interviews?

Critical at senior and staff levels; helpful but optional at junior and mid-level. Mock interviews against an experienced interviewer surface gaps that self-practice can't: the kinds of follow-up probes interviewers use, time-pressure dynamics you don't simulate when practicing alone, and behavioral patterns under stress (do you talk too fast, freeze on hard questions, fail to clarify). Budget for 3-5 mocks before high-stakes interviews. The cost is real; the leverage is high.

08. Concrete Next Step

The biggest mistake at this point is closing this page, intending to start tomorrow, and never actually starting. Avoid that by committing to one specific action right now.

If you're at phase 1: pick the first primitive (load balancing) and study it for 30 minutes. Read a deep-dive, take a break, then explain load balancing tradeoffs out loud or in writing without referring back. That single 30-minute exercise is worth more than another hour of planning your study schedule.

If you're already past phase 1: pick the first canonical pattern (social feed) and work through a complete walkthrough actively. Don't read it passively; pause at each step and predict what comes next, then check yourself. That's the phase 2 study mode.

If you're at phase 3: pick a system you used today, set a 50-minute timer, and design it. WhatsApp, Spotify, DoorDash, Notion, whichever. Don't perfect; commit. Compare to reality afterward. Schedule a mock interview within the next two weeks if you haven't already.

The path is straightforward; the only thing that prevents most candidates from finishing it is starting. Start now.
