How to Learn System Design?
This is a complete guide to learning system design and preparing for system design interviews, structured for engineers at every level. Whether you're a junior engineer facing your first system design round or a staff candidate at a top-tier company, the path is the same: master the foundations, study how patterns compose, then practice. What changes by level is how deep you go.
For levels: Junior → Mid → Senior → Staff
Time investment: 4-24 weeks (level-dependent)
00. Quick Orientation
Most "learn system design" guides target a single audience and miss everyone else. A junior engineer doesn't need to know how to design a globally distributed payment system. A staff candidate doesn't need to be told what a load balancer is. This guide acknowledges the difference: it walks through what's true at every level, then breaks out level-specific expectations so you can calibrate your prep.
A few things to know before you commit to this:
- Time investment scales with level. Junior prep can be done in 4-6 weeks of focused study. Senior prep typically takes 12-16 weeks. Staff prep is closer to ongoing development of opinions over months. The path is the same; the depth and volume of practice change.
- The interview tests reasoning, not memorization. Strong candidates at every level reason about tradeoffs given requirements. The single biggest mistake is trying to memorize "the right design for X". Interviewers detect this immediately and probe until your reasoning falls apart.
- Practice is what makes the knowledge stick. Reading is necessary but not sufficient. The plan below alternates structured reading with deliberate practice; the practice is what closes the gap between "I've heard of this" and "I can do this on a whiteboard under pressure."
- Use the level guide. Skip ahead to what each level expects if you're unsure where to calibrate. The bar at L4 is dramatically different from L6; preparing as if they're the same wastes time at one end and underprepares you at the other.
What follows: the reframe of what interviews actually test, the level-by-level expectations from junior to staff, prerequisites, the three-phase learning path, practical advice on how to practice, the most common pitfalls, curated resources, and a concrete next step.
01. What System Design Interviews Actually Test
The most important thing to understand before you start preparing: system design interviews are not testing whether you've memorized 10 architectures. They're testing whether you can reason about tradeoffs given requirements.
If you read enough prep material, you'll get the impression that there's a canonical answer for each question. "For chat, use WebSockets and at-least-once delivery." "For social feeds, use push fan-out with a hybrid for celebrities." "For dispatch, use geohashing with multi-objective scoring." These are real patterns and they show up in real designs, but treating them as the answer is the wrong mental model. The answer depends on the requirements, and interviewers deliberately vary the requirements to see whether you actually understand the tradeoffs or just memorized a canonical solution.
A more useful mental model: a system design interview is a structured conversation in which you decompose a problem into constraints, map constraints to a small set of architectural primitives (caching, sharding, queues, replication, indexing, and so on), and reason about which primitives compose into a defensible design. The primitives are the vocabulary; the requirements are the sentence; the design is the prose you produce in real time.
What interviewers actually score on
Across companies and levels, the signals interviewers consistently weight:
- Can you clarify requirements before designing? Strong candidates spend the first five minutes asking targeted questions: scope, scale, latency, consistency, edge cases. Weak candidates jump to architecture before anyone has stated the problem.
- Can you decompose at the right level of abstraction? Strong candidates produce a defensible high-level architecture (5-8 components) before drilling into any one of them. Weak candidates either get stuck at the abstract level or burn early time on a single component.
- Can you reason about tradeoffs? When asked "why this database?" strong candidates name the alternative and articulate the criterion. Weak candidates default to "because that's what's used for this kind of system."
- Can you go deep on the hardest 1-2 problems? Strong candidates protect their deep-dive time and use it on what actually matters (the celebrity problem, the consistency boundary, the failure mode). Weak candidates spread time evenly across components, going deep on nothing.
- Can you recognize and articulate failure modes? Senior and staff candidates name what would break under stress before the interviewer asks. This signals operational thinking.
Notice what's not on this list: knowing every database in existence, having built systems at FAANG scale, or having read every distributed systems paper. Interviewers assume some baseline knowledge and test reasoning on top of it. The baseline scales with level (covered below); the reasoning tests are similar across levels.
The reframe in one sentence
System design interviews test reasoning about tradeoffs under constraints. The foundations are a small set of primitives (caching, sharding, replication, queues, indexing). Patterns are compositions of those primitives. Every design you'll produce is some composition chosen to satisfy the specific requirements. Internalize this and the rest of the prep becomes obvious.
02. What Each Level Expects
The most useful single piece of prep advice: calibrate to your level. The bar at L3 is dramatically lower than at L6, and over-preparing or under-preparing wastes weeks. Below is an honest picture of what each level looks like in practice. Use it to set your scope and time investment.
Junior · L3 / IC1
~0-2 years experience · 4-6 weeks prep
System design rounds are often skipped at the junior level. When they're included, they're scoped narrowly and graded gently. The goal is to confirm you've thought about how systems work beyond a single function, not to test deep distributed systems knowledge.
| Typical questions | Design a URL shortener (TinyURL). Design a simple parking lot system. Design a basic chat app. Design a todo list backend. Scope: small, often single-server or modest scale. |
| Depth expected | Client/server basics, request/response, simple database schemas, basic indexes, the idea of caching. Not expected to handle millions of users or complex sharding. |
| Signal looked for | Can you reason about a system at all? Do you understand what a database is for, what an API is, why we'd add a cache? The bar is "shows engineering judgment beyond writing functions." |
| Common pitfalls | Over-engineering. Junior candidates who design like they're at staff level get penalized for adding complexity that doesn't fit the scope. Stay simple. |
Mid-Level · L4 / IC2
~2-5 years experience · 8-12 weeks prep
System design rounds are standard at L4. Scope expands to apps with thousands to low millions of users. The bar is "you know the foundational primitives and can assemble them into a defensible design under interview time pressure."
| Typical questions | Design Twitter (basic version). Design WhatsApp (one-on-one chat). Design Instagram. Design a rate limiter. Design a notification system. Scope: medium, with explicit scale numbers in the requirements. |
| Depth expected | Foundational primitives: caching strategies, basic sharding, replication, queues, simple indexing. Familiarity with at least one canonical pattern (social feed, chat, rate limiting). Basic failure handling. |
| Signal looked for | Do you know the building blocks? Can you defend choices ("Postgres because the workload is read-heavy and we need transactions")? Can you handle one or two follow-up probes without falling apart? |
| Common pitfalls | Memorizing one design for each canonical question without understanding why. Interviewers detect this and tweak the requirements; if you can't adapt, the gap shows immediately. |
Senior · L5 / IC3
~5-10 years experience · 12-16 weeks prep
Senior interviews are where system design becomes the dominant signal. Top-tier companies often run multiple system design rounds. Scope expands to systems serving millions or hundreds of millions of users. The bar is "fluent across all canonical patterns, deep in at least one area, articulate on tradeoffs."
| Typical questions | Design Twitter (full version, including the celebrity problem). Design Uber. Design Slack or Discord. Design YouTube or Netflix. Design Dropbox. Design Yelp or Foursquare. |
| Depth expected | All canonical patterns. Deep tradeoff understanding (when push fan-out beats pull, when at-least-once is acceptable, when strong consistency is required). Strong on at least one specialized area: search, geospatial, streaming, real-time, payments, etc. |
| Signal looked for | Can you handle deliberate ambiguity? Can you go deep when probed without losing the thread? Do you name failure modes before being asked? Can you articulate "why this not that" for every major choice? |
| Common pitfalls | Going wide instead of deep. Senior candidates who try to cover every component shallowly are weaker than those who pick the hardest 1-2 problems and demonstrate real depth there. Protect your deep-dive time. |
Staff · L6+ / IC4+
~10+ years experience · ongoing development
Staff interviews shift the emphasis toward systems thinking and judgment. Pure architecture knowledge is assumed; what's tested is whether you can navigate ambiguity, set technical direction, and articulate the why behind hard calls. Questions often have no canonical answer.
| Typical questions | Often product-shaped: design fraud detection for a payments platform, design a recommendation system for a streaming service, design a compliance audit pipeline. Or specialized: design a globally consistent payment system, design a multi-region search index. |
| Depth expected | Strong opinions on tradeoffs across all dimensions: consistency, availability, durability, scalability, evolvability, operational cost. Depth in multiple specialized areas. Comfort with cross-system concerns: organizational structure, migration paths, deprecation strategies. |
| Signal looked for | Can you set technical direction? Can you make hard calls and defend them? Can you reason about second-order effects (what happens to the team's velocity, what's the migration cost, how does this evolve over five years)? Can you handle a question with no obvious right answer? |
| Common pitfalls | Treating staff interviews like senior interviews with bigger numbers. The shift is qualitative, not quantitative. Staff candidates who stay in "architecture" mode without articulating organizational and evolutionary concerns underperform. |
A note on level boundaries
Companies don't always agree on what's L4 vs L5 vs L6. The expectations above describe typical bars; specific companies will calibrate slightly differently. The most useful preparation isn't optimizing for a specific company's rubric but building genuine fluency at the level you're targeting. If you're solidly above bar at L5, you'll perform well at most companies' L5 interviews regardless of variance.
03. Prerequisites
You don't need a CS degree to learn system design, but you do need a few things in place. If any of these aren't true yet, address them first; trying to learn distributed systems while shaky on the basics is harder than it needs to be.
- You can program at a working level. Not "expert," but you can read and write code, debug failures, work with a database. You've used at least one programming language enough to feel comfortable.
- You understand HTTP at a working level. Requests and responses, status codes, headers, cookies. You don't need to know the spec by heart, but you should know what happens when you type a URL into a browser.
- You understand databases at the API level. You've used a database (any database). You know what a query is, what an index does, why some queries are fast and others are slow. You don't need to be a DBA.
- You have intuition about latency vs throughput. "How fast does one request complete" is different from "how many requests can the system handle per second." If this distinction is unfamiliar, spend a few hours with it before continuing.
- You've used a multi-server system. Even casually. You've deployed something to two machines behind a load balancer, or you've used a cloud service that you know runs across multiple regions. The intuition that "one machine isn't enough" matters.
If any of these aren't yet true, the gap is bridgeable in a week or two of focused study. The cheapest path is usually a small project: deploy a web app to two cloud servers behind a load balancer, with a database. The act of getting it working teaches you what these abstractions actually mean.
What you don't need before starting: the canonical distributed systems theory (consensus, vector clocks, FLP impossibility), formal CAP theorem proofs, or deep networking internals. These are valuable but not prerequisites. They become more interesting once you've built intuition; trying to learn them cold is slow and the returns are weak.
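The latency-versus-throughput distinction from the prerequisites list can be made concrete with Little's Law (concurrency = throughput × latency). What follows is a minimal sketch, assuming a hypothetical service in which each worker handles one request at a time; the function name and all numbers are illustrative, not taken from any real system.

```python
def max_throughput_rps(workers: int, latency_seconds: float) -> float:
    """Upper bound on requests per second when each worker serves one request at a time.

    Rearranged from Little's Law: concurrency = throughput * latency.
    """
    if workers <= 0 or latency_seconds <= 0:
        raise ValueError("workers and latency_seconds must be positive")
    return workers / latency_seconds


# Hypothetical numbers: identical 100 ms latency, very different throughput.
small_pool = max_throughput_rps(workers=10, latency_seconds=0.1)
large_pool = max_throughput_rps(workers=200, latency_seconds=0.1)
print(small_pool, large_pool)
```

Each individual request is equally fast in both cases; only the number of requests the system can absorb per second changes. That is the intuition worth having before moving on.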
04. The Three-Phase Learning Path
Regardless of your level, the structure of effective prep is the same: foundations first, then composition, then practice. What changes by level is how deep you go in each phase and how much practice volume you need before you're interview-ready.
Three phases, sequential. Foundations builds the vocabulary of primitives. Composition shows how the primitives assemble into recognizable patterns. Practice is where the knowledge becomes interview-ready. Each phase has explicit graduation criteria so you know when to move on. Volume scales with target level.
Phase 1: Build foundations in the primitives
Foundations · Master the architectural primitives · ~2-6 weeks (level-dependent) · ~5-10 hours/week
The goal of phase 1 is deep familiarity with the architectural primitives that every system design composes. There are roughly ten of them. They're the vocabulary you'll use in every later phase.
The primitives, in the order most useful to learn:
- Data tier first. Caching, then database selection, then sharding, then replication and consistency. The data tier is the load-bearing part of most systems; everything else depends on it.
- Traffic tier next. Load balancing, then message queues, then rate limiting. These shape how requests flow through the system.
- Operational tier third. Search and indexing, then observability. Search is its own pattern with specialized data structures; observability is what makes systems operable in production.
- Modern tier last. Vector databases. Newer pattern, only relevant for AI-augmented systems, but increasingly probed in interviews at top companies.
Level-specific scope: Junior candidates need conversational familiarity with these (you can describe what each does and when you'd reach for it). Mid-level candidates need working understanding (you can defend tradeoff choices for at least the first six). Senior candidates need fluency across all ten with deep tradeoff awareness. Staff candidates need fluency plus opinions about second-order tradeoffs and operational implications.
For each primitive, do this: read a thorough deep-dive, then explain it back to someone (or out loud, or write a one-page summary). The articulation is what transfers it from passive recognition to active fluency. If you can't explain when to use a write-through cache versus a write-behind cache without notes, you don't yet know caching well enough; reread.
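As one way to test that explanation, the write-through versus write-behind difference can be sketched in a few lines. This is an illustrative toy, assuming a plain dict stands in for the database; the class names are hypothetical and the code ignores eviction, concurrency, and failure handling.

```python
class WriteThroughCache:
    """Every write goes to the backing store and the cache before returning."""

    def __init__(self, store: dict):
        self.store = store   # stand-in for a database (assumption)
        self.cache = {}

    def put(self, key, value):
        self.store[key] = value   # synchronous store write: slower, but durable
        self.cache[key] = value

    def get(self, key):
        # On a miss, fill the cache from the store if the key exists there.
        if key not in self.cache and key in self.store:
            self.cache[key] = self.store[key]
        return self.cache.get(key)


class WriteBehindCache(WriteThroughCache):
    """Writes land in the cache first; the store is updated later in a batch."""

    def __init__(self, store: dict):
        super().__init__(store)
        self.dirty = set()   # keys written to the cache but not yet persisted

    def put(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        for key in self.dirty:
            self.store[key] = self.cache[key]
        self.dirty.clear()


db = {}
wb = WriteBehindCache(db)
wb.put("user:1", "Ada")
pending = "user:1" not in db   # the write is not yet in the store
wb.flush()
```

The window in which `pending` is true is the tradeoff in miniature: write-behind returns faster and batches store writes, but a crash before `flush()` loses the update, whereas write-through pays the store latency on every write.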
Spend most of your attention on the tradeoff sections, not the implementation details. Knowing that "Cassandra is eventually consistent" matters less than knowing "this workload tolerates staleness, so we trade strict consistency for write throughput." The former is trivia; the latter is engineering.
Graduation signal — you're ready when:
- You can explain caching strategies (read-through, write-through, write-behind) and when to use each, without notes.
- You can argue when to choose Postgres versus Cassandra versus Redis for a given workload.
- You can sketch how sharding works on a whiteboard, including the hot-key problem and its mitigations.
- You can explain at-least-once delivery, idempotent consumers, and why exactly-once is hard.
- You can predict the rough behavior of a cache during a traffic spike or a database failover.
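For the sharding item in the list above, here is a rough sketch of hash-based placement plus one common hot-key mitigation (key salting). The shard count, fan-out, and function names are assumptions chosen for illustration; other mitigations, such as caching the hot key or giving it a dedicated shard, are equally valid.

```python
import hashlib
import random
from typing import List

NUM_SHARDS = 8        # assumed cluster size
HOT_KEY_FANOUT = 4    # assumed number of sub-keys for a key known to be hot


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Stable hash-based placement: the same key always maps to the same shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def salted_write_key(key: str, fanout: int = HOT_KEY_FANOUT) -> str:
    """Spread writes for a hot key across several sub-keys by appending a random salt."""
    return f"{key}#{random.randrange(fanout)}"


def salted_read_keys(key: str, fanout: int = HOT_KEY_FANOUT) -> List[str]:
    """Reads must gather every sub-key and merge the results."""
    return [f"{key}#{i}" for i in range(fanout)]


# A single hot key lands on one shard; its salted sub-keys are placed independently.
hot_shard = shard_for("celebrity:123")
sub_shards = [shard_for(k) for k in salted_read_keys("celebrity:123")]
```

Salting trades cheaper, better-distributed writes for more expensive reads, since every read now touches up to `HOT_KEY_FANOUT` sub-keys. The sub-keys may still collide on a shard, so this reduces hot-spotting rather than eliminating it.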
Phase 2: Study how primitives compose into patterns
Composition · Study the canonical patterns · ~2-6 weeks (level-dependent) · ~5-10 hours/week
The goal of phase 2 is to see how the primitives from phase 1 compose into recognizable patterns. Most "design X" interview questions are not unique problems; they're variations on roughly ten canonical patterns. Once you've internalized the patterns, recognizing "this is a chat / messaging pattern" gives you a head start on the design.
The canonical patterns to study, ordered by interview frequency:
- Social feed (Twitter, Instagram, Facebook). Push fan-out vs pull, the celebrity problem, hybrid strategies.
- Chat / messaging (WhatsApp, Slack, Discord). Persistent connections, at-least-once delivery, group fan-out.
- Geospatial dispatch (Uber, DoorDash). Location update pipeline, geohashing, multi-objective matching.
- Video streaming (YouTube, Netflix). CDN architecture, transcoding pipelines, view counting.
- Search (Google, e-commerce search). Inverted indexes, query processing, ranking.
- E-commerce (Amazon, Shopify). Inventory management, payment processing, order pipeline.
- Collaborative editing (Google Docs, Figma). Conflict resolution, operational transforms, real-time sync.
- Notification systems (push, email at scale). Fan-out, delivery guarantees, throttling.
- API platforms (Stripe, Twilio). Rate limiting, webhooks, idempotency, audit trails.
- AI-augmented apps (RAG chatbots, AI search). Retrieval pipelines, vector databases, LLM cost management.
Level-specific scope: Junior candidates can usually skip most of phase 2; one or two patterns is enough for the questions you'll likely face. Mid-level candidates should know the first three to five patterns well. Senior candidates need fluency across all ten. Staff candidates need fluency plus the ability to recognize hybrid patterns and reason about novel compositions.
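Several of the patterns listed above (chat, notifications, API platforms) pair at-least-once delivery with idempotent handling. A minimal sketch of an idempotent consumer follows, assuming each message carries a unique ID; the class name and the in-memory set are hypothetical simplifications.

```python
class IdempotentConsumer:
    """Applies each message's effect at most once, even when it is delivered repeatedly."""

    def __init__(self):
        self.processed_ids = set()   # a real system would keep this in durable storage
        self.balance = 0

    def handle(self, message_id: str, amount: int) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        if message_id in self.processed_ids:
            return False
        self.balance += amount
        self.processed_ids.add(message_id)
        return True


consumer = IdempotentConsumer()
# Simulated at-least-once delivery: message "m1" arrives twice after a retry.
deliveries = [("m1", 50), ("m2", 20), ("m1", 50)]
results = [consumer.handle(mid, amt) for mid, amt in deliveries]
```

In this toy the side effect and the dedupe record live in the same process. In a distributed system they would have to be committed atomically, which is a large part of why exactly-once delivery is hard and why at-least-once plus idempotency is the usual compromise.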
For each pattern, do this: read the framing and clarification questions, then pause and try to predict the architecture before reading the decomposition. Then check your prediction. Then read the deep-dives. After finishing, list the primitives the pattern composed and note any sub-decisions you didn't expect.
The goal isn't memorization; it's pattern recognition. By the end of phase 2, you should be able to look at a new "design X" question and immediately think: "this looks like a chat pattern with elements of social feed at the high-membership end." That recognition is what gives senior candidates their composure under interview pressure.
Graduation signal — you're ready when:
- You can name the dominant pattern of a system from a one-sentence description.
- You can identify which primitives compose into a given pattern's solution.
- You can predict where a design would break before reading the answer.
- You can explain why one pattern's decisions wouldn't fit another (push fan-out works for typical social feeds, breaks for very high follower counts; persistent connections work for chat, are overkill for low-traffic CRUD).
- You can sketch a basic architecture for any of the canonical patterns from memory.
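The push-versus-pull point in the list above can be illustrated with a toy hybrid fan-out. This is a sketch under stated assumptions: in-memory dictionaries stand in for storage, the threshold is artificially small, and all names are hypothetical. It also ignores timestamps, so pushed posts simply appear before pulled ones.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3   # assumed cutoff; a real system would use a far larger number

followers = defaultdict(set)         # author -> users following them
following = defaultdict(set)         # user -> authors they follow
inboxes = defaultdict(list)          # user -> precomputed feed (push path)
posts_by_author = defaultdict(list)  # author -> all posts (pull path)


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def publish(author, post):
    posts_by_author[author].append(post)
    if len(followers[author]) < CELEBRITY_THRESHOLD:
        # Push: write cost grows with follower count, so do it only for small accounts.
        for user in followers[author]:
            inboxes[user].append(post)


def read_feed(user):
    feed = list(inboxes[user])
    for author in following[user]:
        if len(followers[author]) >= CELEBRITY_THRESHOLD:
            # Pull: merge high-follower authors' posts at read time, skipping duplicates.
            extra = [p for p in posts_by_author[author] if p not in feed]
            feed.extend(extra)
    return feed


follow("ann", "bob")
for fan in ("ann", "cat", "dan"):
    follow(fan, "star")
publish("bob", "bob: hello")
publish("star", "star: new album")
ann_feed = read_feed("ann")
```

Publishing as "star" performs no per-follower writes, which is what keeps the write path bounded for very large accounts. The cost moves to read time, where each reader merges those posts on demand.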
Phase 3: Practice by designing real systems
Practice · Mock interviews and timed drills · Until interview-ready · ~2-4 hours per design exercise
Phase 3 is where the knowledge becomes interview-ready. Reading is necessary but not sufficient; designing systems yourself under time pressure is what turns "I've read about this" into "I can do this with a clock running and an interviewer probing." The goal of phase 3 is to internalize the four-step framework (clarify, decompose, deep-dive, evaluate) until it feels natural.
The cycle:
- Pick a system. Start with apps you use daily. Your favorite messaging app, the food delivery service you ordered from this week, the streaming service you watched yesterday. Picking systems you understand intuitively from the user side gives you better intuition for what the requirements actually are.
- Design from scratch under a timer. On paper or whiteboard. Walk through the four-step framework. Time-box yourself: 50 minutes total, split roughly 10/20/60/10 percent across clarify, decompose, deep-dive, evaluate (about 5, 10, 30, and 5 minutes). The time pressure matters; it forces decisions instead of endless deliberation.
- Defend the design to a critical reader. A peer, a study partner, or in writing. Walk them through the architecture. They should ask the kinds of questions an interviewer would: "Why this database?" "What if the cache fails?" "What happens at 10x scale?" If your answers are confident and grounded in tradeoffs, the design holds. If you find yourself making things up, note where and revisit.
- Compare to reality. Look up how the actual system is built. Engineering blogs (Uber Engineering, Discord, Cloudflare, Stripe) often publish technical deep-dives. Compare your design to theirs. Where you differ, ask why; sometimes you were wrong, sometimes you were right and they made different tradeoffs based on requirements you didn't know.
- Iterate or pick a new system. Either redesign the same system with different constraints (smaller scale, different consistency requirements, etc.) or pick a new system. Variety builds breadth; depth on individual systems builds intuition.
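The time-box in the cycle above is easy to turn into a concrete schedule. The sketch below reads the 10/20/60/10 split as percentages of the session, which is an assumption, though it matches the earlier advice to spend the first five minutes on clarifying questions. The variable names are illustrative.

```python
TOTAL_MINUTES = 50
# Shares interpreted as percentages of the session, in framework order (assumption).
SHARES = {"clarify": 10, "decompose": 20, "deep-dive": 60, "evaluate": 10}

# Sanity check: the shares should cover the whole session.
assert sum(SHARES.values()) == 100

budget = {step: TOTAL_MINUTES * pct / 100 for step, pct in SHARES.items()}
for step, minutes in budget.items():
    print(f"{step}: {minutes:g} min")
```

Changing `TOTAL_MINUTES` to 45 or 60 rescales the budget for interview formats of a different length while keeping the emphasis on the deep-dive.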
Level-specific volume: Junior candidates are typically interview-ready after 3-5 designs. Mid-level candidates should aim for 8-12 designs across the canonical patterns. Senior candidates should target 15-20 designs with at least 3-5 mock interviews against a real interviewer. Staff candidates should target 20+ designs and several mock interviews, with focus on the product-shaped and specialized questions that show up at staff level.
Mock interviews against an experienced interviewer (rather than self-practice) are the highest-leverage activity in phase 3. The feedback from a stranger who's seen many candidates is materially different from your own self-assessment. Budget for several mock interviews as you approach phase 3 graduation.
Graduation signal — you're interview-ready when:
- You've designed enough systems end-to-end on paper that the framework feels automatic.
- You can do a full 50-minute mock interview without panicking when an unfamiliar question comes up.
- You can absorb a probing follow-up question, take 15 seconds to think, and respond with reasoning rather than memorized answers.
- You can read an engineering blog post about a real system and predict the major design decisions before the post explains them.
- You feel confident articulating the tradeoffs of any design choice without needing to look them up.
05. How to Practice Effectively
The biggest mistake at this stage is conflating reading with practicing. They're not the same. Active practice has specific techniques that turn passive knowledge into interview-ready fluency.
Active recall, not passive rereading
Reading the same primitive page three times feels productive but isn't. The thing that transfers knowledge from passive recognition to active fluency is recall: closing the page and explaining the concept in your own words. If you can articulate the tradeoffs of replication consistency without looking, you know it. If you can only nod along while reading, you don't.
The cheapest active-recall technique: after reading a primitive, write a 200-word summary in your own words. No looking back at the source. If you struggle, that's the signal you need to reread. If you can do it cleanly, you've internalized the concept.
Whiteboard or paper, not text editors
System design interviews happen on whiteboards (real or virtual). Practicing in a text editor builds different muscle memory than sketching boxes and arrows. The act of physically drawing components, thinking about where to place them, and connecting them with arrows is part of the skill being learned.
If you're remote, use a virtual whiteboard tool: Excalidraw, tldraw, or your interview platform's built-in tool. They're awkward at first; that awkwardness is something to work through well before the interview, not the night before it.
Time-box your practice
Real interviews are 45-60 minutes. Practicing without a time limit is fine for early phase 1 work but counterproductive once you're in phase 3. The pressure of a clock is part of the skill; ignoring it during practice means the first time you experience it is during the interview itself.
A practical drill: set a 50-minute timer, pick a system, walk through the four-step framework. Don't pause the timer. When it goes off, stop wherever you are and review what you got done. This drill alone, repeated weekly, materially improves time discipline.
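To make the drill concrete, here is a minimal sketch that turns a session length into per-step checkpoints. It assumes the 10/20/60/10 split across clarify, decompose, deep-dive, and evaluate is read as percentages of the session; the names `PHASE_WEIGHTS`, `time_budget`, and `checkpoints` are hypothetical and exist only for this illustration.

```python
# Hypothetical helper for the timed drill; the weights are an assumed
# percentage reading of the 10/20/60/10 split across the four steps.
PHASE_WEIGHTS = {
    "clarify": 10,
    "decompose": 20,
    "deep-dive": 60,
    "evaluate": 10,
}


def time_budget(total_minutes, weights=PHASE_WEIGHTS):
    """Return the minutes allotted to each step, proportional to its weight."""
    total_weight = sum(weights.values())
    return {step: total_minutes * w / total_weight for step, w in weights.items()}


def checkpoints(total_minutes, weights=PHASE_WEIGHTS):
    """Return (step, minute mark by which that step should be finished)."""
    elapsed = 0.0
    marks = []
    for step, minutes in time_budget(total_minutes, weights).items():
        elapsed += minutes
        marks.append((step, elapsed))
    return marks


if __name__ == "__main__":
    for step, mark in checkpoints(50):
        print(f"{step:<10} done by minute {mark:g}")
```

For a 50-minute session this gives checkpoints at minutes 5, 15, 45, and 50. Writing the marks on the corner of the whiteboard before starting the timer makes it obvious when the deep-dive is being squeezed.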
Defend designs to critical readers
The strongest practice is having someone else challenge your design. Find a study partner (other people preparing for similar interviews are easy to find online), trade designs, and ask each other hard questions. The critical reader doesn't need to be an expert; they just need to ask "why" repeatedly and refuse to accept handwaves.
If you can't find a study partner, mock interview platforms with experienced interviewers are a strong alternative. Feedback from someone who's seen many candidates is materially more demanding than self-review and reveals gaps you'd otherwise miss.
Mix breadth and depth
Some practice should be breadth: many different systems, designed quickly, to build pattern-recognition muscle. Some practice should be depth: one system designed multiple times under different constraints (small scale, large scale, different consistency requirements, with and without specific features). Both kinds of practice teach different things; alternate them.
The goal isn't to have read about every system. It's to have practiced designing enough systems that any new design feels like a familiar variation on patterns you already know.
06. Common Pitfalls
The mistakes that slow down most candidates. Each one is recoverable, but recognizing them early saves weeks.
Pitfall 1: Memorizing specific architectures instead of understanding tradeoffs
The temptation is to memorize "for chat, use WebSockets and Cassandra; for feed, use Kafka and Redis." This works in narrow cases but breaks the moment the interviewer changes the requirements. The signal of memorization-based knowledge: when asked "why this database?" the answer becomes "because that's what's used for this kind of system." A grounded answer would name the workload characteristics and how the chosen database serves them.
Pitfall 2: Skipping the requirements clarification step
"Design Twitter" can mean ten different systems depending on scope, scale, and edge cases. Strong candidates spend the first five minutes clarifying these. Weak candidates jump to architecture immediately. The clarify step isn't preliminaries before the real interview; it's a graded part of the interview where interviewers explicitly evaluate whether you can scope a problem before solving it.
Pitfall 3: Treating one tool as the answer to every problem
"Always use Postgres." "Kafka for everything." "Redis solves all caching." Each tool has a fit, and reaching for the same tool regardless of fit is a signal of shallow knowledge. The cure: when you find yourself defaulting to a familiar tool, force yourself to articulate the alternative and the criterion for choosing. If the criterion is "I know this one best," you're rationalizing, not designing.
Pitfall 4: Reading too much before practicing
It feels safer to "just read one more book" before designing systems yourself. Resist this. The point at which active practice becomes most valuable is when you feel slightly underprepared for it, not after you've absorbed every available resource. The discomfort of designing a system you don't fully understand is what produces the questions that drive the next round of focused reading. Forcing yourself to practice early gives reading direction.
Pitfall 5: Calibrating wrong for your level
Junior candidates over-engineer and get penalized for unnecessary complexity. Staff candidates stay in "architecture" mode without articulating organizational and evolutionary concerns. Both fail because they prepared at the wrong level. Use the level expectations section above to set scope; ask yourself "what would a strong [my level] candidate emphasize here?" before each practice session.
Pitfall 6: Going wide instead of deep at senior+ levels
Senior and staff candidates often try to cover every component of the system shallowly, hoping breadth will impress. It doesn't. Strong senior candidates pick the hardest 1-2 problems in the design and demonstrate real depth there. Going wide is a junior pattern; depth is the senior signal. Protect your deep-dive time.
Pitfall 7: Passive consumption of "design X" videos
YouTube has hundreds of "design Twitter" videos. Watching them feels productive but is one of the lowest-value forms of practice. The presenter has already done all the thinking; you're absorbing their conclusions without exercising your own reasoning. If you watch these videos, do it after you've designed the same system yourself, as a comparison. Watching first and then trying to design produces designs that are pale imitations of what you watched, not your own reasoning.
07. FAQ
I'm a junior engineer; do I really need to study system design?
It depends on the company and the specific role. Many junior interviews skip system design entirely. Where it does come up at L3, the bar is lower and the scope is narrower (URL shortener, basic chat, simple CRUD). Spending 4-6 weeks on the basics is usually sufficient. Don't over-prepare; junior candidates who design like senior engineers often get penalized for unnecessary complexity. Focus on demonstrating that you reason about systems, not that you've memorized advanced patterns you haven't used.
I'm at mid-level interviewing for senior; how should I adjust my prep?
The biggest gap between L4 and L5 is depth, not breadth. Mid-level candidates often know the canonical patterns at a surface level; senior candidates can go deep on any of them when probed. Spend extra time on phase 3 (practice) with explicit deep-dive drills: pick a pattern, then drill into one specific sub-problem (the celebrity problem in social feeds, the consistency boundary in chat, the cross-region split in dispatch) until you can talk about it for 10 minutes without notes. The senior bar is "can go three layers deeper than the surface answer."
What's the difference between senior and staff interviews?
Qualitative, not quantitative. Senior interviews ask "design X at scale" and grade architectural fluency. Staff interviews ask "design X" with deliberate ambiguity and grade systems thinking, judgment, and the ability to articulate tradeoffs across organizational and evolutionary dimensions. Staff candidates are expected to reason about second-order effects: how does this affect the team's velocity? What's the migration cost? How does this evolve over five years? Pure architecture knowledge is assumed; the test is what you do with it.
How long does prep really take at my level?
Honest ranges. Junior: 4-6 weeks of focused study. Mid-level: 8-12 weeks. Senior: 12-16 weeks, often longer if you're working full-time. Staff: ongoing, with intensified prep 3-6 weeks before specific interview cycles. Most candidates underestimate by 2-3x; the "I'll cram in two weeks" approach rarely works above junior level. The good news: progress is real even when it doesn't feel like it. By week 6 you'll notice you can read engineering blog posts at a different level.
Can I skip phase 1 if I've been working in industry?
Sometimes. Working engineers often have implicit knowledge of several primitives (you've used Postgres, you've used Redis, you've shipped to production with multiple servers) without the formal vocabulary or tradeoff awareness. The test: can you explain when to use a write-through cache versus a write-behind cache, with the exact failure modes of each? Can you describe the read-vs-write tradeoffs of replication strategies? If yes, skim phase 1 quickly to fill specific gaps. If no, do it in order.
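As a way to check yourself on the cache question, here is a toy sketch of the two write policies. It is an illustration under simplifying assumptions, not a production design: the backing store is a plain dict, the classes `WriteThroughCache` and `WriteBehindCache` are hypothetical names, and `crash()` simply simulates losing the cache process.

```python
class WriteThroughCache:
    """Each write goes to the backing store synchronously, then to the cache."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def put(self, key, value):
        self.store[key] = value  # caller pays store latency on every write
        self.cache[key] = value  # cache and store never diverge

    def crash(self):
        self.cache.clear()       # nothing is lost: the store has every write


class WriteBehindCache:
    """Writes land in the cache first and reach the store on a later flush."""

    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.dirty = set()       # keys written but not yet persisted

    def put(self, key, value):
        self.cache[key] = value  # fast: no store round trip
        self.dirty.add(key)

    def flush(self):
        for key in self.dirty:
            self.store[key] = self.cache[key]
        self.dirty.clear()

    def crash(self):
        self.cache.clear()       # unflushed writes are gone
        self.dirty.clear()


if __name__ == "__main__":
    through_store, behind_store = {}, {}
    through, behind = WriteThroughCache(through_store), WriteBehindCache(behind_store)
    through.put("user:1", "alice")
    behind.put("user:1", "alice")
    through.crash()
    behind.crash()               # crash before any flush
    print("write-through kept the write:", "user:1" in through_store)
    print("write-behind kept the write:", "user:1" in behind_store)
```

The tradeoff to be able to state without notes: write-through keeps the store authoritative at the cost of slower writes, while write-behind gives fast writes but can lose anything not yet flushed when the cache fails.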
How do I know I'm progressing if I'm self-studying?
The graduation signals at the end of each phase are designed for this. Don't move forward until you can demonstrably do everything in the graduation list. Concretely: if the phase 1 list says "you can argue when to choose Postgres versus Cassandra," try doing it out loud for 60 seconds. If you struggle, you're not done with phase 1 yet, regardless of how many primitive pages you've read. Mock interviews are the strongest signal at phase 3; if you can do 3-5 mocks at your target level with passing feedback, you're ready.
What if I'm preparing for a specific company's interview?
Companies have different interview emphases (some lean into low-level systems, some into product-shaped designs, some into specific patterns). Once you've completed phases 1 and 2, look up company-specific interview guides to target your prep. The underlying content is broadly the same, but knowing whether your interviewer will focus on consistency models, distributed transactions, or product tradeoffs lets you weight your phase 3 practice accordingly.
How important are mock interviews?
Critical at senior and staff levels; helpful but optional at junior and mid-level. Mock interviews against an experienced interviewer surface gaps that self-practice can't: the kinds of follow-up probes interviewers use, time-pressure dynamics you don't simulate when practicing alone, and behavioral patterns under stress (do you talk too fast, freeze on hard questions, fail to clarify). Budget for 3-5 mocks before high-stakes interviews. The cost is real; the leverage is high.
08. Concrete Next Step
The biggest mistake at this point is closing this page, intending to start tomorrow, and never actually starting. Avoid that by committing to one specific action right now.
If you're at phase 1: pick the first primitive (load balancing) and study it for 30 minutes. Read a deep-dive, take a break, then explain load-balancing tradeoffs out loud or in writing without referring back. That single 30-minute exercise is worth more than another hour of planning your study schedule.
If you're already past phase 1: pick the first canonical pattern (social feed) and work through a complete walkthrough actively. Don't read it passively; pause at each step and predict what comes next, then check yourself. That's the phase 2 study mode.
If you're at phase 3: pick a system you used today, set a 50-minute timer, and design it. WhatsApp, Spotify, DoorDash, Notion, whichever. Don't perfect; commit. Compare to reality afterward. Schedule a mock interview within the next two weeks if you haven't already.
The path is straightforward; the only thing that prevents most candidates from finishing it is starting. Start now.