Spoke · Concept Library

System Design Concepts: The Ten Families That Show Up in Interviews

Ten concept families anchor the depth phase of modern system design interviews. This page maps them, names what each one tests, and routes you to the deep-dive guide for each.

By Arslan Ahmad · Last updated May 2026 · Reading time ~20 min

01 · How to Use This Library

The deep-dive phase of a system design interview tests one thing: do you understand a small number of concepts deeply enough to reason about failure, scale, and cost. Two concepts understood deeply beats five mentioned shallowly. Always.

This page is the routing layer for our concept library. It introduces ten concept families, organizes them into three tiers based on where they sit in a typical architecture, and links each to a dedicated deep-dive page. The deep-dive pages are where the real work happens. This page exists to help you decide where to spend that work.

Three ways to use this library, depending on your prep timeline:

  • Eight weeks. Read this page once to orient. Then work through the ten deep-dive guides in tier order: traffic tier, then data tier, then modern tier. Build genuine depth on three or four concept families rather than skimming all ten.
  • Two weeks. Read this page. Pick the four concepts you're weakest on. Read just those deep-dives. Skip the rest.
  • One weekend. Read this page. Read the three concepts most relevant to your target company's typical question types. Skip everything else.

For broader prep guidance see the How to Use This Guide section in the main hub.

02 · The Mental Model: How These Concepts Fit Together

Most concept lists are flat. You get ten things and no sense of how they relate to each other. That's a missed opportunity, because the relationships between concepts are exactly what staff loops grade for in the deep-dive phase.

Almost every distributed system you'll be asked to design has the same shape: requests come in through a traffic tier, get processed by stateless application servers, and read or write to a data tier. Asynchronous work flows between them through queues. Observability sits on top of everything. Modern systems add an AI tier alongside the data tier for embeddings, vector search, and LLM-served features.

The ten concept families map onto this shape:

The Ten Concepts, By Tier

How the ten concept families map onto a typical distributed system:

  • Traffic tier: load balancing, caching, rate limiting, message queues & event buses
  • Data tier: database selection, sharding, replication, search & indexing
  • Modern tier: observability, vector databases, LLM & AI infrastructure

Observability cuts across all three tiers; the placement above is for visual grouping, not architectural strictness.

The traffic tier shapes how requests reach the system. The data tier shapes what gets stored and how it's accessed. The modern tier reflects what's been added to the rubric in the past two years. Observability technically applies across all three.

This is the mental model that ties the concepts together. When you're asked "design X" in an interview, you're effectively being asked to make ten or so decisions across these three tiers. The tiers narrow your decision space. Knowing which concept lives where lets you reason about each component without getting overwhelmed.


03 · Learning Order: Where to Start

Don't try to learn ten concepts at the same depth. Build a foundation, then go deep selectively. Three suggested learning paths, scaled to time available.

If you only have time for three concepts

Choose these. They form the foundation that every other concept depends on.

  1. Caching. Highest-leverage concept in any system. Shows up in every read-heavy design. Read the deep-dive →
  2. Database selection. Almost every question forces a choice here. The depth check on this one is brutal. Read the deep-dive →
  3. Sharding. The moment scale enters the conversation, this concept enters with it. Read the deep-dive →

If you have time for five concepts

Add these to the three above for solid coverage of senior-level interviews.

  1. Replication and consistency. The hardest deep-dive area in most loops. CAP, PACELC, leader-follower vs leaderless. Read the deep-dive →
  2. Load balancing. Often shallow in interviews but critical when the interviewer pushes on session affinity, health checks, or geographic routing. Read the deep-dive →

If you have time for all ten

The full library. Add these to round out coverage, especially for staff-and-above loops.

  1. Message queues and event buses. Read the deep-dive →
  2. Rate limiting. Read the deep-dive →
  3. Search and indexing. Read the deep-dive →
  4. Observability. Heavily graded in 2026, often skipped by candidates. Read the deep-dive →
  5. Vector databases. Required for any AI-adjacent question. Read the deep-dive →
  6. LLM and AI infrastructure. Required for any AI-first company loop. Read the deep-dive →

A Note on AI Concepts

If you're targeting an AI-first company (OpenAI, Anthropic, Mistral, Cohere) or a company that has shipped meaningful AI features (Meta, Google, Microsoft), treat the modern tier concepts as required rather than optional. They are now the most differentiating round at staff and above. Section 5 of the main guide covers why.

04 · Traffic Tier

The traffic tier is everything between the client and the data tier. These concepts shape how requests reach your system, how the system protects itself from overload, and how asynchronous work flows between services.

Traffic & Distribution

Load Balancing

Routing requests across servers without overloading any single one.

Every distributed system starts with a load balancer in front of the application tier. The interesting questions are not "what is a load balancer" but "which algorithm and why": round-robin, least-connections, consistent hashing, weighted, or geographic. At senior level, expect to discuss session affinity, health checks, and failover behavior.

Read the deep dive on load balancing
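The algorithmic difference is easy to make concrete. Here is a minimal sketch of round-robin versus least-connections selection (illustrative class names, not any particular load balancer's API):

```python
import itertools

class RoundRobinBalancer:
    """Cycles through servers in fixed order; simple, but blind to load."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnectionsBalancer:
    """Routes each request to the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a request completes, so the count reflects reality."""
        self.active[server] -= 1
```

Round-robin is stateless and fair by request count; least-connections adapts to uneven request durations, which is why it tends to win when response times vary widely.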

Caching

Trading memory for latency, with cost-of-staleness as the catch.

The highest-leverage optimization in most system designs and the easiest to get wrong. The hard parts are not what to cache but when to invalidate, how to handle stampedes, and how to reason about consistency under cache failures. Write-through, write-behind, cache-aside, and refresh-ahead each have specific tradeoffs you should be able to defend.

See the full guide to caching strategies
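As one concrete point of reference, here is a minimal cache-aside sketch (the `load_fn` callable is a hypothetical stand-in for the database read), showing the pattern's two defining moves: populate on miss, invalidate on write:

```python
import time

class CacheAside:
    """Cache-aside: read the cache, fall back to the source on a miss, then
    populate the cache. A TTL bounds staleness; deleting on write (rather
    than updating in place) sidesteps races between concurrent writers."""
    def __init__(self, load_fn, ttl_seconds=60.0):
        self.load_fn = load_fn        # authoritative source, e.g. a DB read
        self.ttl = ttl_seconds
        self._store = {}              # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]                       # cache hit
        value = self.load_fn(key)                 # miss: go to the source
        self._store[key] = (value, time.monotonic() + self.ttl)
        return value

    def invalidate(self, key):
        """Call after writing to the source of truth."""
        self._store.pop(key, None)
```

Note what this sketch deliberately omits: stampede protection (many concurrent misses on one hot key all hitting `load_fn`) is exactly the failure mode the deep-dive covers.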

Rate Limiting

Protecting the system from itself and from abuse.

Comes up in almost every infrastructure question and many product questions. The conceptual depth is in the algorithms (token bucket, leaky bucket, sliding window) and the operational reality (where to enforce, what to do when the limit is hit, how to differentiate legitimate spikes from abuse).

Explore rate-limiting algorithms in depth
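Token bucket is the algorithm most worth being able to sketch from memory. A minimal version (the optional `now` parameter exists only to make the refill logic deterministic in tests):

```python
import time

class TokenBucket:
    """Token bucket: tokens refill at `rate` per second up to `capacity`.
    A request is allowed if a whole token is available, so short bursts up
    to `capacity` pass while the long-run rate stays capped at `rate`."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

The burst-versus-sustained distinction falls out of the two parameters: `capacity` bounds the burst, `rate` bounds the average.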

Message Queues & Event Buses

Decoupling producers from consumers, asynchronously.

Kafka, SQS, Pub/Sub, RabbitMQ, NATS. The interview question is rarely "which one." It's "do you need ordering guarantees, what's the durability requirement, how do you handle poison messages, what's your retry strategy." Knowing when not to introduce a queue is a senior signal.

Compare message queues and event buses
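The poison-message and retry questions can be made concrete with a small sketch. This toy consumer uses plain Python lists as stand-ins for a real queue and dead-letter queue; a production consumer would add backoff between retries and idempotent handlers:

```python
def consume(queue, handler, dead_letter, max_attempts=3):
    """Pop messages, retry failures, and park poison messages (those that
    keep failing) in a dead-letter queue instead of blocking the stream."""
    while queue:
        msg = queue.pop(0)
        msg["attempts"] = msg.get("attempts", 0) + 1
        try:
            handler(msg["body"])
        except Exception:
            if msg["attempts"] >= max_attempts:
                dead_letter.append(msg)   # poison: park for human inspection
            else:
                queue.append(msg)         # requeue; real systems back off here
```

The dead-letter queue is the senior signal: without it, one malformed message can stall a partition forever.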

What the traffic tier tests

Traffic-tier concepts are usually graded for breadth at senior level and for depth at staff level. You should be able to discuss any of the four briefly without breaking a sweat. In the deep-dive phase, the most common probe areas are cache invalidation strategies and message-queue ordering guarantees.

If you can only learn one traffic-tier concept deeply, learn caching. It shows up in every read-heavy product design and the failure modes (stampedes, invalidation correctness, cache inconsistency under failure) are the kind of thing that separates "I've read about it" from "I've operated it."

05 · Data Tier

The data tier is where state lives. These concepts shape what gets stored, how it's distributed, what consistency model you commit to, and how it's queried. The data tier is where most senior-level deep dives end up, because the interesting failure modes and tradeoffs cluster here.

Data & State

Database Selection

Choosing the right store for the data shape, not the brand name.

SQL versus NoSQL is not the question anymore. The real question is: what's the access pattern, what's the consistency requirement, and what's the failure tolerance. Postgres, DynamoDB, Cassandra, Spanner, MongoDB, ScyllaDB, and others each have specific use cases. You should be able to pick one and defend the choice in three sentences.

Walk through database selection tradeoffs

Sharding & Partitioning

Splitting data across machines without breaking the application.

Once data exceeds a single machine's capacity, you shard. The hard problems are choosing the shard key (range, hash, geographic, tenant), handling resharding without downtime, and dealing with hot keys. Consistent hashing shows up in almost every loop that touches scale.

Read about sharding strategies and pitfalls
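Consistent hashing is worth sketching once by hand. A minimal ring with virtual nodes (MD5 is used here only as a cheap, stable hash, not for security):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Consistent hashing: nodes and keys hash onto a ring, and a key is
    owned by the first node clockwise from it. Virtual nodes smooth the
    key distribution across physical nodes."""
    def __init__(self, nodes, vnodes=100):
        self._ring = []                       # sorted (hash, node) pairs
        for node in nodes:
            for i in range(vnodes):
                h = self._hash(f"{node}#{i}")
                bisect.insort(self._ring, (h, node))

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]
```

The property interviewers probe: adding a node moves only roughly 1/N of the keys, versus nearly all of them under naive `hash(key) % N`.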

Replication & Consistency

Making data available everywhere while keeping it correct.

Synchronous versus asynchronous replication. Leader-follower versus multi-leader versus leaderless. CAP and PACELC. Strong, eventual, and bounded staleness consistency models. This is the area where most candidates know the names but cannot reason about the tradeoffs. Practice this one specifically.

Dig into replication and consistency models
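One piece of this area reduces to a formula worth knowing cold: with N replicas, writes acknowledged by W replicas and reads querying R replicas are guaranteed to overlap on at least one up-to-date copy exactly when R + W > N. A one-liner plus the common N = 3 configurations:

```python
def is_strongly_consistent(n, w, r):
    """Quorum overlap: every read intersects at least one replica that
    acknowledged the latest write iff R + W > N."""
    return r + w > n

# Common configurations for N = 3:
#   W=2, R=2 -> quorum reads and writes: overlap guaranteed
#   W=1, R=1 -> fast on both paths, but only eventually consistent
#   W=3, R=1 -> fast reads; writes block on every replica being up
```

The formula is necessary but not sufficient for linearizability in real systems (sloppy quorums and concurrent writes complicate it), which is exactly the kind of caveat the deep-dive probe looks for.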

Search & Indexing

Finding what users want, fast, at scale.

Full-text search, inverted indexes, Elasticsearch and OpenSearch architecture, ranking signals. In 2026, this concept now overlaps heavily with vector search for semantic queries; expect interviewers to push you toward hybrid (keyword plus semantic) approaches when the question involves any kind of recommendation or search-with-ranking.

Understand search systems and inverted indexes
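The core data structure is simple enough to sketch in a few lines. A toy inverted index with AND-semantics lookup; real engines add tokenization, stemming, ranking, and compressed postings lists on top of exactly this shape:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Inverted index: map each term to the set of document ids containing
    it, so a multi-term query becomes set intersection instead of a scan."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """AND semantics: return docs containing every query term."""
    results = None
    for term in query.lower().split():
        postings = index.get(term, set())
        results = postings if results is None else results & postings
    return results or set()
```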

What the data tier tests

The data tier is where most loops are decided. The depth check on these four concepts is significantly harder than on the traffic tier. Interviewers can push deeper on database selection (every choice opens new tradeoff territory) and on replication and consistency (where most candidates have surface-level knowledge but stumble on specifics).

If you can only learn two data-tier concepts deeply, learn database selection and sharding. They cover the most ground and the depth probes on both are predictable. Replication and consistency is the third you should aim for; search and indexing has narrower applicability but matters for any recommendation or feed-shaped question.

06 · Modern Tier

The modern tier is what was added to the system design rubric between 2023 and 2026. These concepts barely existed in pre-2024 prep material. They are now central. If you're prepping with material from before 2024, this is the section that's most likely to be missing.

Modern Tier · New in 2026 Rubrics

Observability

The fourth pillar that became required at senior level.

Metrics, logs, traces, alerts. The shift in 2026 is that observability is no longer a bonus topic. If you finish a 45-minute design without addressing how the system gets monitored, where logs go, or how on-call engineers will debug it, you've left explicit rubric points on the table. Treat observability as a first-class component of any design.

Build observability into system designs

Vector Databases

Storing and querying embeddings for AI-era systems.

pgvector, Pinecone, Weaviate, Milvus, ScaNN. Vector databases store high-dimensional embeddings and support similarity search rather than exact match. They show up in any question involving recommendations, semantic search, RAG, or anomaly detection. Understand the index types (HNSW, IVF, PQ) and the recall-latency tradeoffs.

Get up to speed on vector databases
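What those index types approximate is worth seeing exactly once. Brute-force cosine k-NN is the ground truth that HNSW, IVF, and PQ trade recall against in exchange for latency:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest(query, vectors, k=2):
    """Exact k-NN by exhaustive scan: O(number of vectors) per query,
    which is why large corpora need approximate indexes instead."""
    scored = sorted(vectors.items(),
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]
```

An approximate index answers the same question but may miss some true neighbors; "recall@k against brute force" is how that loss is measured.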

LLM & AI Infrastructure

Serving generative models in production systems.

LLM gateways, semantic caching, prompt-template management, retrieval-augmented generation flows, embedding pipelines, token budgets, latency strategies, fallback handling. This category did not exist in pre-2024 system design rubrics. It is now central. Even when the question is not AI-specific, surfacing relevant AI considerations signals current-era awareness.

Learn AI infrastructure for system design
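Semantic caching is the most sketchable item on that list. A toy lookup over (embedding, response) pairs; the 0.95 threshold and the raw-list representation are illustrative assumptions, not a recommendation for production:

```python
import math

def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def semantic_cache_lookup(cache, query_embedding, threshold=0.95):
    """Semantic cache: return the cached LLM response whose prompt embedding
    is most similar to the query, if similarity clears the threshold.
    Returns None on a miss, in which case the caller pays for a real model
    call and appends (embedding, response) to the cache."""
    best_response, best_score = None, threshold
    for embedding, response in cache:
        score = _cosine(query_embedding, embedding)
        if score >= best_score:
            best_response, best_score = response, score
    return best_response
```

Unlike an exact-match cache keyed on the prompt string, this catches paraphrases, which is where most of the token-budget savings come from; the cost is a tunable risk of returning a stale or subtly wrong answer.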

What the modern tier tests

The modern tier is where most candidates are weakest, because most prep material doesn't cover it. The good news: the bar is lower here than on traffic and data, because interviewers know the concepts are new. Surfacing them at all already differentiates you from the candidate who never mentions them.

The bar is rising fast, though. By late 2026, expect the modern tier to be graded at parity with traffic and data. If you're prepping for a loop in the next twelve months, treat these concepts as required, not optional.

Observability is the easiest of the three to internalize and the highest-leverage to mention in interviews, because it can be added to any design without restructuring it. Vector databases and LLM infrastructure require you to be answering an AI-adjacent question for them to show up.

07 · How These Concepts Interact

Concepts don't exist in isolation. The reason staff loops grade so heavily on cross-system reasoning is that real architectures involve constant tension between concepts. Here are the most important interactions you'll encounter in interviews.

Caching × Database Selection

The right cache strategy depends heavily on the database underneath. A read-replica-heavy Postgres setup needs different caching than a Cassandra cluster with eventual consistency. The naive answer ("add Redis in front") collapses if the interviewer pushes on consistency or invalidation.

Sharding × Replication

You can shard then replicate, replicate then shard, or do both. Each combination has different failure modes. The interaction is often the depth-check question at staff level: "if a shard fails, what does your replication strategy do, and how does the application know?"

Rate Limiting × Load Balancing

Where you enforce rate limits depends on your load balancing strategy. If you're using consistent hashing, you can rate-limit per shard. If you're using round-robin, you need a centralized rate limiter. The choice cascades through the architecture.

Search × Vector Databases

In 2026, search systems often do hybrid retrieval: keyword search through an inverted index plus semantic search through a vector store, with ranking that combines both. Treating them as separate concepts is increasingly outdated.

Message Queues × Replication

Queue durability guarantees interact with database replication strategy. If your queue commits before the database has replicated, you can lose work on a region failure. The interesting questions are about the boundaries between systems, not within them.

Observability × Everything

Observability is the connective tissue. Every concept in this library has its own observability requirements: cache hit rate, replication lag, queue depth, rate limit rejections. Strong candidates name the metrics they'd want for each component as they design it.

These six interactions are not exhaustive. Every pair of concepts has some interaction. The point is not to memorize them. The point is to recognize that when an interviewer asks a follow-up question that crosses concept boundaries, they're testing exactly this: whether you can reason about the system as a system, not as a list of components.

08 · How Concepts Show Up in Interviews

The concepts in this library don't appear by name in interview questions. Nobody asks "tell me about caching." They appear implicitly: a question shape forces certain concepts into the conversation, and your job is to recognize the forced concepts, raise them, and reason about them.

Three patterns to internalize:

Read-heavy product designs force traffic-tier concepts

Twitter, Instagram, YouTube, news feeds. Read-heavy means caching, load balancing, and timeline assembly become the load-bearing decisions. The data tier still matters but is secondary. You should expect caching to be a deep-dive area in any read-heavy product design.

Infrastructure questions force data-tier concepts

Design a key-value store. Design a rate limiter. Design a distributed cache. These questions strip away the product-tier concerns and force the conversation directly to consistency, replication, sharding, and partitioning. Expect the deep dive to land on consistency models and shard-key choice.

AI-adjacent questions force modern-tier concepts

Design a RAG service. Design a recommendation feed with LLM-generated summaries. Design a vector search service. These questions require modern-tier fluency. Expect deep dives on token budgets, LLM fallback strategies, vector index choice, and the latency budget across the AI tier.

For the full mapping of question types to concept families, see the question patterns section in the main guide. The framework page covers how to handle each question type using the four-step framework.

09 · How to Practice With This Library

Reading concept pages without practicing is the most common failure mode. Concepts internalize through use, not through recall. Three concrete practice patterns:

  1. Concept-to-question. Pick a concept. Pick a question that forces it. Run through the question with the framework, deliberately going deep on that concept. Then pick the same concept and a different question. Repeat. You're learning to recognize where the concept applies, not just what it is.
  2. Question-to-concept. Pick a question. Without solving it, list which concepts the question forces. Why those? Where does each show up in the design? Where would a deep dive land? This trains the recognition skill that makes the deep-dive phase feel natural.
  3. Failure-mode walks. For each concept you've studied, write down three failure modes and what the runbook would look like. This is the operational maturity signal that 2026 senior loops grade explicitly. The act of writing the runbook turns conceptual knowledge into operational fluency.

The third practice pattern is the most underrated. Most candidates can describe a concept. Far fewer can describe what the concept looks like at 3am when it's broken. The latter is what gets you hired.

Most candidates can describe a concept. Far fewer can describe what the concept looks like at 3am when it's broken. The latter is what gets you hired.

Continue

Question Pattern Walkthroughs →

Once you've built depth on the concepts, see them in action across the four major question categories: classic product designs, infrastructure questions, AI-adjacent questions, and correctness-and-operational designs.

One-Stop Portal For Tech Interviews.
Copyright © 2026 Design Gurus, LLC. All rights reserved.