
Strong vs Eventual Consistency


consistency · strong consistency · eventual consistency · distributed systems

hard · 21 min · Updated Mar 2025

Introduction

In distributed systems and databases, data consistency refers to the property that all copies of data (on different servers or nodes) reflect the same state. When multiple users or services access data spread across servers, we want them all to see the same values. Consistency is important because if one part of the system has outdated information while another has new information, it can lead to errors or confusion. For example, imagine an online store with inventory data replicated across servers: if one server thinks an item is in stock and another thinks it’s sold out, you might accidentally sell a product that isn’t actually available. Ensuring consistency means avoiding such conflicts by keeping data synchronized across the system. However, achieving perfect consistency in a distributed environment can be challenging – especially as systems grow and need to stay fast and available. This is where different consistency models come into play, the most common being strong consistency and eventual consistency.

What is Strong Consistency?

Strong consistency (sometimes called strict consistency or linearizability) guarantees that any read of data will return the most recent write. In other words, once a write operation completes, all subsequent reads (from any user or any location) will see that update. There are no “stale” (out-of-date) reads under strong consistency – every read gets the latest committed data. To achieve this, systems often use synchronous replication: when data is written, it is simultaneously updated on multiple nodes, and the write is only confirmed when all nodes (or a majority, depending on the design) have the new data. This ensures that no matter which node a read comes from, the data is up-to-date.

Figure: Simplified illustration of strong consistency

In a strongly consistent system, a write operation is propagated to all replicas before it is considered successful. The diagram above shows a user writing a value to the US West database node, which then starts a transaction to update the US East and Europe nodes. The client only gets an “OK” after the data is committed everywhere. During this process, reads of that data are blocked on other nodes (and even on the writer node until commit) to avoid exposing partial updates. This adds some delay (e.g., the network latency between regions) and can cause concurrent writes to be serialized or one of them to fail (as shown by the second user’s write being blocked). The benefit is that any user reading after the commit will get the latest value, ensuring accuracy and eliminating stale data.

Because every node agrees on the data before moving on, strong consistency gives a very intuitive guarantee: it feels like there is a single up-to-date copy of the data that everyone is reading from. The classic analogy is a banking transaction. When you withdraw money from an ATM, the banking system should update your account balance immediately on all servers. If you had $500 and withdrew $100, any bank teller or ATM anywhere should now see $400, not the old $500. A strongly consistent system ensures that once your withdrawal is processed, all future views of your balance reflect the withdrawal. This prevents issues like spending the same money twice or overdrawing due to inconsistent information.

From a technical perspective, many traditional databases and some modern distributed databases provide strong consistency. For example, Google Spanner (a globally distributed SQL database) is designed so that reads across data centers still return the most up-to-date data. In a strongly consistent setup with a primary/leader database and replicas, a common approach is that all writes go through the primary and are synchronously replicated to secondaries. Only after the replicas acknowledge the update will the system consider the transaction committed. Thus, any read (often reads are either served by the primary or from replicas that have applied all updates) will see the latest data. This guarantee simplifies reasoning about data – developers don’t have to worry about reading stale information – but it comes with a cost.
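The primary-and-replica flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Spanner or any particular database implements it: `Replica` and `SyncPrimary` are hypothetical in-memory classes, and a simple reachability check stands in for a real commit protocol.

```python
class Replica:
    """In-memory stand-in for one database node (hypothetical)."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.data = {}


class SyncPrimary:
    """Primary that confirms a write only once every replica has it."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.data = {}

    def write(self, key, value):
        # Simplification: check reachability up front instead of running a
        # real commit protocol with acknowledgements and rollback.
        if not all(replica.reachable for replica in self.replicas):
            return False  # refuse the write rather than let copies diverge
        for replica in self.replicas:
            replica.data[key] = value
        self.data[key] = value
        return True  # only now does the client get its "OK"


east, europe = Replica("us-east"), Replica("europe")
primary = SyncPrimary([east, europe])

print(primary.write("balance", 400))  # True: both replicas now hold 400
europe.reachable = False              # simulate a lost link to Europe
print(primary.write("balance", 300))  # False: write refused, 400 stays everywhere
```

The second write shows the cost that comes with the guarantee: with one replica unreachable, this sketch rejects the update instead of letting the copies disagree.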

Trade-offs of Strong Consistency: The main downsides of strong consistency are higher latency and reduced availability. Because a write must be confirmed by multiple nodes, it typically takes longer to complete. If you have servers in New York and London and you update a record, you might have to wait for the update to travel across the ocean and back before the user gets confirmation. This means writes (and sometimes reads) are slower compared to a system that doesn’t wait on multiple replicas. In practice, strong consistency sacrifices some performance and potentially availability for the sake of accuracy. If one of the replicas is down or there is a network partition, a strongly consistent system might block updates (or even reads) rather than risk inconsistency. For instance, if the network link to one data center is lost, the system may stop accepting writes to maintain consistency (trading availability for consistency). In summary, strong consistency guarantees that reads never return stale data, at the expense of response time and sometimes availability during failures. This model is often chosen for systems where correctness is critical; we’ll discuss use cases in a moment.
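To make the latency cost concrete, here is a small back-of-the-envelope sketch. It assumes the primary contacts all replicas in parallel and ignores local processing time. The function name `commit_latency_ms` and the round-trip times are hypothetical values chosen for illustration, not measurements.

```python
def commit_latency_ms(replica_rtts_ms, acks_required):
    """Time until `acks_required` replicas have acknowledged a write,
    assuming the primary contacts all replicas in parallel."""
    if not 0 <= acks_required <= len(replica_rtts_ms):
        raise ValueError("acks_required must be between 0 and the replica count")
    if acks_required == 0:
        return 0  # asynchronous: confirm without waiting for any replica
    # The write completes when the slowest of the required acks arrives.
    return sorted(replica_rtts_ms)[acks_required - 1]


# Hypothetical round-trip times from a New York primary, in milliseconds.
rtts = [2, 75, 150]  # nearby replica, London, a more distant region

print(commit_latency_ms(rtts, 3))  # 150: wait for every replica
print(commit_latency_ms(rtts, 2))  # 75: wait for the two fastest
print(commit_latency_ms(rtts, 0))  # 0: do not wait at all
```

Under these assumptions, the slowest required acknowledgement sets the write latency, which is why waiting on every replica is the most expensive option.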

What is Eventual Consistency?

Eventual consistency is a looser model of consistency that guarantees that if no new updates are made to a piece of data, all copies of that data will eventually become consistent (the same). The key word here is “eventually.” It doesn’t promise immediate consistency, only that given enough time (and no more changes), all nodes will converge to the latest value. In an eventually consistent system, it’s possible to read stale data temporarily, but over time all the nodes catch up to reflect the last write.

How does this work in practice? Eventually consistent systems use asynchronous replication. When you write to one node, that node applies the update and then propagates the change to other nodes in the background, but it doesn’t wait for all of them to acknowledge the update before reporting success to the client. The write is considered successful as soon as it is stored on that one node (or on a small subset of nodes, but not necessarily all), and the other replicas receive the update shortly after. This means that if you immediately read from another replica, you might not see the update and would get an older value instead. The system accepts this because the replicas keep exchanging updates in the background and will eventually sync up. As one article put it, eventual consistency is an optimistic approach: it responds quickly to operations, at the risk of returning stale data to some readers in the short term. You trade instantaneous accuracy for speed and for availability during network partitions. The upside is that performance is typically much better: reads and writes are fast since they often hit a local node and don’t have to wait for global coordination.

Figure: Simplified illustration of eventual consistency

In an eventually consistent system, a write is confirmed without waiting for all replicas. The above diagram shows a user writing a value to the US West database, which immediately returns “OK” after storing the new value locally. That update (x = 10 in this example) is then asynchronously replicated to other regions (US East, Europe) over time. During that time, another user reading from the Europe database might get the old value (stale data) until the update arrives. The note “Data replication is asynchronous” highlights that the changes flow in the background. Eventually, all users in every region will see x = 10, but there is a window where different users see different data.
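The scenario in the diagram can be reproduced with a minimal Python sketch. `AsyncPrimary` is a hypothetical in-memory class, and its `replicate()` method stands in for the background replication that a real system would run continuously. The earlier value `x = 5` is an assumption added so that the stale read has something to return.

```python
from collections import deque


class AsyncPrimary:
    """Node that confirms writes locally and replicates them later."""

    def __init__(self, replica_names):
        self.data = {}
        self.replicas = {name: {} for name in replica_names}
        self.pending = deque()  # updates not yet shipped to the replicas

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return "OK"  # confirmed before any replica has the update

    def read_from(self, replica_name, key):
        return self.replicas[replica_name].get(key)

    def replicate(self):
        # Stands in for the background replication process.
        while self.pending:
            key, value = self.pending.popleft()
            for store in self.replicas.values():
                store[key] = value


us_west = AsyncPrimary(["us-east", "europe"])
us_west.write("x", 5)
us_west.replicate()                      # every region now has x = 5

us_west.write("x", 10)                   # returns "OK" immediately
print(us_west.read_from("europe", "x"))  # 5: stale read during the window
us_west.replicate()
print(us_west.read_from("europe", "x"))  # 10: replicas have converged
```

The gap between the write returning and `replicate()` running is the inconsistency window in which readers in other regions can still see the old value.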

A real-world analogy for eventual consistency is a social media timeline or news feed. When you post a new photo, not all of your friends might see it at the exact same moment. Perhaps the data center serving your friend hasn’t received your update yet, so for a short time, one friend’s app still shows an old timeline (missing your photo) while others see the new post. After a little while (maybe a few seconds or more), everyone’s feed catches up and shows the same latest posts. That’s eventual consistency: some delay is acceptable as long as the system becomes consistent later. The system prioritizes availability and responsiveness – you were able to post quickly, and your friend could continue browsing (even if it was an outdated view) – over immediate synchronization. Many large-scale web systems work this way. For example, social networks and content distribution networks prefer to deliver content quickly from nearby servers even if it might be slightly out-of-date, rather than always waiting on a central source. Another everyday example is DNS (Domain Name System). When a website’s IP address changes, that update takes time to propagate to DNS servers around the world. During that propagation, some users might still be directed to the old IP (stale data), but eventually all DNS servers get the new record. This DNS behavior is eventually consistent by design.

In databases, many NoSQL and distributed databases use eventual consistency to achieve high performance and partition tolerance. For instance, Amazon’s DynamoDB and Apache Cassandra default to eventual consistency, meaning they do not guarantee that a read will see the very latest write unless you explicitly ask for stronger consistency. These systems allow temporary inconsistencies in exchange for being highly available and scalable across many nodes and regions. Under the hood, they handle conflict resolution (for example, if two updates occur on different replicas concurrently, they might use strategies like "last write wins" or merge the changes). The expectation is that conflicts are either rare or tolerable, and the system will resolve them to converge to a single state eventually. The big advantage is that even if some nodes are down or slow, the system can still accept writes and reads on the others, making it fault-tolerant and fast. The downside is, as noted, you might read data that isn’t the absolutely latest value if you happen to hit a lagging replica.
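As a rough illustration of the "last write wins" strategy mentioned above, the sketch below merges two replica states by keeping the entry with the newer timestamp for each key. The function name `lww_merge`, the integer timestamps, and the sample data are assumptions for illustration. Real databases attach timestamps and break ties in their own ways.

```python
def lww_merge(a, b):
    """Merge two replica states that map key -> (timestamp, value).

    For each key, the entry with the larger timestamp wins. Equal
    timestamps fall back to comparing the values, so the result does
    not depend on the order of the arguments.
    """
    merged = dict(a)
    for key, entry in b.items():
        if key not in merged or entry > merged[key]:
            merged[key] = entry
    return merged


# Hypothetical states of two replicas after concurrent updates.
us_west = {"status": (105, "shipped")}
europe = {"status": (103, "packed"), "note": (101, "gift")}

print(lww_merge(us_west, europe))
# {'status': (105, 'shipped'), 'note': (101, 'gift')}
```

Note that last write wins silently discards the losing update ("packed" here), which is acceptable only when such conflicts are rare or tolerable, as the text says.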

Trade-offs of Eventual Consistency: The mantra for eventual consistency is “fast and available, but maybe not up-to-date.” Clients get responses quickly, and the system remains operational even during network partitions or node failures (each node can operate somewhat independently). However, the cost is temporary inconsistency: different users might see different data for a short time. If your application can tolerate that (for example, showing a Like count on a post that lags a few seconds or minutes behind the true value), then eventual consistency can dramatically improve throughput and fault tolerance. But if your application cannot tolerate any divergence in views (e.g., an account balance or a critical configuration setting), eventual consistency might cause problems. In short, eventual consistency favors availability, partition tolerance, and low latency over immediate accuracy. It's a good choice when speed and uptime matter more than perfect synchronization at every moment.

Comparison: Strong vs. Eventual Consistency

Now that we’ve defined both models, let’s compare strong and eventual consistency side by side. Each model has its own guarantees and ideal use cases. Below is a summary of key differences:

| Aspect | Strong Consistency | Eventual Consistency |
| --- | --- | --- |
| Data Guarantee | Every read gets the latest write (no stale data). The system behaves like a single up-to-date copy of data. | Reads might be stale right after a write; data updates propagate over time until all nodes sync up. Eventually, all copies become consistent. |
| Read/Write Latency | Higher latency: operations are slower because they wait for coordination (e.g., confirmations from other nodes). Users might notice a slight delay on writes or reads that require global agreement. | Lower latency: operations are fast because they don’t wait on worldwide consensus. The system responds quickly using local data, at the risk of returning outdated info. |
| Availability | Availability can be reduced in the face of network issues. If some replicas or a network link are down, the system may refuse reads/writes (better to be unavailable than inconsistent). | High availability: the system can operate even if some nodes are down or disconnected. Each node can accept writes and reads independently (but might serve old data until reconnection). |
| Consistency Level | Strong (strict): absolute consistency at all times. Think “all-or-nothing” for updates: either everyone sees it or no one does until it's ready. | Weak (eventual): allows temporary inconsistency. Think “sooner or later everyone will see it, but maybe not instantly.” |
| Use Cases | Critical data where accuracy is paramount: financial transactions, banking systems, inventory management, booking systems, and other scenarios where an outdated read could cause serious problems. Also common in traditional SQL databases and strongly consistent distributed systems (e.g., Spanner, etcd). | High-volume or globally distributed systems where speed and uptime matter more than immediate consistency: social media feeds, messaging apps, content delivery networks (CDNs), cache systems, analytics data, and some e-commerce features (like product view counts or recommendations). Often used by NoSQL databases (Cassandra, DynamoDB, etc.) under the BASE philosophy (Basically Available, Soft state, Eventual consistency). |

Note: These are the two extremes of a spectrum. In between, there are many intermediate consistency models (like “read-your-writes”, “causal consistency”, “bounded staleness”, etc.) that provide trade-offs between the strictness of strong consistency and the freedom of eventual consistency. Some databases (for example, Azure Cosmos DB or MongoDB) let you choose from multiple consistency levels depending on your needs. But for simplicity, strong vs. eventual are useful mental anchors for the ends of the spectrum.

When to Use Which?

Choosing the right consistency model depends on your application’s requirements and what trade-offs you’re willing to make. Here are some guidelines to help decide:

  • Use Strong Consistency when correctness is critical: If your application cannot tolerate any divergence or stale reads, strong consistency is the way to go. This is often true for financial systems (banking, stock trading, payment processing) where every operation must see the latest state to prevent errors or fraud. For example, in an online banking app or an e-commerce checkout system, it’s crucial that once a payment is made or an item is sold, all parts of the system immediately reflect that change. Strong consistency is also important for things like inventory management (to avoid selling the last item twice), reservation/booking systems (to avoid double-booking a hotel room or airline seat), or any scenario involving balances, counts, or one-time use codes. If your users expect that once they perform an action, everyone (and every service) will see the result instantly and reliably, lean towards strong consistency. Keep in mind you may need to invest in more robust infrastructure (and possibly accept lower throughput) to achieve this level of consistency.

  • Use Eventual Consistency when you need high availability, low latency, and can tolerate slight delays in synchronization: For many applications, it’s okay if data is not perfectly in sync at every moment. If showing slightly out-of-date information won’t break the user experience or business logic, eventual consistency can dramatically improve performance and resilience. Social networks are a prime example: it's better to allow users to post and interact quickly than to pause the whole system to make sure everyone’s view is perfectly updated in real time. Minor inconsistencies (like a friend count that updates a few seconds later) are acceptable in that context. Content delivery and caching systems also prefer eventual consistency: it’s more important that content is delivered fast from a nearby server than that it is absolutely up-to-the-moment (think of how web caches might serve slightly old content to avoid slow fetches). Messaging apps and collaborative platforms often use eventual consistency under the hood for things like delivering messages or updates. As long as every message arrives and the final order is correct, the system can tolerate brief reordering or delays in reflecting message states (though these systems often add extra logic to handle ordering). If your application runs across multiple data centers or needs to stay available even when parts of the network fail, eventual consistency is a robust choice. It will keep working (with some delays) rather than shutting down. As a rule of thumb, ask: “Is it okay if two different users see a different state of data for a brief time?” If yes, eventual consistency might be suitable and will give you better latency and partition tolerance.

  • Consider a mix or tunable approach: In many real-world systems, it’s not an all-or-nothing decision. You might identify certain operations or data that require strong consistency and others that can be eventually consistent. For instance, a shopping website might use strong consistency for processing payments and inventory updates, but eventual consistency for showing product recommendations or the number of items left in stock (since that count could be a few seconds stale without harm). Some databases offer tunable consistency, where each read or write can choose a consistency level. For example, you could do a strongly consistent read for a critical piece of data and eventual (faster) reads for less critical data. Designing your system with this in mind can give you a good balance: strong consistency where absolutely needed, eventual where it’s safe. As one source notes, the choice “obviously depends on the type of application” – use the model that fits your needs for accuracy vs. performance. It’s a business decision as much as a technical one: decide what matters more for each part of your system (freshness or speed) and choose accordingly.
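One common way tunable consistency is expressed, in Dynamo-style quorum systems, is through three numbers: N replicas per item, W acknowledgements required for a write, and R replicas consulted for a read. When R + W > N, every read quorum shares at least one replica with every write quorum, so a read can find the latest acknowledged write. The sketch below only checks that rule; the function name `quorums_overlap` is hypothetical, and real systems add further caveats (for example around failures and concurrent writes).

```python
def quorums_overlap(n, w, r):
    """Return True when every read quorum must intersect every write quorum.

    n: replicas per item, w: acks required per write, r: replicas read.
    """
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must each be between 1 and n")
    return r + w > n


# With three replicas per item:
print(quorums_overlap(3, 2, 2))  # True: majority writes and majority reads
print(quorums_overlap(3, 1, 1))  # False: fastest, but reads may be stale
print(quorums_overlap(3, 3, 1))  # True: slow writes, fast consistent reads
```

This lets one application choose per operation, for example overlapping quorums for payments and inventory, and single-replica reads for recommendations.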

Conclusion

Strong consistency offers a guarantee that everyone sees the same data at the same time, which simplifies development and ensures correctness – at the cost of speed and sometimes availability. Eventual consistency relaxes the rules, allowing temporary differences in data across the system, which boosts performance, scalability, and fault tolerance – at the cost of requiring the application to tolerate out-of-sync data for a little while. There is no one “better” model; it truly comes down to what your application needs. If you’re building something like a banking ledger or a system where accuracy is non-negotiable, strong consistency is worth the trade-off. If you’re building a social app, a content feed, or a service that must stay up no matter what, eventual consistency is often the pragmatic choice.

When designing your system, carefully evaluate how critical each piece of data is. Ask questions like: What’s the impact if a user sees stale data for a moment? Will it harm the experience or business? If yes, you should lean towards a strongly consistent approach for that data. If no, you can reap the benefits of eventual consistency. Often, the best architectures use a combination of both, applying strong consistency for core transactions and eventual consistency for derived or less critical data.
