On this page
How CRDTs Work: The Core Idea
Commutative
Associative
Idempotent
Two Approaches
The Four Essential CRDT Types
G-Counter (Grow-Only Counter)
PN-Counter (Positive-Negative Counter)
G-Set (Grow-Only Set)
OR-Set (Observed-Remove Set)
LWW-Register (Last-Writer-Wins Register)
Delta-Based CRDTs: The Practical Middle Ground
Where CRDTs Are Used in Production
Trade-Offs and Limitations
CRDTs vs Other Approaches
CRDTs vs Consensus (Paxos/Raft)
CRDTs vs Last-Writer-Wins (LWW)
CRDTs vs Operational Transformation (OT)
How This Shows Up in System Design Interviews
Common Mistakes
Conclusion: Key Takeaways
CRDTs Explained: Conflict-Free Replicated Data Types for Eventual Consistency


On This Page
How CRDTs Work: The Core Idea
Commutative
Associative
Idempotent
Two Approaches
The Four Essential CRDT Types
G-Counter (Grow-Only Counter)
PN-Counter (Positive-Negative Counter)
G-Set (Grow-Only Set)
OR-Set (Observed-Remove Set)
LWW-Register (Last-Writer-Wins Register)
Delta-Based CRDTs: The Practical Middle Ground
Where CRDTs Are Used in Production
Trade-Offs and Limitations
CRDTs vs Other Approaches
CRDTs vs Consensus (Paxos/Raft)
CRDTs vs Last-Writer-Wins (LWW)
CRDTs vs Operational Transformation (OT)
How This Shows Up in System Design Interviews
Common Mistakes
Conclusion: Key Takeaways
What This Blog Covers
- What CRDTs are and the problem they solve
- State-based vs operation-based CRDTs
- The four essential CRDT data types with concrete examples
- Where CRDTs are used in production
- Trade-offs, limitations, and how to discuss CRDTs in system design interviews
Two users are editing a shared shopping list.
User A adds "milk" on their phone while offline at the grocery store.
User B adds "eggs" on their laptop at home.
When both devices reconnect, the system needs to merge these changes.
The correct result is a list containing both "milk" and "eggs."
This is trivial if you have a single server that serializes all writes. But what if the system is replicated across multiple regions and both users wrote to different replicas? What if a network partition separated the replicas while the writes happened? You need a conflict resolution strategy.
The traditional approaches have well-known problems.
Last-writer-wins picks one version and discards the other, so either "milk" or "eggs" is silently lost.
Manual conflict resolution surfaces both versions to the user and asks them to choose, which is disruptive and confusing.
Consensus (Paxos, Raft) provides strong consistency but blocks writes during partitions, sacrificing availability.
CRDTs take a radically different approach. Instead of preventing concurrent writes or resolving conflicts after the fact, CRDTs are data structures that are mathematically guaranteed to merge without conflicts.
Any two replicas of a CRDT, no matter how they diverged, can always merge into a consistent state without coordination and without data loss.
The merge is automatic, deterministic, and requires no special conflict resolution code.
This guarantee is called strong eventual consistency (SEC): replicas that have received the same set of updates will be in the same state, regardless of the order in which they received those updates. SEC is stronger than plain eventual consistency because it guarantees convergence as soon as the updates are delivered, not "eventually at some unspecified future time."
For understanding why the distinction between eventual consistency and strong eventual consistency matters, Why "Eventually Consistent" Is the Most Dangerous Phrase in System Design Interviews covers the pitfalls of imprecise consistency reasoning.
How CRDTs Work: The Core Idea
The mathematical property that makes CRDTs work is that their merge operation forms a semilattice.
In practical terms, this means the merge function has three properties:
Commutative
merge(A, B) = merge(B, A). The order in which you merge two states does not matter.
Associative
merge(merge(A, B), C) = merge(A, merge(B, C)). The grouping of merges does not matter.
Idempotent
merge(A, A) = A. Merging a state with itself has no effect.
These three properties guarantee that no matter what order replicas receive updates, no matter how many times they merge, they always converge to the same final state.
There is no scenario where two replicas can diverge permanently.
The math makes conflict resolution unnecessary by construction.
Two Approaches
- State-based CRDTs (CvRDTs): Each replica maintains its full state. When replicas synchronize, they exchange their complete states and merge them using the merge function. The advantage is simplicity: the only requirement from the network is some form of gossip protocol. The disadvantage is bandwidth: sending the entire state on every sync can be expensive for large data structures.
- Operation-based CRDTs (CmRDTs): Instead of sending full states, replicas send only the operations (increments, adds, removes). The receiving replica applies the operations to its local state. The advantage is smaller message sizes. The disadvantage is that the network must guarantee causal delivery (operations arrive in causal order) and that operations are not dropped or duplicated. In practice, this requires a reliable causal broadcast protocol underneath.
The two approaches are theoretically equivalent: any state-based CRDT can be emulated by an operation-based one, and vice versa.
The choice in practice depends on your network characteristics and data structure sizes.
The Four Essential CRDT Types
G-Counter (Grow-Only Counter)
The simplest CRDT. It supports only increment operations, not decrements.
Each of N replicas maintains its own counter.
The state is a vector of N integers, one per replica.
When replica i increments, it increments only its own entry: counters[i] += 1.
The merge function takes the element-wise maximum of two vectors.
The total count is the sum of all entries.
Because each replica only increments its own entry, and the merge takes the maximum, increments are never lost and never double-counted.
If replica A has counters [5, 0, 0] and replica B has [3, 7, 0], the merge produces [5, 7, 0] and the total is 12. The 5 increments on A and the 7 increments on B are both preserved.
Use Case
Page view counters, like counts, event counters. Any metric that only goes up.
PN-Counter (Positive-Negative Counter)
A counter that supports both increment and decrement by combining two G-Counters.
The PN-Counter contains two G-Counters: one for increments (P) and one for decrements (N).
The current value is sum(P) - sum(N).
Increment adds to the P counter.
Decrement adds to the N counter. Merging merges each G-Counter independently.
Both the P and N components are G-Counters, which merge correctly.
The difference of two correctly merged sums is itself correct.
No increments or decrements are lost.
Use Case
Inventory counts (items added and removed), vote tallies (upvotes and downvotes), any bidirectional counter.
G-Set (Grow-Only Set)
A set that supports only add operations, not remove.
Each replica maintains a set of elements. Adding an element inserts it into the local set. Merging two replicas takes the union of both sets.
Once an element is added, it can never be removed.
Set union is commutative, associative, and idempotent. Adding the same element twice has no additional effect.
Use Case
Tag collections, event logs (append-only), user role assignments where roles are only granted.
OR-Set (Observed-Remove Set)
The most practically useful set CRDT. It supports both add and remove operations with intuitive semantics.
Each add operation generates a unique tag (typically a UUID or a pair of replica ID and sequence number).
The set stores elements along with their tags. Adding an element inserts it with a new unique tag. Removing an element removes all tags that the removing replica has observed for that element.
If another replica concurrently adds the same element with a new tag that the removing replica has not observed, the add wins.
The element stays in the set.
The add-wins semantics resolve the fundamental conflict: if one replica adds an element while another concurrently removes it, the add wins because the remove only affects tags it has seen, not future tags.
This matches the intuition that "add should win over concurrent remove" because the remover did not know about the new add.
Use Case
Shopping carts (items added and removed), collaborative document element management, any set where both add and remove operations are needed.
Amazon's Dynamo-style systems use OR-Set-like structures for shopping cart replication, which is why two users adding items to the same cart concurrently never lose either addition.
LWW-Register (Last-Writer-Wins Register)
The simplest register CRDT. It stores a single value with a timestamp.
Each write attaches a timestamp (physical or logical) to the value.
The merge function compares timestamps and keeps the value with the higher timestamp.
If timestamps are equal, a tiebreaker (like replica ID) determines the winner.
The maximum function over timestamps is commutative, associative, and idempotent. Two replicas that have seen the same set of writes will always pick the same winning value.
Use Case
User profile fields (display name, email), session state, configuration values. Any single-valued field where "most recent write wins" is acceptable semantics. The trade-off is that concurrent writes to the same register lose one value. If two users concurrently update a display name, one name is discarded. For fields where this is unacceptable, an MV-Register (multi-value register) preserves all concurrent values for application-level resolution.
Delta-Based CRDTs: The Practical Middle Ground
Pure state-based CRDTs send the entire state on every sync, which is wasteful when only a small part of the state changed.
Delta-based CRDTs solve this by sending only the delta (the difference since the last sync).
A delta-based CRDT tracks which part of the state changed since the last synchronization with each peer.
Instead of sending the full vector of counters, it sends only the counters that changed.
The receiving replica merges the delta into its state using the same merge function.
The delta is itself a valid CRDT state, so the merge properties (commutative, associative, idempotent) still hold.
This is the approach used by most production CRDT implementations because it combines the simplicity of state-based CRDTs (no causal delivery requirement) with the bandwidth efficiency of operation-based CRDTs (small messages).
Riak's dotted version vectors and Redis Enterprise's CRDT implementation both use delta-based synchronization under the hood.
Where CRDTs Are Used in Production
- Redis: Redis Enterprise provides CRDT-based data types for active-active geo-distributed deployments. Counters, sets, and strings use CRDT merge semantics so that writes to any region are automatically reconciled without conflicts.
- Riak: Riak (before its discontinuation) provided built-in CRDT data types: counters, sets, maps, and flags. These were exposed as first-class API operations, allowing developers to use CRDTs without implementing the merge logic themselves.
- Apple Notes: Apple uses CRDTs in the Notes app to synchronize offline edits between iPhone, iPad, and Mac. When a user edits a note on their phone while the phone is in airplane mode, the edits are merged with any changes made on other devices when the phone reconnects.
- Figma: Figma's real-time collaborative design tool uses CRDT-inspired techniques (with some server coordination) to allow multiple designers to edit the same canvas simultaneously without conflicts.
- SoundCloud: SoundCloud uses CRDTs for distributed data management in their audio distribution platform, handling concurrent updates to metadata across replicas.
- Azure Cosmos DB: Cosmos DB offers multi-region write capabilities with conflict resolution policies that include CRDT-like automatic merge for certain data types.
For understanding how database architectures like these handle distributed data, including the consistency trade-offs they make, Beyond Relational: Understanding Vector Databases and NewSQL Architectures covers modern database paradigms.
Trade-Offs and Limitations
CRDTs are not a universal solution.
They are powerful for specific use cases but come with significant constraints.
- Limited data types: CRDTs work well for counters, sets, registers, and flags. They are harder to design for complex data structures like ordered lists (needed for collaborative text editing) and graphs. Sequence CRDTs for text editing (like RGA, WOOT, and Yjs) exist but are complex to implement and have non-trivial performance characteristics. Not every data model can be expressed as a CRDT.
- Metadata overhead: CRDTs carry metadata (unique tags, version vectors, tombstones) that grows over time. An OR-Set that has had millions of add and remove operations accumulates tombstones (markers for removed elements) that consume memory and storage. Production systems need garbage collection mechanisms to periodically compact this metadata, which itself requires some form of coordination.
- No support for invariants: CRDTs cannot enforce global invariants. A PN-Counter can go negative (total decrements exceed total increments across replicas) because each replica operates independently. If your business logic requires that a counter never goes below zero (like an inventory count), CRDTs alone cannot enforce this. You need additional coordination for invariant-preserving operations.
- Read-your-writes is not guaranteed: Because replicas converge asynchronously, a client that writes to one replica and reads from another might not see its own write. This can be confusing for users. Production systems mitigate this by routing reads and writes from the same client to the same replica (sticky sessions), but this adds complexity. For understanding the broader consistency spectrum and how to choose between models, How to Choose Between Strong, Eventual, and Causal Consistency (With Tunable Quorums) covers the decision framework.
- Semantic mismatch: Some operations do not have natural CRDT semantics. Consider a bank account balance: two concurrent withdrawals of 50 each from a 75 account should be rejected (overdraft), but a PN-Counter CRDT would allow both, resulting in a negative balance. For operations that require coordination (checking a constraint before proceeding), consensus is more appropriate than CRDTs.
CRDTs vs Other Approaches
CRDTs vs Consensus (Paxos/Raft)
Consensus provides strong consistency: all replicas agree on the same value before a write is confirmed.
CRDTs provide strong eventual consistency: replicas may diverge temporarily but are guaranteed to converge.
Consensus blocks during partitions.
CRDTs remain available.
Use consensus for data that must be consistent at all times (financial transactions, leader election).
Use CRDTs for data that can tolerate temporary divergence (counters, collaborative editing, shopping carts).
CRDTs vs Last-Writer-Wins (LWW)
LWW is the simplest conflict resolution: attach a timestamp to every write, and the write with the highest timestamp wins. It is simple but lossy.
If two users concurrently edit different fields of the same document, LWW discards one user's changes entirely. CRDTs preserve all concurrent changes and merge them.
Use LWW only when losing concurrent updates is acceptable (user profile "last seen" timestamps, cache entries).
CRDTs vs Operational Transformation (OT)
OT is the technique used by Google Docs for real-time collaborative editing.
OT requires a central server to transform operations against each other.
CRDTs do not require a central server and work in peer-to-peer networks.
However, OT has a longer track record for text editing specifically, and CRDTs for text editing (sequence CRDTs) are still an active area of research and engineering.
For background workers and asynchronous processing where consistency trade-offs arise frequently, Strong vs. Eventual Consistency: Designing Background Workers for High Traffic covers the practical patterns.
How This Shows Up in System Design Interviews
CRDTs come up when interviewers ask about multi-region writes, conflict resolution, or how collaborative applications work.
You do not need to derive the semilattice proofs, but you should understand the concept, the trade-offs, and when CRDTs are appropriate.
Here is how to present it:
"For the collaborative shopping list feature, users need to add and remove items from any device, even while offline. I would model the shopping list as an OR-Set CRDT. Each device maintains a local replica. When the device reconnects, it merges its state with the server using the OR-Set merge function: adds are tagged with unique identifiers, removes only affect observed tags. This guarantees that concurrent adds on different devices are both preserved, and an add concurrent with a remove results in the item staying in the list. The trade-off is that the OR-Set carries metadata (tags and tombstones) that grows over time. I would implement periodic garbage collection during sync to compact the metadata. For the payment processing path, I would not use CRDTs because account balances require invariant checking (no overdraft), which CRDTs cannot enforce."
That answer covers why CRDTs fit the use case (offline-capable, multi-device), the specific CRDT type (OR-Set), how the merge works (unique tags, add-wins semantics), the metadata trade-off (tombstones, garbage collection), and where CRDTs are not appropriate (payment processing with invariants).
Common Mistakes
-
Using CRDTs for everything: CRDTs are excellent for counters, sets, and collaborative editing. They are a poor fit for data that requires global invariants (inventory that cannot go negative, bank balances that cannot overdraft, unique username constraints). Recognize when consensus is the right tool.
-
Ignoring tombstone growth: Every remove operation in an OR-Set creates a tombstone. Over time, tombstones can consume more space than the actual data. Without garbage collection, CRDT state grows unboundedly. Plan for periodic compaction.
-
Confusing CRDTs with eventual consistency: CRDTs provide strong eventual consistency, which is a stronger guarantee than plain eventual consistency. With plain EC, replicas might converge "eventually" through ad-hoc conflict resolution that may require rollbacks. With CRDTs (SEC), replicas converge deterministically as soon as they have received the same updates, with no rollbacks and no conflicts.
-
Forgetting about causal delivery for operation-based CRDTs: Operation-based CRDTs require that operations are delivered in causal order and exactly once. If your messaging layer drops, duplicates, or reorders messages, the CRDT guarantees break. State-based CRDTs are more forgiving because they send full state, which is inherently idempotent.
-
Not specifying the CRDT type in interviews: Saying "I would use a CRDT" without specifying which type (G-Counter, OR-Set, LWW-Register) is like saying "I would use a database" without specifying which one. The choice of CRDT depends on the operations your data needs to support.
-
Assuming CRDTs eliminate all coordination: CRDTs eliminate coordination for writes, but garbage collection, schema changes, and some administrative operations still require coordination. CRDTs reduce coordination, they do not eliminate it entirely. Even in a fully CRDT-based system, you will likely need some form of traditional coordination for operations like removing a replica from the cluster, compacting tombstones, or migrating to a new CRDT schema. The promise of CRDTs is not "zero coordination" but "no coordination on the hot path of reads and writes."
Conclusion: Key Takeaways
-
CRDTs are data structures whose merge operation is mathematically guaranteed to be conflict-free. The merge is commutative, associative, and idempotent, ensuring convergence regardless of update order.
-
State-based CRDTs send full state and merge via join semilattice. Operation-based CRDTs send operations and require causal delivery. Both approaches are equivalent in theory; the choice depends on network characteristics.
-
Four essential types cover most use cases. G-Counter (grow-only counter), PN-Counter (bidirectional counter), G-Set (grow-only set), OR-Set (add and remove with add-wins semantics).
-
CRDTs provide strong eventual consistency. Replicas that have received the same updates are guaranteed to be in the same state, immediately, with no rollbacks.
-
CRDTs cannot enforce global invariants. Use them for data that can merge independently (counters, sets, collaborative edits). Use consensus for data that requires constraints (balances, unique identifiers).
-
In interviews, name the specific CRDT type and explain its merge semantics. Saying "I would use an OR-Set CRDT with add-wins semantics for the shopping cart" is far stronger than "I would use eventual consistency."
What our users say
Ashley Pean
Check out Grokking the Coding Interview. Instead of trying out random Algos, they break down the patterns you need to solve them. Helps immensely with retention!
Steven Zhang
Just wanted to say thanks for your Grokking the system design interview resource (https://lnkd.in/g4Wii9r7) - it helped me immensely when I was interviewing from Tableau (very little system design exp) and helped me land 18 FAANG+ jobs!
MO JAFRI
The courses which have "grokking" before them, are exceptionally well put together! These courses magically condense 3 years of CS in short bite-size courses and lectures (I have tried System Design, OODI, and Coding patterns). The Grokking courses are godsent, to be honest.
Access to 50+ courses
New content added monthly
Certificate of completion
$29.08
/month
Billed Annually
Recommended Course

Grokking the System Design Interview
169,039+ students
4.7
Grokking the System Design Interview is a comprehensive course for system design interview. It provides a step-by-step guide to answering system design questions.
View CourseRead More
Demystifying Big-O Notation: The Ultimate Guide to Understanding Algorithm Complexity
Arslan Ahmad
Meta Product Design Interview: A Comprehensive Guide for Software Engineers
Arslan Ahmad
Things to Do One Day Before Your Coding Interview
Arslan Ahmad
Grokking Messaging Patterns: Queues, Pub/Sub, and Event Streams
Arslan Ahmad