What is the CAP theorem?

The CAP theorem, formulated by Eric Brewer, states that a distributed data store can provide at most two of the following three guarantees at the same time:

  1. Consistency: Every read receives the most recent write or an error. In our party analogy, this is like ensuring that every guest hears the latest song being played, no matter which hall they're in.

  2. Availability: Every request (read or write) receives a non-error response, but with no guarantee that it reflects the most recent write. In the party context, this means that everyone gets into a hall and hears a song, but it might not be the latest one playing in other halls.

  3. Partition Tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes. If one of our party halls loses connection to the others, the party in that hall still goes on.

Why Is It Important?

In distributed systems (like databases spread across multiple locations or servers), network failures happen. The CAP theorem guides the design and understanding of these systems, helping to balance trade-offs between consistency and availability when partitions (network failures) occur.

Real-World Implications:

  1. CP (Consistency + Partition Tolerance): Ideal for systems where accuracy is crucial, like banking systems. If a partition happens, the system might choose to refuse transactions until consistency can be guaranteed.

  2. AP (Availability + Partition Tolerance): Great for systems where service availability is crucial, and it's okay to serve slightly stale data. Social media platforms often follow this model.

  3. CA (Consistency + Availability): This is not a realistic choice for distributed systems since partition tolerance is a must-have in any networked system. However, traditional RDBMS (Relational Database Management Systems) often fit into this category in non-distributed setups.
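
To make the CP and AP choices above concrete, here is a minimal sketch of two replicas that either refuse requests (CP) or keep serving possibly stale data (AP) while the link between them is down. The names `Replica`, `Unavailable`, `make_pair`, and `set_partition` are hypothetical and exist only for this illustration; real databases use far more involved replication protocols.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


class Unavailable(Exception):
    """Raised when a CP replica refuses a request during a partition."""


@dataclass
class Replica:
    mode: str                     # "CP" or "AP" (labels assumed for this sketch)
    value: str = "v0"
    partitioned: bool = False     # True while the link to the peer is down
    # Excluded from repr/compare because the two replicas point at each other.
    peer: Optional["Replica"] = field(default=None, repr=False, compare=False)

    def write(self, value: str) -> None:
        if self.partitioned:
            if self.mode == "CP":
                # Cannot replicate, so refuse rather than let replicas diverge.
                raise Unavailable("cannot reach peer; write refused")
            # AP: accept the write locally; the peer keeps its old value.
            self.value = value
            return
        # Healthy network: apply the write on both replicas.
        self.value = value
        if self.peer is not None:
            self.peer.value = value

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # Cannot confirm this is the latest value, so return an error.
            raise Unavailable("cannot confirm latest value; read refused")
        return self.value


def make_pair(mode: str) -> Tuple[Replica, Replica]:
    """Create two replicas of the same mode that replicate to each other."""
    a, b = Replica(mode), Replica(mode)
    a.peer, b.peer = b, a
    return a, b


def set_partition(a: Replica, b: Replica, down: bool) -> None:
    """Simulate the network link between the two replicas failing or healing."""
    a.partitioned = b.partitioned = down


if __name__ == "__main__":
    ap_a, ap_b = make_pair("AP")
    set_partition(ap_a, ap_b, True)
    ap_a.write("v1")
    print(ap_b.read())  # prints "v0": available, but stale
```

In this sketch the AP pair always answers but the two replicas can disagree during the partition, while the CP pair raises `Unavailable` instead of returning a value it cannot vouch for.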

CAP in the Real World:

In practice, the trade-off only has to be made while a partition is in progress. When the network is healthy, a distributed system can provide both consistency and availability; when a partition occurs, it must give one of them up. Many modern systems, for instance, temporarily sacrifice consistency to stay available and then reconcile their replicas once the partition heals.
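
As a rough illustration of what "temporarily sacrificing consistency" can involve, the sketch below reconciles two replicas that diverged during a partition. It assumes a last-write-wins rule based on a hypothetical logical `version` counter; this is only one possible strategy, chosen here for brevity, and it can silently discard one of two concurrent writes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Versioned:
    value: str
    version: int  # hypothetical logical counter; higher means newer


def merge_last_write_wins(a: Versioned, b: Versioned) -> Versioned:
    """Pick the copy with the higher version; on a tie, keep `a`."""
    return a if a.version >= b.version else b


def heal(a: Versioned, b: Versioned) -> Tuple[Versioned, Versioned]:
    """After the partition heals, bring both replicas to the winning copy."""
    winner = merge_last_write_wins(a, b)
    return (Versioned(winner.value, winner.version),
            Versioned(winner.value, winner.version))


if __name__ == "__main__":
    # Replica A missed a write that replica B accepted during the partition.
    replica_a = Versioned("old song", 1)
    replica_b = Versioned("new song", 2)
    replica_a, replica_b = heal(replica_a, replica_b)
    print(replica_a == replica_b)  # prints True: the replicas agree again
```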

The CAP theorem is a key principle in understanding the limitations and design choices of distributed systems. It reminds us that in the world of distributed data, trade-offs are inevitable, and perfect solutions don't always exist.

Want to read more on the CAP theorem? See: CAP vs. PACELC

CONTRIBUTOR
Design Gurus Team