Understand the fundamental trade-off in distributed systems: Consistency, Availability, and Partition Tolerance.
The CAP Theorem, also known as Brewer's Theorem, is a fundamental principle in distributed systems design. It states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees: Consistency, Availability, and Partition Tolerance. Let's define these terms. Consistency means that every read operation receives the most recent write or an error. All nodes in the system see the same data at the same time. Availability means that every request receives a (non-error) response, without the guarantee that it contains the most recent write. The system is always up and running to respond to requests. Partition Tolerance means that the system continues to operate even if there is a 'partition' in the network, i.e., messages being lost or delayed between nodes. Since network partitions are a fact of life in distributed systems, partition tolerance (P) is a property that must be supported. Therefore, the CAP theorem forces a trade-off between Consistency (C) and Availability (A). A system can choose to be CP: it remains consistent even during a network partition, but it does so by becoming unavailable (e.g., refusing read/write requests to the partitioned node). Or a system can choose to be AP: it remains available during a partition, but some nodes might return older, 'stale' data until the partition is resolved, thus sacrificing strong consistency. Traditional relational databases typically choose CP, while many NoSQL databases are designed to be AP, prioritizing availability and scalability.