Consistency, Availability, Partition Tolerance
The CAP theorem establishes that a distributed system cannot simultaneously achieve all three of the following guarantees: Consistency, Availability, and Partition Tolerance. When designing a distributed system, one of the first considerations is how to balance and trade off among these three properties. The theorem dictates that we can only select two of the following:
Consistency: All nodes in the system display the same data at any given time. This is ensured by updating multiple nodes before permitting further read operations.
Availability: Each request receives a definitive response, indicating either success or failure. This is accomplished by distributing copies of data across various servers.
Partition Tolerance: The system remains functional even when there is a loss of messages or partial network failures. A partition-tolerant system can handle any degree of network disruptions that do not cause the entire network to fail. To achieve this, data is replicated sufficiently across nodes and networks to maintain availability during outages or interruptions.
It is not possible to create a general-purpose data store that is always available, sequentially consistent, and resilient to any partition failures simultaneously. A system can only provide two out of these three properties. This limitation arises because maintaining consistency requires that all nodes receive updates in the same sequence and reflect the same state. However, during a network partition, updates in one partition may not reach the others before a client reads data. This could result in the client accessing outdated data after previously reading the up-to-date state. The only way to address this issue is to stop processing requests from the out-of-date partition, but doing so compromises the system's ability to remain fully available.