This explains everything — the CAP theorem

Several years ago, the following question was posed: “What is your favorite deep, elegant, or beautiful explanation?” The book “This Explains Everything” compiled 150 answers from renowned authors and thinkers in physics, biology, economics, and computer science. This essay is in the same vein.

Accurate, ubiquitously available facts in a world disjointed by strife and confusion: that is a holy grail of the scientific community, if not of all humanity. If we could only make verifiably true facts available to everyone, all the time, wouldn’t that dispel disinformation, calm confusion, and empower everyone? At the very least, we should be able to come to terms with all but the most incorrigible of truth-deniers. Anyone not completely unplugged from reality ought to be able to start thinking and reasoning from the same set of facts as everyone else in the world.

Unfortunately, such a lofty goal is impossible in theory, let alone in practice.

The CAP theorem tells us that any distributed information system can guarantee at most two of the following three properties: data consistency (every read reflects the most recent write), data availability (every request to a working node receives a response), and network partition tolerance (the system keeps functioning despite partial network failure). Even in theory, we cannot have a robust, fault-tolerant, distributed system that can guarantee consistent data everywhere, all the time. (The formal, mathematical proof of the theorem defines all three attributes, C, A, and P, more precisely. However, this simplified version is sufficient for our purposes and approximately correct.)

In practice, network partitioning is unavoidable: there is no way to guarantee 100% uptime for all connections over arbitrarily long periods of time throughout a network. Thus, we are forced to design our systems to be partition tolerant. Imagine your typical automated teller machine: it simply wouldn’t be acceptable to anyone if every ATM stopped working any time some bank branch somewhere experienced a network outage. We have to design systems that can tolerate and work in a world with unreliable networks. This means that whenever the network is partitioned, we must choose between data consistency and data availability. We cannot have both at the same time.
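The forced choice can be made concrete with a toy model. The following is only a minimal sketch under stated assumptions, not how any real bank or database works: two ATMs hold replicas of one account balance, and every name in it (`Atm`, `Unavailable`, `run_scenario`, and the "CP" and "AP" mode labels) is hypothetical and invented for illustration. A consistency-first ("CP") ATM refuses service when it cannot reach its peer, while an availability-first ("AP") ATM keeps serving and risks showing a stale balance.

```python
class Unavailable(Exception):
    """Raised when an ATM refuses a request rather than risk inconsistency."""


class Atm:
    """One replica of a shared account balance (hypothetical toy model)."""

    def __init__(self, name, mode, balance=100):
        self.name = name
        self.mode = mode          # "CP": prefer consistency; "AP": prefer availability
        self.balance = balance
        self.peer = None          # the other replica
        self.partitioned = False  # True while the link to the peer is down

    def _check_reachable(self):
        # A consistency-first replica will not act on data it cannot confirm.
        if self.partitioned and self.mode == "CP":
            raise Unavailable(f"ATM {self.name} cannot reach its peer")

    def withdraw(self, amount):
        self._check_reachable()
        self.balance -= amount
        if not self.partitioned:
            self.peer.balance = self.balance  # replicate while the link is up

    def read(self):
        self._check_reachable()
        return self.balance


def run_scenario(mode, partitioned):
    """Withdraw at ATM A, then read at ATM B, and report what B's customer sees."""
    a, b = Atm("A", mode), Atm("B", mode)
    a.peer, b.peer = b, a
    a.partitioned = b.partitioned = partitioned
    try:
        a.withdraw(30)
        seen_at_b = b.read()
    except Unavailable:
        return "unavailable"
    return "consistent" if seen_at_b == a.balance else "stale"


if __name__ == "__main__":
    for mode in ("CP", "AP"):
        for partitioned in (False, True):
            state = "partitioned" if partitioned else "healthy"
            print(mode, state, "->", run_scenario(mode, partitioned))
```

With a healthy network, both modes report a consistent balance. Once the link is cut, the "CP" pair becomes unavailable and the "AP" pair returns a stale balance at the second ATM. Neither mode delivers both properties during the partition, which is the trade-off described above.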

As humanity makes immense strides, we are seduced into believing that there is nothing that technological innovation cannot solve. We are frustrated by human frailty, indifference, ignorance, and apathy. If we could only live up to the ideals of science, nothing would be beyond our grasp! It is at such moments of unbridled inspiration that we need sobering laws that remind us of the theoretical limits of optimism.

The second law of thermodynamics injects sobriety into would-be dreamers of perpetual motion machines. Gödel’s incompleteness theorems dampen the exuberance of logicians who wish to create perfect theories from a finite set of axioms. And the CAP theorem trims the whims of those who desire perfect distributed information systems.

We are all Icarus. Let us soar, but with humility and caution, not with arrogance or heedlessness.