Tradeoffs In CAP Theorem

Authors
  • Skim

Imagine you're running an online marketplace where customers place orders and vendors update product listings. In a distributed system, you need to juggle three properties:

  1. Consistency: If a shopper checks whether a product is in stock, they get the same answer regardless of which server handles the request. Once stock is updated, that change is reflected everywhere.
  2. Availability: Every request gets a response, even if some servers are down. Shoppers can keep browsing and buying even when a few servers are failing.
  3. Partition tolerance: Network failures are inevitable. The system keeps working even when servers can't talk to each other.

The Trade-off: Choosing Two out of Three

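The CAP theorem says you can't fully guarantee all three at once. You have to pick two and accept weaker guarantees on the third. Because network partitions can't be ruled out in practice, the decision usually comes down to what to give up while a partition lasts: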

Consistency and Availability (CA)

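In scenarios like financial transactions, data consistency and uninterrupted availability are top priorities. A CA system delivers both as long as the network is healthy, but it makes no promise about behavior during a partition. If one does occur, such a system typically gives up some availability to keep data correct.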

Consistency and Partition Tolerance (CP)

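When data integrity is critical, this combination keeps data consistent even during network disruptions, but availability might suffer: nodes that can't confirm they hold the latest data reject or delay requests until the partition heals.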
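
As a rough illustration of the CP choice, here is a minimal sketch of a replica that refuses writes when it cannot reach a majority of its cluster. The `CPReplica` class, the `Unavailable` exception, and the `reachable_peers` counter are hypothetical names invented for this example, and the sketch models only the accept-or-reject decision, not real replication.

```python
class Unavailable(Exception):
    """Raised when a replica cannot reach a majority of the cluster."""


class CPReplica:
    def __init__(self, name, cluster_size):
        self.name = name
        self.cluster_size = cluster_size
        # Assumption: every peer is reachable when the replica starts.
        self.reachable_peers = cluster_size - 1
        self.stock = {}

    def has_majority(self):
        # Count this replica plus the peers it can currently reach.
        return (1 + self.reachable_peers) > self.cluster_size // 2

    def write(self, product, quantity):
        # Prefer consistency: reject the write rather than risk divergence.
        if not self.has_majority():
            raise Unavailable(f"{self.name}: no majority, write rejected")
        self.stock[product] = quantity
        return "ok"


def demo_cp():
    replica = CPReplica("replica-a", cluster_size=3)
    results = [replica.write("widget", 10)]
    replica.reachable_peers = 0  # simulate a partition isolating this replica
    try:
        replica.write("widget", 9)
    except Unavailable:
        results.append("rejected")
    return results, replica.stock
```

Calling `demo_cp()` shows the trade-off in miniature: the first write succeeds, the second is rejected during the simulated partition, and the stored stock level never diverges from what a majority agreed on.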

Availability and Partition Tolerance (AP)

Systems that must stay responsive during network partitions accept eventual consistency: data might be temporarily inconsistent across nodes but converges once the partition heals and updates propagate.
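
The following sketch illustrates the AP choice under a simplifying assumption: replicas reconcile with a last-write-wins rule keyed on a logical timestamp. That rule is only one possible merge strategy (it silently discards the older of two concurrent updates), and `APReplica`, `merge_from`, and the timestamps are hypothetical names chosen for the example.

```python
class APReplica:
    def __init__(self, name):
        self.name = name
        self.stock = {}  # product -> (logical_timestamp, quantity)

    def write(self, product, quantity, timestamp):
        # Always accept the write locally, even if peers are unreachable.
        self.stock[product] = (timestamp, quantity)

    def read(self, product):
        return self.stock[product][1]

    def merge_from(self, other):
        # Last-write-wins: keep whichever entry has the higher timestamp.
        for product, entry in other.stock.items():
            if product not in self.stock or entry[0] > self.stock[product][0]:
                self.stock[product] = entry


def demo_ap():
    a, b = APReplica("a"), APReplica("b")
    # Partition: each replica accepts a write the other cannot see.
    a.write("widget", 10, timestamp=1)
    b.write("widget", 7, timestamp=2)
    during = (a.read("widget"), b.read("widget"))
    # Partition heals: replicas exchange state in both directions.
    a.merge_from(b)
    b.merge_from(a)
    after = (a.read("widget"), b.read("widget"))
    return during, after
```

During the simulated partition the two replicas report different stock levels; after they exchange state, both report the value from the later write.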

Real-world Applications

CA: AWS Relational Database Service (RDS)

AWS RDS is a managed database service supporting MySQL, PostgreSQL, and others. Reads from the primary instance return the most recent committed data, and RDS handles failovers and backups automatically to minimize downtime. During a network partition, RDS leans toward consistency over immediate availability: the database can be briefly unreachable while a failover completes.

CP: Google Cloud Spanner

Spanner is a globally distributed database that uses TrueTime, a clock API with bounded uncertainty, to assign consistent timestamps to transactions across data centers. This gives it strong global consistency even during network partitions, at the cost of some availability. That trade-off makes it a good fit for financial systems and other applications that need precise data synchronization.

AP: Apache Cassandra

Cassandra is a NoSQL database built around eventual consistency. Writes can be accepted while some replicas are unreachable and propagate to the remaining nodes over time, so the system stays available and responsive during network disruptions. This makes it a common choice for IoT data storage, time series data, and content delivery networks, use cases where uptime matters more than immediate consistency.
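
Cassandra also lets each query choose how many replicas must respond (consistency levels such as ONE, QUORUM, and ALL), which moves an individual operation along the consistency/availability spectrum. The sketch below shows the usual overlap arithmetic behind that choice: if the number of replicas read plus the number written exceeds the replication factor, every read set shares at least one replica with every write set. The function names are hypothetical and this is plain arithmetic, not a call into any Cassandra driver.

```python
def quorum(replication_factor):
    # A quorum is a strict majority of the replicas.
    return replication_factor // 2 + 1


def reads_overlap_writes(replication_factor, write_replicas, read_replicas):
    # True when any set of read replicas must share at least one member
    # with any set of write replicas (R + W > N).
    return read_replicas + write_replicas > replication_factor
```

With a replication factor of 3, reading and writing at quorum (2 replicas each) satisfies the overlap condition, while reading and writing a single replica each does not, which is the configuration that favors availability over immediate consistency.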