CAP

1. Basics

Distributed Systems: The CAP theorem applies to systems that run on multiple computers connected over a network.

2. The Tradeoff

You can only guarantee two out of three properties:

  • Consistency (C): Every read sees the most recent write, so everyone sees the same data at the same time.
  • Availability (A): The system is always up and running (responds to users at all times), even with some failures.
  • Partition Tolerance (P): The system keeps working even if network problems split it into parts.

Network problems are unavoidable, so partition tolerance is a must.

3. The Choice

The system has to choose between consistency and availability when network problems happen:

  • CP Systems: Prioritize consistency; may become unavailable during a partition.
  • AP Systems: Prioritize availability; may show outdated data during a partition.
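
The CP/AP choice can be sketched as a toy replica that either refuses requests or serves possibly stale data once it loses contact with a majority of the cluster. This is only an illustration under simple assumptions: the `Replica` class, the `PartitionedError` exception, and the `reachable_peers` counter are hypothetical names, and a real system would detect partitions through timeouts and heartbeats rather than a manually set field.

```python
class PartitionedError(Exception):
    """Raised by a CP replica that cannot reach a majority of the cluster."""


class Replica:
    def __init__(self, mode, cluster_size):
        self.mode = mode                          # "CP" or "AP" (hypothetical flag)
        self.cluster_size = cluster_size          # total nodes, including this one
        self.reachable_peers = cluster_size - 1   # assume no partition at start
        self.value = None

    def has_majority(self):
        # Count this node plus every peer it can still reach.
        return (self.reachable_peers + 1) > self.cluster_size // 2

    def write(self, value):
        if self.mode == "CP" and not self.has_majority():
            raise PartitionedError("no majority: refusing write")
        self.value = value

    def read(self):
        if self.mode == "CP" and not self.has_majority():
            raise PartitionedError("no majority: refusing read")
        # An AP replica always answers, but the value may be outdated.
        return self.value


cp = Replica("CP", cluster_size=3)
ap = Replica("AP", cluster_size=3)
cp.write("v1")
ap.write("v1")

# Simulate a partition that isolates both replicas from their peers.
cp.reachable_peers = 0
ap.reachable_peers = 0

ap_result = ap.read()          # still answers, possibly with stale data
try:
    cp.read()
    cp_refused = False
except PartitionedError:
    cp_refused = True          # CP replica gives up availability
```

The CP replica stays correct by refusing to answer, while the AP replica stays responsive at the risk of returning a value that the other side of the partition has already overwritten.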

4. Post-Partition Reconciliation

4.0.1. Post-Partition Reconciliation

  • This refers to the processes that occur after a network partition has been resolved, where the data may have diverged across different nodes.
  • The goal is to restore consistency across the system while maintaining high availability.

4.0.2. Key Concepts in Post-Partition Reconciliation

  • Eventual Consistency: This model guarantees that, if no new updates arrive, all updates will eventually propagate through the system and all replicas will converge to the same state.
  • Conflict Resolution: Strategies to handle conflicting updates (e.g., last-write-wins, version vectors, or custom merge strategies).
  • Data Reconciliation Techniques:
    • Merging: Combining differing versions of the data.
    • Gossip Protocols: Nodes regularly exchange state information, allowing eventual convergence.
    • Quorum Reads/Writes: Requiring reads and writes to be acknowledged by overlapping subsets of nodes (R + W > N for N replicas), so that every read reaches at least one replica holding the latest write.
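
As a rough sketch of the conflict-resolution strategies listed above, version vectors can tell whether one update supersedes another or whether the two were made concurrently on different sides of a partition. In the concurrent case, this example falls back to last-write-wins. The function names and the `(value, clock, timestamp)` tuple layout are assumptions made for illustration, not a reference to any particular database.

```python
def compare(a, b):
    """Compare two version vectors (dicts mapping node id -> counter).

    Returns "equal", "before" (a happened before b), "after", or "concurrent".
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def merge_clocks(a, b):
    """Element-wise maximum of two version vectors."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def reconcile(left, right):
    """Pick the surviving version; each version is (value, clock, timestamp)."""
    order = compare(left[1], right[1])
    if order in ("after", "equal"):
        return left
    if order == "before":
        return right
    # Concurrent updates: fall back to last-write-wins on the timestamp,
    # and merge the clocks so the result supersedes both inputs.
    winner = left if left[2] >= right[2] else right
    return (winner[0], merge_clocks(left[1], right[1]), winner[2])


# Two replicas accepted different writes while partitioned.
side_a = ("cart=[book]", {"a": 1}, 10)
side_b = ("cart=[pen]", {"b": 1}, 20)
merged = reconcile(side_a, side_b)
```

Last-write-wins silently discards the losing update (here, the book), which is why some systems prefer a custom merge, such as taking the union of both carts, when concurrent versions are detected.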

5. Latency and Consistency

Tags::cs: