CAP Theorem

1. Basics

Distributed Systems: The CAP theorem applies to systems that run on multiple computers connected over a network.

Such a system can guarantee at most two of the following three properties at the same time:

Consistency (C): Every read reflects the most recent write, so everyone sees the same data at the same time.
Availability (A): The system responds to users at all times, even when some nodes have failed.
Partition Tolerance (P): The system keeps working even if network problems split it into parts that cannot communicate with each other.

Network problems are unavoidable, so partition tolerance is a must.

The real choice is therefore between consistency and availability when network problems happen:

CP Systems: Prioritize consistency; they may become unavailable during a partition.
AP Systems: Prioritize availability; they may return outdated data during a partition.
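
The CP/AP trade-off can be illustrated with a minimal sketch. It is a toy model, not how any particular database works: the `Replica` class, its `mode` and `partitioned` fields, and the `Unavailable` exception are all hypothetical names made up for this example.

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised by a CP replica that cannot confirm it holds the latest data."""


@dataclass
class Replica:
    mode: str                  # "CP" or "AP" (hypothetical labels for this sketch)
    value: str                 # last value this replica has seen
    partitioned: bool = False  # True when cut off from the other replicas

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # CP: refuse to answer rather than risk returning stale data.
            raise Unavailable("cannot reach the other replicas")
        # AP (or no partition): answer from the local copy, which may be stale.
        return self.value


cp = Replica(mode="CP", value="v1", partitioned=True)
ap = Replica(mode="AP", value="v1", partitioned=True)

try:
    cp_result = cp.read()
except Unavailable:
    cp_result = "unavailable"

ap_result = ap.read()
print(cp_result, ap_result)
```

During the simulated partition the CP replica gives up availability, while the AP replica answers with whatever it has locally, which may be out of date.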

Partition recovery refers to the processes that run after a network partition has been resolved, when the data may have diverged across different nodes.
The goal is to restore consistency across the system while maintaining high availability.

Eventual Consistency: This model ensures that, given enough time and no new updates, all updates will propagate through the system and all replicas will converge to the same state.
Conflict Resolution: Strategies to handle conflicting updates (e.g., last-write-wins, version vectors, or custom merge strategies).
Data Reconciliation Techniques:
- Merging: Combining differing versions of the data.
- Gossip Protocols: Nodes regularly exchange state information, allowing eventual convergence.
- Quorum Reads/Writes: Requiring each read and write to be acknowledged by a minimum number of nodes (for example, a majority), so that any read overlaps with the latest write.
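
Several of the ideas above can be sketched in a few lines. The following is a simplified illustration under stated assumptions, not a production design: the function names (`compare`, `merge_vectors`, `last_write_wins`, `quorums_overlap`) are hypothetical, version vectors are modeled as plain dictionaries, and last-write-wins assumes timestamps from reasonably synchronized clocks, which real systems cannot always rely on. The quorum check uses the common rule that read and write sets overlap when R + W > N.

```python
from typing import Dict, Tuple

# node id -> number of updates seen from that node
VersionVector = Dict[str, int]


def compare(a: VersionVector, b: VersionVector) -> str:
    """Return 'equal', 'a_newer', 'b_newer', or 'conflict'."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "conflict"  # concurrent updates: neither version supersedes the other
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def merge_vectors(a: VersionVector, b: VersionVector) -> VersionVector:
    """Element-wise maximum: the vector that covers both histories."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def last_write_wins(a: Tuple[float, str], b: Tuple[float, str]) -> str:
    """Each argument is (timestamp, value); keep the value with the later timestamp.

    Ties on timestamp are broken by comparing the values themselves.
    """
    return max(a, b)[1]


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True when every read quorum shares at least one node with every write quorum."""
    return r + w > n


print(compare({"A": 2, "B": 0}, {"A": 1, "B": 1}))
print(last_write_wins((10.0, "x"), (12.5, "y")))
print(quorums_overlap(n=3, r=2, w=2))
```

Version vectors only detect that two updates conflict; deciding which value survives still needs a policy such as last-write-wins or an application-specific merge.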