Category Archives: NoSQL

CAP Theorem – Consistency Availability and partition Tolerance

  • Consistency – It means that data is the same across the cluster, so you can read or write to/from any node and get the same data.
  • Availability – It means the ability to access the cluster even if a node in the cluster goes down.
  • Partition Tolerance – It means that the cluster continues to function even if there is a “partition” (communications break) between two nodes (both nodes are up, but can’t communicate).

In order to get both availability and partition tolerance, you have to give up consistency. Consider if you have two nodes, X and Y, in a master-master setup. Now, there is a break between network communication in X and Y, so they can’t sync updates. At this point you can either:

A) Allow the nodes to get out of sync (giving up consistency), or

B) Consider the cluster to be “down” (giving up availability)

All the combinations available are:

  • CA – data is consistent between all nodes – as long as all nodes are online – and you can read/write from any node and be sure that the data is the same, but if you ever develop a partition between nodes, the data will be out of sync (and won’t re-sync once the partition is resolved). In practice however, there are of course CA systems, e.g. single-node RDBMS. Or even master-slave/master-master replicated RDBMS, provided there’s a central router knowing which nodes live and directing client appropriately. Such a router is then a single poin
  • CP – data is consistent between all nodes, and maintains partition tolerance (preventing data desync) by becoming unavailable when a node goes down.
  • AP – nodes remain online even if they can’t communicate with each other and will re-sync data once the partition is resolved, but you aren’t guaranteed that all nodes will have the same data (either during or after the partition).

CAP triangle can be visualized like shown below

CAPTriangle

CAP – CA, CP and AP in summary can be visualized like shown below, all below doagrams thanks to Slide share presentations

CAP-AP-CP-CA-Combinations

Choosing AP

CAP-AP

Choosing CP

CAP-CP

 

Every one is aware of ACID

  • Atomic: Everything in a transaction succeeds or the entire transaction is rolled back.
  • Consistent: A transaction cannot leave the database in an inconsistent state.
  • Isolated: Transactions cannot interfere with each other.
  • Durable: Completed transactions persist, even when servers restart etc.

ACID transactions are far more pessimistic (i.e., they’re more worried about data safety) than the domain actually requires.

In the NoSQL world, ACID transactions are less fashionable as some databases have loosened the requirements for immediate consistency, data freshness and accuracy in order to gain other benefits, like scale and resilience. THta’s where BASE acronym comes into picture

Basic Availability

    • The database appears to work most of the time.

Soft-state

    • Stores don’t have to be write-consistent, nor do different replicas have to be mutually consistent all the time.

Eventual consistency

    • Stores exhibit consistency at some later point (e.g., lazily at read time).

 

 

Advertisements