NoSQL: What does it mean for MongoDB or BigTable to not always be "Available"

Question

Reading Nathan Hurst's Visual Guide to NoSQL Systems, he includes the CAP triangle:

Consistency
Availibility
Partition Tolerance

enter image description here

With SQL Server being an AC system, and MongoDB being a CP system.

These definitions from come a UC Berkley professor Eric Brewer, and his talk at PODC 2000 (Principles of Distributed Computing):

Availability

Availability means just that - the service is available (to operate fully or not as above). When you buy the book you want to get a response, not some browser message about the web site being uncommunicative. Gilbert & Lynch in their proof of CAP Theorem make the good point that availability most often deserts you when you need it most - sites tend to go down at busy periods precisely because they are busy. A service that's available but not being accessed is of no benefit to anyone.

What does it mean, in the context of MongoDB, or BigTable, for the system to not be "available"?

Do you go to connect (e.g. over TCP/IP), and the server does not respond? Do you attempt execute a query, but the query never returns - or returns an error?

What does it mean to not be available?

Chris Shain · Accepted Answer · 2011-09-08T21:13:07.413

13

Availability in this case means that in the event of a network partition, the server that a client connects to may not be able to guarantee the level of consistency that the client expects (or that the system is configured to provide).

Assuming that you have 3 nodes, A, B, and C, in a hypothetical distributed system. A, B, and C are each running in their own rack of servers, with 2 switches between them:

[Node A] <- Switch #1 -> [Node B] <- Switch #2 -> [ Node C ]

Now assume that said system is set up so that it is GUARANTEED that any write will go to at least 2 nodes before it is considered committed. Now, lets assume that switch #2 gets unplugged, and some client is connected to node C:

[Node A] <- Switch #1 -> [Node B]                 [ Node C ] <-- Some client

That client will not be able to issue Consistent writes, because the distributed system is currently in a partitioned state (namely, Node C cannot contact enough other nodes to guarantee the 2-node consistency required).

I'd add to this that some NoSQL databases allow very dynamic selection of CAP attributes. Cassandra, for instance, allows clients to specify the number of servers that a write must go to before it is committed on a per-write basis. Writes going to a single server are "AP", writes going to a quorum (or all) servers are more "CA".

EDIT - from the comments below:

In MongoDB you can only have master/slave configuration within a replica set. What this means is that the choice of AP vs CP is made by the client at query time. The client can specify slaveOk, which will read from an arbitrarily selected slave (which may have stale data): mongodb.org/display/DOCS/…. If the client is not OK with stale data, don't specify slaveOk and the query will go to the master. If the client cannot reach the master, then you'll get an error. I'm not sure exactly what that error will be.

edited Sep 08 '11 at 21:13

answered Sep 07 '11 at 19:41

Chris Shain

50,833
6
93
125

It looks to me that your `A`-`B`-`C` nodes example refers to an *available* & *partition tolerant* system, that isn't necessarily *consistent* (i.e. Node `C` is available, but not `C`onsistent). What would the diagram for a system be that is *consistent*, and *partition tolerant*, but not necessarily *available*? – Ian Boyd Sep 07 '11 at 20:07
3

It is not Available for writes because a Consistent write cannot be performed. It is not Available for reads because a Consistent read *can* be performed on Nodes A & B, and this any reads on C are not necessarily going to return the current correct (Consistently committed) value. – Chris Shain Sep 07 '11 at 20:16
1

An *available & partition tolerant* system would be the same as above if writes were only required to go to one node. It would not be Consistent, because in that case, even in the event of the Partition, all nodes would be Available for writes and reads, but writes and reads would not be Consistent (you could write on Node A, and I could write a different value on Node B). – Chris Shain Sep 07 '11 at 20:19
1

Sorry, 1st comment above should read: It is not Available for reads because a Consistent *write* can be performed on Nodes A & B, and this any reads on C are not necessarily going to return the current correct (Consistently committed) value – Chris Shain Sep 07 '11 at 20:27
Are you saying that the database would not "`A`vailable" when there a failure of communication between nodes? The database would realize that not all nodes are communicating and would not allow the user to read or write from the database? How would that denial of access be presented to a user? – Ian Boyd Sep 07 '11 at 21:08
The specific reaction to that situation is specific to the system implementation. Some systems allow inconsistent reads in the AP scenario (e.g. Cassandra), others are configurable. See this great series of articles on Mongo's configurability in this regard: http://blog.mongodb.org/post/498145601/on-distributed-consistency-part-2-some-eventual – Chris Shain Sep 07 '11 at 21:45
i've read that post, and we're not quite getting to the answer of my question. From the post "*For A-class, we need to weaken consistency constraints."* What if we don't weaken consistency? What if we weaken A? What happens when MongoDB is configured for consistency over availability? What happens when i try to read a value in a MongoDB that is configured for `CP`, sacrificing `A`, and there's internal inconsistency? Using the syntax of that post, what happens when i run `R(x)`? Does it return an error? Does it sit there stalled until consistency is returned? Does it terminate my connection? – Ian Boyd Sep 08 '11 at 14:14
Using the quoted definition of availability, "*Availability means just that - the service is available (to operate fully or not as above). When you buy the book you want to get a response, not some browser message about the web site being uncommunicative.*" Well MongoDB is not a web-server, it's a database. A web-server that is not available returns `500` if the server is unavailable. What does MongoDB return when the database is not consistent? i know it doesn't return `500`, since it's not a web-server. – Ian Boyd Sep 08 '11 at 14:21
From the page i quoted and linked; one solution , "***Drop Availability** This is the flip side of the drop-partition-tolerance coin. On encountering a partition event, affected services simply wait until data is consistent and therefore remain unavailable during that time. Controlling this could get fairly complex over many nodes, with re-available nodes needing logic to handle coming back online gracefully.*" Can MongoDB/BigTable/Hypertable/Hbase/Terrastore/Scalaris/Berkley DB/MemcacheDB/Redis be configured to do this? Are they, or can they be, `CP` systems? – Ian Boyd Sep 08 '11 at 14:43
In MongoDB you can only have master/slave configuration within a replica set. What this means is that the choice of AP vs CP is made by the client at query time. The client can specify slaveOk, which will read from an arbitrarily selected slave (which may have stale data): http://www.mongodb.org/display/DOCS/Querying#Querying-slaveOk%28QueryingSecondaries%29. If the client is not OK with stale data, don't specify slaveOk and the query will go to the master. If the client cannot reach the master, then you'll get an error. I'm not sure exactly what that error will be. – Chris Shain Sep 08 '11 at 16:54
Cassandra is similar to MongoDB in terms of consistency, but the consistency is determined by the writer (by specifying the number of nodes to commit to), not the reader. In Cassandra, a query will never fail due to (in)consistency, but you may get stale data if the writer of that data specified a small number of commit nodes *and* the data has not yet propagated to other nodes. HBase on the other hand is always consistent- if the data cannot be written to enough nodes to ensure the configured replication factor, the write fails. – Chris Shain Sep 08 '11 at 17:00
"*If the client is not OK with stale data, don't specify slaveOk and the query will go to the master. If the client cannot reach the master, then you'll get an error.*" Excellent, put that in the form of an answer and we have ourselves and accepted! – Ian Boyd Sep 08 '11 at 19:41

score 6 · Answer 2 · edited Feb 06 '12 at 07:33

The CAP theorem applies to distributed computer systems. MongoDB supports two distinct forms of distributed computing: sharding for horizontal scaling and replica sets for failover/high availability. The two can be used together or independently. I think the CAP theorem applies slightly differently to the two forms:

Sharding level - MongoDB stores data on at most one authoritative shard.

Strong Consistency: A piece of data exists on at most one shard. Incorrect/stale data does not exist.
Strong Partition-tolerance: Even if network partitioned, requests never return incorrect/stale data. Shards continue working independent of other shards.
Weak Availability: Reads/writes of data on a downed shard will fail.

Replica set level - MongoDB replicates data within a shard, ensuring consistency via a single, authoritative primary node.

Strong Consistency: All reads/writes handled by the primary node.
Strong Partition-tolerance: If enough nodes become unreachable, a new primary is elected. The election process ensures there is always at most one primary node.
Weak Availability: Reads/writes will fail when no primary exists, even though the data could be accessed via secondary nodes.

The slaveOK/ReadPreference.SECONDARY option sacrifices some consistency (stale data can be read) for increased performance and availability.

NoSQL: What does it mean for MongoDB or BigTable to not always be "Available"

2 Answers2

Linked