
Could someone explain this part of the MongoDB linearizable read concern documentation:

Linearizable read concern guarantees only apply if read operations specify a query filter that uniquely identifies a single document.

Does it mean that I have to have a unique index on the fields that appear in the query filter?

For example, consider these 4 questions:

  1. I have a collection test without a unique index on field A.
    db.test.find({A:1}).readConcern("linearizable").maxTimeMS(10000);

    Is this read linearizable, i.e. am I guaranteed not to get a stale read? If the answer is yes, does it mean that there is no reason to use linearizable read concern for reads on fields that are not covered by a unique index?

  2. I have a collection test with a unique index on field A.
    db.test.createIndex({A:1}, {unique:true}); db.test.find({A:1}).readConcern("linearizable").maxTimeMS(10000);

    Is this read linearizable, i.e. am I guaranteed not to get a stale read?

  3. I have a collection test with a unique index on field A.
    db.test.createIndex({A:1}, {unique:true}); db.test.find({A:1, B:1}).readConcern("linearizable").maxTimeMS(10000);

    Is this read linearizable, i.e. am I guaranteed not to get a stale read?

  4. I have a collection test without a unique index on field A, but the find method returns only one document.
    db.test.find({A:1}).readConcern("linearizable").maxTimeMS(10000); // returned {_id:"someId", A:1}

    Is this read linearizable, i.e. am I guaranteed not to get a stale read?

Dmitrii Zyrianov

1 Answer


Distributed database concepts can be quite hard to understand, so let's cover some background before addressing the questions.

Linearizable read concern, introduced in MongoDB v3.4, ensures that applications always read the most up-to-date data from the correct (current/legitimate) primary node. This means that during a network partition, applications will not read:

  • Stale data, i.e. data that may not reflect all writes that occurred prior to the read operation, or
  • Uncommitted data, i.e. data whose state may reflect a write that has not been acknowledged by a majority of the replica set members and thus could be rolled back

Due to the complexity of tracking data in multiple states (i.e. propagated, committed) on multiple nodes (i.e. secondaries), the guarantee of linearizable read concern only applies if the read operation uniquely identifies a single document.

Does it mean that I have to have a unique index on the fields that appear in the query filter?

Now to address your question: the query only has to return one unique document in the collection. It is not necessary for the collection to have a unique index, although using a unique index will help the query return a single document. One example is a query filter on _id: the field name _id is reserved for use as a primary key, so its value must be unique in the collection.
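As a rough illustration of the _id case, here is a minimal sketch that builds a `find` command document reading a single document by _id with linearizable read concern. The helper name `buildLinearizableFindById` is hypothetical (it is not part of MongoDB), and the collection name and _id value are simply taken from the question; the command fields themselves (`find`, `filter`, `limit`, `singleBatch`, `readConcern`, `maxTimeMS`) are those of the regular `find` database command.

```javascript
// Hypothetical helper, not a MongoDB API: builds a `find` command document
// that reads one document by _id with linearizable read concern.
function buildLinearizableFindById(collectionName, id, maxTimeMS) {
  return {
    find: collectionName,
    filter: { _id: id },                    // _id is unique, so at most one document matches
    limit: 1,
    singleBatch: true,
    readConcern: { level: "linearizable" },
    maxTimeMS: maxTimeMS                    // bound the wait if a majority is unreachable
  };
}

// Collection name and _id value assumed from the question above.
const cmd = buildLinearizableFindById("test", "someId", 10000);

// In the mongo shell, connected to the primary, this could be sent with:
//   db.runCommand(cmd)
```

Setting maxTimeMS is worth keeping in any variant of this: if a majority of the replica set is unavailable, a linearizable read could otherwise block indefinitely.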


Wan B.
  • Wan, can you answer one additional question? As far as I know, linearizable consistency works through no-op writes. If the write completes, it means that we are master and can return the answer to the client: [Replication-Internals](https://github.com/mongodb/mongo/wiki/Replication-Internals#read-concern). My additional question is: **Why does linearizable impose a restriction to a single document? Why single? If we are master, why can't we return an array of documents?** – Dmitrii Zyrianov Jan 30 '18 at 17:30
  • @DmitryZyr I've addressed this, but I'll elaborate. Your assumption of `we are master` is not always true, given that 1) the data also has to be tracked on the secondary nodes, i.e. whether it has been propagated and committed, and 2) in case of a network partition, the primary may no longer be the primary. You also need to consider the complexity of checking each document on the secondaries. For example, with only 2 documents: after one document has been confirmed as propagated and committed to the majority of the replica set, there is no guarantee that it has not been updated while the second document is being checked. – Wan B. Jan 31 '18 at 00:42