0

I am trying to debug a slow query. I have an index set up as follows:

db.myCollection.getIndices()
...
{
  "key" : {
    "field1" : 1,
    "field2" : 1
  },
  "background" : true
}
...

Here's one of the super slow queries:

{
  "op" : "update",
  "query" : {
    "field1" : "some_f1_value",
    "field2" : {
      "$exists" : 1
    }
  },
  "millis" : 59302,
  "ts" : ISODate("2016-11-09T19:43:41.754Z")
}

Running explain(), I get the following:

db.myCollection.find({ "field1" : "some_f1_value", "field2" : { "$exists" : 1 } }).explain()

    "parsedQuery" : {
      "$and" : [
        {
          "field1" : {
            "$eq" : "some_f1_value"
          }
        },
        {
          "field2" : {
            "$exists" : true
          }
        }
      ]
    },
    "winningPlan" : {
      "stage" : "KEEP_MUTATIONS",
      "inputStage" : {
        "stage" : "FETCH",
        "filter" : {
          "field2" : {
            "$exists" : true
          }
        },
        "inputStage" : {
          "stage" : "IXSCAN",
          "keyPattern" : {
            "field1" : 1
          },
          "indexName" : "field1_1",
          "isMultiKey" : false,
          "direction" : "forward",
          "indexBounds" : {
            "field1" : [
              "[\"some_f1_value\", \"some_f1_value\"]"
            ]
          }
        }
      }
    }

Seems like it is not using the index on field1 and field2 together, but rather another index on only field1, and then doing a scan to find the matching field2. Is there an explanation for this, and could this be why the query is slow?

Edit: Ok, very interesting. I tested three queries:

1) db.myCollection.find({ "field1" : "val1", "field2":{$exists:1}}).explain() -- this is slow, as described above, and only uses the field1 index

2) db.myCollection.find({ "field1" : "val1", "field2":{$exists:1}}).hint({field1:1, field2:1}).explain() -- this is fast, and properly uses the field1+field2 index

3) db.myCollection.find({ "field1" : "val1", "field2":"val2"}).explain() -- this is also fast, and properly uses the field1+field2 index

I should clarify that "val1" was actually different for each query (so as to not use the OS cache that mongo uses) but each query returns about the same number of docs.

So it seems like Mongo just doesn't use the index for $exists queries. There's some discussion about that here:

Can MongoDB use an index when checking for existence of a field with $exists operator?

and it seems like people think this was fixed as of 2.4. Seems to not be the case though, as I'm using 3.0.6.

Community
  • 1
  • 1
maxko87
  • 2,892
  • 4
  • 28
  • 43
  • You can try adding a sparse/partial index for field 2 to see if that helps. https://docs.mongodb.com/manual/core/index-sparse/ – dyouberg Nov 09 '16 at 21:28
  • Thanks for the comment. Why do you think that'll help? Does the index being spared change when it is used by the query planner? I thought making it sparse was just for more efficient storage. – maxko87 Nov 09 '16 at 21:50
  • Making the index sparse shouldn't have an impact; sparse compound indexes will have entries as long as the document contains one of the keys (https://docs.mongodb.com/v3.2/core/index-sparse/#sparse-compound-indexes). I.e. Making the index sparse won't remove entries where "field1" matches and "field2" doesn't exist. --- What is the cardinality of docs where field2 exists vs. where it doesn't exist? Have you tried hinting the compound index and comparing the output and performance? – Adam Harrison Nov 09 '16 at 22:57
  • @AdamHarrison, field1 is in all documents and field2 is in relatively few. What do you mean by hitting the compound index? Using `$hint`? – maxko87 Nov 09 '16 at 23:10
  • @maxk087 yes, use $hint to tell the query planner to use the compound index. – Adam Harrison Nov 09 '16 at 23:16
  • @AdamHarrison thanks. See my edit to the question above. – maxko87 Nov 10 '16 at 00:57

0 Answers0