
There are some existing questions (1, 2) that discuss the MongoDB warning Query Targeting: Scanned Objects / Returned has gone above 1000; however, my question is a different case.

The schema of our document is

{
    "_id" : ObjectId("abc"),
    "key" : "key_1",
    "val" : "1",
    "created_at" : ISODate("2021-09-25T07:38:04.985Z"),
    "a_has_sent" : false,
    "b_has_sent" : false,
    "updated_at" : ISODate("2021-09-25T07:38:04.985Z")
}

The indexes on this collection are

{
    "key" : {
        "updated_at" : 1
    },
    "name" : "updated_at_1",
    "expireAfterSeconds" : 5184000,
    "background" : true
},
{
    "key" : {
        "updated_at" : 1,
        "a_has_sent" : 1,
        "b_has_sent" : 1
    },
    "name" : "updated_at_1_a_has_sent_1_b_has_sent_1",
    "background" : true
}
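
For reference, these two indexes correspond to createIndex calls roughly like the following (a sketch; the collection name collectionname is taken from the query below):

// TTL index: documents expire 5184000 seconds (60 days) after updated_at
db.collectionname.createIndex(
    { "updated_at": 1 },
    { "expireAfterSeconds": 5184000, "background": true }
)

// Compound index on updated_at and the two sent flags
db.collectionname.createIndex(
    { "updated_at": 1, "a_has_sent": 1, "b_has_sent": 1 },
    { "background": true }
)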

The total number of documents with updated_at after 2021-09-24 is over 600,000, and key has only 5 distinct values.

The above warning is caused by the following query:

db.collectionname.find({
    "updated_at": { "$gte": ISODate("2021-09-24") },
    "$or": [ { "a_has_sent": false }, { "b_has_sent": false } ],
    "key": "key_1"
})

Our server sends each document to a and b simultaneously, with a batch size of 2000. After a document is sent to a successfully, a_has_sent is set to true; the same logic applies to b. As the sending process goes on, the number of documents with a_has_sent: false decreases, and the above warning comes up.

Checking the explain result of this query shows that the index named updated_at_1 is used rather than updated_at_1_a_has_sent_1_b_has_sent_1.
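
For reference, the plan can be inspected with explain (a sketch; executionStats also reports how many documents were examined versus returned, which is the ratio the alert is based on):

db.collectionname.find({
    "updated_at": { "$gte": ISODate("2021-09-24") },
    "$or": [ { "a_has_sent": false }, { "b_has_sent": false } ],
    "key": "key_1"
}).explain("executionStats")

// The winning plan names the chosen index, e.g.
// "winningPlan" : { ..., "inputStage" : { "stage" : "IXSCAN", "indexName" : "updated_at_1", ... } }
// and executionStats contains totalDocsExamined and nReturned.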

What we have tried:

  1. We added another index {"updated_at": 1, "key": 1}, expecting the query to use it and reduce the number of scanned documents. Unfortunately, it did not help; the index named updated_at_1 is still used.
  2. We tried replacing find with aggregate: aggregate([{ "$match": { "updated_at": { "$gte": ISODate("2021-09-24") }, "$or": [{ "a_has_sent": false }, { "b_has_sent": false }], "key": "key_1" } }]). Unfortunately, the index named updated_at_1 is still used. (A hint() comparison is sketched below.)
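
For completeness, one way to compare the plans directly is to force the compound index with hint() and look at how many documents each plan examines (a sketch, not something from our original tests):

// Force the compound index and compare totalDocsExamined / nReturned
// against the default plan that picks updated_at_1
db.collectionname.find({
    "updated_at": { "$gte": ISODate("2021-09-24") },
    "$or": [ { "a_has_sent": false }, { "b_has_sent": false } ],
    "key": "key_1"
}).hint("updated_at_1_a_has_sent_1_b_has_sent_1").explain("executionStats")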

We want to know how to eliminate this warning: Scanned Objects / Returned has gone above 1000.

Mongo 4.0 is used in our case.

zangw

2 Answers


Follow the ESR rule

For compound indexes, this rule of thumb is helpful in deciding the order of fields in the index:

  • First, add those fields against which Equality queries are run.
  • The next fields to be indexed should reflect the Sort order of the query.
  • The last fields represent the Range of data to be accessed.

We created the index {"action_key" : 1, "adjust_sent" : 1, "facebook_sent" : 1, "updated_at" : 1}, and this index is now used by the query.
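
Creating that index in the shell looks roughly like this (a sketch; the field names are the ones quoted above and the collection name is illustrative):

// Per the ESR rule: equality-matched fields first, the range field (updated_at) last
db.collectionname.createIndex({
    "action_key": 1,
    "adjust_sent": 1,
    "facebook_sent": 1,
    "updated_at": 1
})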


Update 08/15/2022

Query Targeting alerts indicate inefficient queries.

Query Targeting: Scanned Objects / Returned occurs if the number of documents examined to fulfill a query relative to the actual number of returned documents meets or exceeds a user-defined threshold. The default is 1000, which means that a query must scan more than 1000 documents for each document returned to trigger the alert.

Here are some steps to resolve this issue:

  • First, the Performance Advisor provides the easiest and quickest way to create an index. If there are any Create Indexes suggestions, you can create the recommended index.

  • Then, if there is no recommended index in the Performance Advisor, you can check the Query Profiler. It contains several metrics you can use to pinpoint specific inefficient queries, and it can show the Examined : Returned Ratio (index keys examined to documents returned) of logged queries, which might help you identify the queries that triggered a Query Targeting: Scanned / Returned alert. The chart shows the number of index keys examined to fulfill a query relative to the actual number of returned documents.

  • You can use the following resources to determine which query generated the alert:

    • The Real-Time Performance Panel monitors and displays current network traffic and database operations on machines hosting MongoDB in your Atlas clusters.

    • The MongoDB logs maintain an account of activity, including queries, for each mongod instance in your Atlas clusters.

      • The following mongod log entry shows statistics generated from an inefficient query:
      <Timestamp> COMMAND  <query>
      planSummary: COLLSCAN keysExamined:0
      docsExamined: 10000 cursorExhausted:1 numYields:234
      nreturned:4  protocol:op_query 358ms
      
      • This query scanned 10,000 documents and returned only 4, for a ratio of 2500, which is highly inefficient. No index keys were examined, so MongoDB scanned all documents in the collection, known as a collection scan.
    • The cursor.explain() command for mongosh provides performance details for all queries.

    • The Database Profiler records operations that Atlas considers slow when compared to the average execution time for all operations on your cluster.

      • Note - Enabling the Database Profiler incurs a performance overhead.
zangw

MongoDB cannot use a single index to process an $or whose branches examine different fields.

The index on

{
        "updated_at" : 1,
        "a_has_sent" : 1,
        "b_has_sent" : 1
}

can be used with the $or expression to match either a_has_sent or b_has_sent, but not both branches of the $or at the same time.

To minimize the number of documents examined, create 2 indexes, one for each branch of the $or, combined with the enclosing $and (the find filter implicitly combines the top-level query predicates with $and), such as:

{
        "updated_at" : 1,
        "a_has_sent" : 1
}

and

{
        "updated_at" : 1,
        "b_has_sent" : 1
}
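
A minimal sketch of creating both indexes (collection name as in the question):

// One index per branch of the $or, as suggested above
db.collectionname.createIndex({ "updated_at": 1, "a_has_sent": 1 })
db.collectionname.createIndex({ "updated_at": 1, "b_has_sent": 1 })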

Also note that the alert for Query Targeting: Scanned Objects / Returned has gone above 1000 does not refer to a single query.

The MongoDB server keeps a counter (64-bit?) that tracks the number of documents examined since the server was started, and another counter for the number of documents returned.

The scanned-per-returned ratio is derived by simply dividing the examined counter by the returned counter.

This means that if you have something like a count query that requires examining documents, you may have hundreds or thousands of documents examined but only 1 returned. It won't take many queries like that to push the ratio over the 1000 alert limit.
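
If you want to look at the server-wide counters this ratio is derived from, something like the following works in the shell (a sketch; these are the serverStatus metric paths I'd expect, and they reset when the server restarts):

// Documents examined and returned since the mongod started
var m = db.serverStatus().metrics;
var scanned  = m.queryExecutor.scannedObjects;  // documents examined
var returned = m.document.returned;             // documents returned
print("scanned objects per document returned: " + (scanned / returned));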

Joe
  • Thank you very much for your answer. Per the ESR rule, we created one new index `{"action_key" : 1,"adjust_sent" : 1,"facebook_sent" : 1,"updated_at" : 1}`, and this index is now used by the query – zangw Sep 28 '21 at 08:40