I have a collection of news articles, each of which I tag with entities and topic tags.

My query:

db["fmetadata"].find({
    '$and': [
        {'$text': {'$search': 'apple trump'}},
        {'$or': [
            {'entities': {'$elemMatch': {'$regex': 'apple|trump'}}},
            {'tags': {'$elemMatch': {'$regex': 'apple|trump'}}}
        ]}
    ]
}).explain()

The query plan:

{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "dfabric.fmetadata",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "$and" : [
                                {
                                        "$or" : [
                                                {
                                                        "entities" : {
                                                                "$elemMatch" : {
                                                                        "$regex" : "apple|trump"
                                                                }
                                                        }
                                                },
                                                {
                                                        "tags" : {
                                                                "$elemMatch" : {
                                                                        "$regex" : "apple|trump"
                                                                }
                                                        }
                                                }
                                        ]
                                },
                                {
                                        "$text" : {
                                                "$search" : "apple trump",
                                                "$language" : "english",
                                                "$caseSensitive" : false,
                                                "$diacriticSensitive" : false
                                        }
                                }
                        ]
                },
                "winningPlan" : {
                        "stage" : "FETCH",
                        "filter" : {
                                "$or" : [
                                        {
                                                "entities" : {
                                                        "$elemMatch" : {
                                                                "$regex" : "apple|trump"
                                                        }
                                                }
                                        },
                                        {
                                                "tags" : {
                                                        "$elemMatch" : {
                                                                "$regex" : "apple|trump"
                                                        }
                                                }
                                        }
                                ]
                        },
                        "inputStage" : {
                                "stage" : "TEXT",
                                "indexPrefix" : {

                                },
                                "indexName" : "title_text_tags_text_entities_text",
                                "parsedTextQuery" : {
                                        "terms" : [
                                                "appl",
                                                "trump"
                                        ],
                                        "negatedTerms" : [ ],
                                        "phrases" : [ ],
                                        "negatedPhrases" : [ ]
                                },
                                "textIndexVersion" : 3,
                                "inputStage" : {
                                        "stage" : "TEXT_MATCH",
                                        "inputStage" : {
                                                "stage" : "FETCH",
                                                "inputStage" : {
                                                        "stage" : "OR",
                                                        "inputStages" : [
                                                                {
                                                                        "stage" : "IXSCAN",
                                                                        "keyPattern" : {
                                                                                "_fts" : "text",
                                                                                "_ftsx" : 1
                                                                        },
                                                                        "indexName" : "title_text_tags_text_entities_text",
                                                                        "isMultiKey" : true,
                                                                        "isUnique" : false,
                                                                        "isSparse" : false,
                                                                        "isPartial" : false,
                                                                        "indexVersion" : 2,
                                                                        "direction" : "backward",
                                                                        "indexBounds" : {

                                                                        }
                                                                },
                                                                {
                                                                        "stage" : "IXSCAN",
                                                                        "keyPattern" : {
                                                                                "_fts" : "text",
                                                                                "_ftsx" : 1
                                                                        },
                                                                        "indexName" : "title_text_tags_text_entities_text",
                                                                        "isMultiKey" : true,
                                                                        "isUnique" : false,
                                                                        "isSparse" : false,
                                                                        "isPartial" : false,
                                                                        "indexVersion" : 2,
                                                                        "direction" : "backward",
                                                                        "indexBounds" : {

                                                                        }
                                                                }
                                                        ]
                                                }
                                        }
                                }
                        }
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : {
                "host" : "fabric-dev",
                "port" : 27017,
                "version" : "4.0.2",
                "gitVersion" : "fc1573ba18aee42f97a3bb13b67af7d837826b47"
        },
        "ok" : 1
}

I see that both index scans under

["queryPlanner"]["winningPlan"]["inputStage"]["inputStage"]["inputStage"]["inputStage"]["inputStages"]

report

"stage": "IXSCAN"
"direction": "backward"

Why is the index scanned backward?
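
Rather than indexing into the plan by hand, the IXSCAN stages and their directions can be listed with a small recursive walk. This is only a sketch: it assumes the explain output is available as a Python dict (as pymongo returns it), and find_index_scans is a helper name made up for illustration. The plan below is a trimmed copy of the winning plan above that keeps only the stage nesting.

```python
def find_index_scans(stage, path="winningPlan"):
    """Recursively collect (path, indexName, direction) for every IXSCAN stage."""
    found = []
    if stage.get("stage") == "IXSCAN":
        found.append((path, stage.get("indexName"), stage.get("direction")))
    # A stage has either one child ("inputStage") or several ("inputStages").
    if "inputStage" in stage:
        found.extend(find_index_scans(stage["inputStage"], path + ".inputStage"))
    for i, child in enumerate(stage.get("inputStages", [])):
        found.extend(find_index_scans(child, f"{path}.inputStages[{i}]"))
    return found


# Trimmed copy of the winning plan from the explain output (nesting only).
INDEX = "title_text_tags_text_entities_text"
plan = {
    "stage": "FETCH",
    "inputStage": {
        "stage": "TEXT",
        "inputStage": {
            "stage": "TEXT_MATCH",
            "inputStage": {
                "stage": "FETCH",
                "inputStage": {
                    "stage": "OR",
                    "inputStages": [
                        {"stage": "IXSCAN", "indexName": INDEX, "direction": "backward"},
                        {"stage": "IXSCAN", "indexName": INDEX, "direction": "backward"},
                    ],
                },
            },
        },
    },
}

scans = find_index_scans(plan)
for scan_path, index_name, direction in scans:
    print(scan_path, index_name, direction)
```

With a live connection, the same function could be called on explain_output["queryPlanner"]["winningPlan"].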

I am building a pagination cursor using the > lastId and limit technique. But since the results are being returned backwards, I have to use < lastId instead, which seems counterintuitive.
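
For reference, this is roughly the pagination I have in mind, written so that the order comes from an explicit sort on _id rather than from whatever order the plan happens to produce. It is a minimal sketch: build_page_query is a made-up helper name, the page size of 20 is arbitrary, and the integer 42 stands in for a real _id value such as an ObjectId.

```python
def build_page_query(base_filter, last_id=None, ascending=True):
    """Return (filter, sort) for one page of _id-based (keyset) pagination."""
    direction = 1 if ascending else -1
    if last_id is None:
        # First page: no lower/upper bound on _id yet.
        page_filter = dict(base_filter)
    else:
        # Later pages: continue strictly past the last _id already seen,
        # in the same direction as the sort.
        op = "$gt" if ascending else "$lt"
        page_filter = {"$and": [base_filter, {"_id": {op: last_id}}]}
    return page_filter, [("_id", direction)]


base = {"$text": {"$search": "apple trump"}}

first_filter, sort_spec = build_page_query(base)
next_filter, _ = build_page_query(base, last_id=42)  # 42 is a placeholder _id

# Hypothetical usage with pymongo (needs a running server, so commented out):
# page = list(db["fmetadata"].find(next_filter).sort(sort_spec).limit(20))
# last_id = page[-1]["_id"] if page else None
```

Because the comparison operator and the sort direction are chosen together, > lastId always goes with an ascending sort and < lastId with a descending one, regardless of the index scan direction shown by explain().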

If I don't sort my results explicitly, is it guaranteed that they will always be returned in this backward (reverse) order?

Edit: as mentioned in the comments below, my objective here is to get an intuition for why the index was scanned backwards. Is it the way I formulated my query, or something else entirely? Whether the order is forwards or backwards matters less than whether it stays consistent: always forwards, or always backwards.

sudeepgupta90
  • You need to sort your results if you want deterministic ordering. – JohnnyHK Oct 11 '18 at 13:47
  • I have mentioned that already. My objective here is to get an intuition for why the index was scanned backwards. Is it the way I formulated my query, or something else entirely? Whether the order is forwards or backwards matters less than whether it stays consistent: always forwards, or always backwards. – sudeepgupta90 Oct 11 '18 at 13:56
  • If you don't sort, order cannot be guaranteed. Without a sort it's implementation dependent and may vary over time and across MongoDB versions. Maybe someone could guess why it's ordered the way it is without sorting, but I'm not sure what the value in that would be. – JohnnyHK Oct 11 '18 at 14:47

1 Answer

I came across the following question on Stack Overflow, and I believe its accepted answer, together with the comments below it, gives me the intuition I was looking for:

How does MongoDB sort records when no sort order is specified?

sudeepgupta90