1

I have a collection named test with the below data:

> db.test.find()
{ "_id" : ObjectId("5ae3494a5daab479a87f51fb"), "a" : "a6", "b" : "b6", "c" : "c6", "__key" : "default-domain:admin:vn1;c8" }
{ "_id" : ObjectId("5ae349645daab479a87f51fc"), "a" : "a7", "b" : "b7", "c" : "c7", "__key" : "default-domain:admin:vn2;c9" }
{ "_id" : ObjectId("5ae349af5daab479a87f51fd"), "a" : "a0", "b" : "b0", "c" : "c0", "__key" : "a0;b0;c0" }
{ "_id" : ObjectId("5ae349be5daab479a87f51fe"), "a" : "a1", "b" : "b1", "c" : "c1", "__key" : "a1;b1;c1" }
{ "_id" : ObjectId("5ae349cc5daab479a87f51ff"), "a" : "a2", "b" : "b1", "c" : "c2", "__key" : "a2;b2;c2" }
{ "_id" : ObjectId("5ae349d75daab479a87f5200"), "a" : "a3", "b" : "b2", "c" : "c3", "__key" : "a3;b3;c3" }
{ "_id" : ObjectId("5ae34b6c5daab479a87f5201"), "a" : "a8", "b" : "b8", "c" : "c9", "__key" : "default-domain:vn9;ch9" }
> 

I have set index as below:

db.test.createIndex({__key: "text"})

Now, I want to search string with keys having default-domain:*c8

> db.test.find({$text: {$search: "/default-domain:*c8/"}})
{ "_id" : ObjectId("5ae3494a5daab479a87f51fb"), "a" : "a6", "b" : "b6", "c" : "c6", "__key" : "default-domain:admin:vn1;c8" }
{ "_id" : ObjectId("5ae34b6c5daab479a87f5201"), "a" : "a8", "b" : "b8", "c" : "c9", "__key" : "default-domain:vn9;ch9" }
{ "_id" : ObjectId("5ae349645daab479a87f51fc"), "a" : "a7", "b" : "b7", "c" : "c7", "__key" : "default-domain:admin:vn2;c9" }
> 

So it returns wrong data, I was expecting to return only

{ "_id" : ObjectId("5ae3494a5daab479a87f51fb"), "a" : "a6", "b" : "b6", "c" : "c6", "__key" : "default-domain:admin:vn1;c8" }

I see from explain()

    "winningPlan" : {
        "stage" : "TEXT",
        "indexPrefix" : {

        },
        "indexName" : "__key_text",
        "parsedTextQuery" : {
            "terms" : [
                "c8",
                "default",
                "domain"
            ],
            "negatedTerms" : [ ],
            "phrases" : [ ],
            "negatedPhrases" : [ ]
        },

So here, internally it is converted to 3 words:

            "terms" : [
                "c8",
                "default",
                "domain"
            ],

I think that is why it is returning wrong data.

So, how I can achieve this using text based index: db.test.find({$text: {$search: "??"}}) Is the search expression wrong?

With regards, -M-

u_peerless
  • 644
  • 2
  • 9
  • 23

1 Answers1

0

The text index is behaving as expected as it tokenizes and stems the term within the index. This explains why the search term is split up into three separate words in your explain plan.

See https://docs.mongodb.com/manual/core/index-text/#tokenization-delimiters for tokenization, and https://docs.mongodb.com/manual/core/index-text/#index-entries for stemming and stop words.

If you require "default-domain" to be at the beginning when querying for "c8", then you may wish to consider a case-sensitive prefix expression https://docs.mongodb.com/manual/reference/operator/query/regex/#index-use and the use of a "$" to capture "c8" at the end of your regex http://grainge.org/pages/authoring/regex/regular_expressions.htm.

Alternatively, you might parse out the values from the "_key" field with to store the relevant data and query directly for the necessary values.

  • “While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes.” – Neil Lunn Apr 28 '18 at 01:18