0

My dilemma is that mongodb $text searches must match an exact word: e.g. if trying to match a post with 'testing123' a search for 'test' will not match, but using $regex will match. Now, I want to make use of indexes too, but I also want partial matches.

My thinking is that I could combine them with an $or operator, but it is not working. Is this possible? Each query alone inside the $or work, but when combining them I get no matches.

If this is not possible, I have found a pretty good solution, here , but I would like the combined $or to work if possible, but any other suggestions are welcome.

const posts = await Post.find({
  name: { $regex: 'foo', $options: 'i' },
  $or: [
    { $text: { $search: text, $caseSensitive: false } },
    { text: { $regex: text, $options: 'i' } },
  ],
});
ram
  • 680
  • 5
  • 15

1 Answers1

1

One way of doing this is to downcase the text into another field, then use $regex search on that field.


You have text that you want to search for any substring case insensitively:

MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.insert({foo:'hello world TESTING123'})
WriteResult({ "nInserted" : 1 })

Step 1: add another field which stores the text in lower case.

MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.insert({foo:'hello world TESTING123',foo_lower:'hello world testing123'})

Step 2: add index.

MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.createIndex({foo_lower:1})
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 2,
    "numIndexesAfter" : 3,
    "commitQuorum" : "votingMembers",
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1597711723, 7),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1597711723, 7)
}

Step 3: downcase the query to "testing123"

Step 4: use $regex.

MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.find({foo_lower:{$regex:'testing123'}})
{ "_id" : ObjectId("5f3b2498f885e53d90f30979"), "foo" : "hello world TESTING123", "foo_lower" : "hello world testing123" }
MongoDB Enterprise ruby-driver-rs:PRIMARY> db.foo.find({foo_lower:{$regex:'testing123'}}).explain()
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "test.foo",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "foo_lower" : {
                "$regex" : "testing123"
            }
        },
        "queryHash" : "0D14CC56",
        "planCacheKey" : "1974A2D4",
        "winningPlan" : {
            "stage" : "FETCH",
            "inputStage" : {
                "stage" : "IXSCAN",
                "filter" : {
                    "foo_lower" : {
                        "$regex" : "testing123"
                    }
                },
                "keyPattern" : {
                    "foo_lower" : 1
                },
                "indexName" : "foo_lower_1",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                    "foo_lower" : [ ]
                },
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                    "foo_lower" : [
                        "[\"\", {})",
                        "[/testing123/, /testing123/]"
                    ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
        "host" : "serene",
        "port" : 14420,
        "version" : "4.4.0",
        "gitVersion" : "563487e100c4215e2dce98d0af2a6a5a2d67c5cf"
    },
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1597711761, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1597711761, 1)
}
D. SM
  • 13,584
  • 3
  • 12
  • 21
  • Could you give me a small example of what you mean? – ram Aug 17 '20 at 04:30
  • Case-insensitive regexp matching does not support indexes, so you create another field where the text is downcased that you can search in a case-sensitive fashion. – D. SM Aug 17 '20 at 04:54
  • how would this solve the issue with partial searches? – ram Aug 17 '20 at 05:55
  • What does a partial search for exact word mean? – D. SM Aug 17 '20 at 14:45
  • If I'm trying to match a post that has *testing123* and I only search for *test*, the $text $search will not match, but using a $regex will match. This is why I want to combine the two. – ram Aug 17 '20 at 20:31
  • My answer provides a $regex solution which fulfills that requirement. – D. SM Aug 17 '20 at 22:18
  • Yeah, but how to do I combine that with the $search? If you just add them as regular field queries together, it won't work. – ram Aug 17 '20 at 22:23
  • Can you just give me a code example of what you're trying to say because I'm not getting it? I want both $regex and $search in the same query. Adding them together in an $or statement doesn't work and adding them both as their own field doesn't work. I'm assuming that there is no way to do this, but if you know please let me know. – ram Aug 18 '20 at 00:36
  • You seem to be posing an XY problem. The answer I gave satisfies all of the requirements in your question (case insensitive search, any substring matches). If you have other requirements add them to the question. Using both $regex and $search is not a requirement. – D. SM Aug 18 '20 at 00:40
  • If you would like an answer to include code, start by writing the code in mongo shell that inserts sample data and performs the query, add that to the question. – D. SM Aug 18 '20 at 00:41