16

I'm using MongoDB, and I have a collection of documents with the following structure:

{
    fName:"Foo",
    lName:"Barius",
    email:"fbarius@example.com",
    search:"foo barius"
}

I am building a function that will perform a regular expression search on the search field. To optimize performance, I have indexed this collection on the search field. However, things are still a bit slow. So I ran an explain() on a sample query:

db.Collection.find({search:/bar/}).explain();

Looking under the winning plan, I see the following index bounds used:

"search": [
        "[\"\", {})",
        "[/.*bar.*/, /.*bar.*/]"
]

The second set makes sense - it's looking from anything that contains bar to anything that contains bar. However, the first set baffles me. It appears to be looking in the bounds of "" inclusive to {} exclusive. I'm concerned that this extra set of bounds is slowing down my query. Is it necessary to keep? If it's not, how can I prevent it from being included?

Chumblechops
  • 328
  • 1
  • 10
  • Has the same problem, did you found an explanation? – kirhgoff Sep 08 '17 at 05:04
  • @kirhgoff Which version of mongoDB are you using? – barbakini Sep 08 '17 at 06:51
  • @kirhgoff what are you using `mongoDB native` or `mongoose`. Check this out - http://voidcanvas.com/mongoose-vs-mongodb-native/ – Utkarsh Dubey Sep 08 '17 at 07:16
  • It does not depend on the version of mongoDB as we tested, it depend on ruby gem version we are using - mongoid. If the version is 4.xx (which uses moped behind the scenes) there are no empty brackets, if the version is 5.2 (which uses mongo driver) we observe this weird line in the query plan – kirhgoff Sep 08 '17 at 23:29
  • 3
    Not sure how this question slipped by, but it's the same as [MongoDB - interpret particular explain output](https://stackoverflow.com/q/22643629/2313887) – Neil Lunn Sep 10 '17 at 22:00

3 Answers3

5

I think it's just the way mongodb works with regex (see https://scalegrid.io/blog/mongodb-regular-expressions-indexes-performance/). Just watch out for nscanned/totalKeysExamined value, if it's too large then the index is useless for your query.

See also: MongoDB, performance of query by regular expression on indexed fields

blubear
  • 431
  • 4
  • 12
  • Agree, as stated in the MongoDB documentation and in https://stackoverflow.com/a/33219393/8291949 if your regex is not a "prefix expression" mongo will full scan the keys in the index then will fetch the matched documents (which should be still faster than a full collection scan). – wp78de Sep 12 '17 at 21:57
1

This is the way mongo works with this type of regex and an index. What I mean is that you are searching for /bar/ instead of /^bar/.

When you specify an index on that field, it is indexing from the first character. So "Foo barius" is indexed beginning with F. Since you are searching for "bar" anywhere in the field you have to search the entire index on that field looking *bar*.

The first line in your explain says look at every record in the index.

The second line say, give me only those indices from (1) that have bar in them.

Bottom line: Design your records so they use the index efficiently. In the case of strings, make sure your searches are at the beginning of the string, e.g., /^bar/. If I'm going to search by last name then it needs to occur first in an indexed field.

As an exercise do an explain on /^bar/ instead. You won't get your data, but the first index bounds will be something like /^bar/ to /^bas/.

I hope my stream of consciousness answer is helpful.

UDude

uDude
  • 2,281
  • 2
  • 14
  • 9
0

Thought I'd add my two cents.

The previous two answers are correct. The regex expression can only use an standard index if you start your search from the beginning. Actually, having an index and searching by regex can have a detrimental effect on your search because it attempts to use the index but wont be successful.

There is another type of index that may be useful in your situation. Mongo's text index. It indexes each word based on spaces, so it would be able to do an indexed search on both the words "foo" and "barius", which might be more use

Here's the docs for that: https://docs.mongodb.com/manual/core/index-text/

bgraham
  • 1,939
  • 1
  • 10
  • 17