Related to my question from earlier today. This is also a related post on the topic. We've built a Live Search that fetches the top 20 matches from a collection of 250K names, and we want to make sure we are fetching the data correctly.
Currently, if I use:
db.collection.find({ "drug": { "$regex": "cols", "$options": "i" } })
then I get bombarded with email warnings from MongoDB Atlas saying Scanned Objects / Returned has gone above 1000. This is because I am not using $search, so I am not using the text index. Each query seems to be scanning the whole 250K documents to get the best 20 matches. Unfortunately, if I use this:
db.collection.find({ $text: { $search: "dog cat" } })
then I no longer get bombarded by emails, but the search results are not good because they do not capture partial strings. For example, if I search for the basketball player Zion Williamson, I get no results when the partial string Zion Williams is typed in, whereas with regex it correctly returns Zion Williamson.
Is it problematic to stick with the regex approach and ignore these email warnings? Until MongoDB's $search is better at capturing partial strings, I don't want to use it in my live search. Perhaps it is possible to turn off the email alerts for this particular warning, for this particular collection only?
Thanks in advance for any thoughts on this!
Edit: The collection in question is fairly small (16MB), with ~250K documents and 5 values in each document. Also, the performance of both $regex and $search is sufficient (~0.1 seconds), so the full collection scan with $regex is not hurting the data-fetching performance too much.
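
To make the alert concrete, here is a minimal sketch of the ratio I believe it tracks: documents examined divided by documents returned. The helper name scannedToReturnedRatio is hypothetical, and the numbers are assumptions taken from this post (a full scan of ~250K documents for 20 results) rather than measured values. The field names totalDocsExamined and nReturned are the ones reported under executionStats by explain("executionStats").

```javascript
// Hypothetical helper (not part of MongoDB): computes the
// "scanned objects / returned" ratio from an executionStats-like object.
function scannedToReturnedRatio(executionStats) {
  const { totalDocsExamined, nReturned } = executionStats;
  // Guard against division by zero when a query returns no documents.
  return totalDocsExamined / Math.max(nReturned, 1);
}

// Assumed numbers from this post, not real explain() output:
// every one of ~250K documents is examined to return 20 matches.
const stats = { totalDocsExamined: 250000, nReturned: 20 };
const ratio = scannedToReturnedRatio(stats);

console.log(ratio);        // 12500
console.log(ratio > 1000); // true: above the threshold named in the alert
```

If those assumptions hold, a single regex query would sit at roughly 12,500 documents scanned per document returned, which would explain why the 1000 threshold keeps being crossed even though the query itself is fast.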