152

I am trying to perform a regex query using PyMongo against a MongoDB server. The document structure is as follows

{
  "files": [
    "File 1",
    "File 2",
    "File 3",
    "File 4"
  ],
  "rootFolder": "/Location/Of/Files"
}

I want to get all the files that match the pattern ^File, that is, entries starting with "File". I tried the following:

db.collectionName.find({'files':'/^File/'})

Yet I get nothing back. Am I missing something? According to the MongoDB docs this should be possible, and if I perform the query in the Mongo console it works fine. Does this mean the API doesn't support it, or am I just using it incorrectly?

Null
RC1140

4 Answers

212

If you want to include regular expression options (such as ignore case), try this:

import re
regx = re.compile("^foo", re.IGNORECASE)
db.users.find_one({"files": regx})
Mirzhan Irkegulov
Eric
  • Note also that regexes anchored at the start (i.e. starting with `^`) are able to use indexes in the db, and will run much faster in that case. – drevicko Aug 13 '13 at 23:31
  • Regexes starting with ^ can only use an index in [certain cases](http://docs.mongodb.org/manual/reference/operator/query/regex/). When using re.IGNORECASE, I believe mongo can't use an index to perform the query. – nonagon Apr 08 '15 at 18:08
  • Is this usage documented somewhere? I can't find this in the official pymongo API doc. – Hieu Oct 16 '17 at 22:38
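
The relationship between a compiled Python pattern and the `$regex`/`$options` form discussed in this thread can be sketched as follows. The helper name `to_mongo_filter` is hypothetical and not part of PyMongo; it only builds the filter dictionary, so it runs without a server. It assumes the option letters `i`, `m`, `s` and `x` for MongoDB's `$regex`.

```python
import re

# Map Python re flags to the option letters assumed for MongoDB's $regex.
_FLAG_TO_OPTION = {
    re.IGNORECASE: "i",
    re.MULTILINE: "m",
    re.DOTALL: "s",
    re.VERBOSE: "x",
}


def to_mongo_filter(field, pattern):
    """Hypothetical helper: build a $regex filter from a compiled pattern."""
    options = "".join(
        letter for flag, letter in _FLAG_TO_OPTION.items() if pattern.flags & flag
    )
    condition = {"$regex": pattern.pattern}
    if options:
        condition["$options"] = options
    return {field: condition}


regx = re.compile("^foo", re.IGNORECASE)
query = to_mongo_filter("files", regx)
print(query)  # {'files': {'$regex': '^foo', '$options': 'i'}}
```

The resulting dictionary has the same shape as the `$regex` queries shown elsewhere in this thread, which may make it easier to compare the two styles.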
182

It turns out that regex searches are done a little differently in PyMongo, but they are just as easy.

A regex query is written as follows:

db.collectionname.find({'files':{'$regex':'^File'}})

This will match all documents whose files property contains an item that starts with File.
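
A rough local emulation of this matching rule, using only Python's `re` module and the sample document from the question, might look like the following. The function `matches_files` is hypothetical and merely mimics how an array field is matched; it is not PyMongo code. It also illustrates why the quoted string `'/^File/'` finds nothing: in Python it is a plain string, so it is compared for equality rather than treated as a pattern.

```python
import re

doc = {
    "files": ["File 1", "File 2", "File 3", "File 4"],
    "rootFolder": "/Location/Of/Files",
}


def matches_files(document, condition):
    """Hypothetical emulation of how an array field is matched."""
    values = document["files"]
    if isinstance(condition, dict) and "$regex" in condition:
        # Regex condition: the document matches if any element matches.
        return any(re.search(condition["$regex"], v) for v in values)
    # Plain value: the document matches if any element equals it.
    return condition in values


print(matches_files(doc, {"$regex": "^File"}))  # True
print(matches_files(doc, "/^File/"))            # False

# The query returns whole documents; pick out matching elements client-side.
matching = [f for f in doc["files"] if re.search("^File", f)]
print(matching)  # ['File 1', 'File 2', 'File 3', 'File 4']
```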

RC1140
  • Actually, what you have here is also the [way it's done in javascript](http://docs.mongodb.org/manual/reference/operator/regex/) (and probably other languages too) if you use `$regex`. @Eric's answer is the Python way, which is a little different. – drevicko Aug 13 '13 at 23:33
  • What's the difference? They're both using Python and pymongo, correct? It is part of MongoDB queries, so I don't see the issue really. – Dexter Dec 22 '14 at 18:40
  • Ignore case is also possible with `$regex` in the MongoDB JavaScript shell, viz. `db.collectionname.find({'files':{'$regex':'^File','$options':'i'}})` – Ajay Gupta Apr 25 '15 at 10:37
  • This answer looks better to my eyes. Why bother compiling a Python RE if you're just going to stringify it so that Mongo can compile it again? Mongo's `$regex` operator takes an `$options` argument. – Mark E. Haase May 16 '15 at 15:56
  • Please use `r'^File'` instead of `'^File'` to avoid other problems. – Aminah Nuraini Dec 02 '15 at 13:33
  • Thanks for this answer. I tried to do what I thought would be a simple expansion of this to return all files starting with the letters A-F as follows – Richard B Aug 17 '16 at 14:43
  • It's also worth noting that it's still possible to use variables in the regular expression when compiled this way: `db.collectionname.find({'files':{'$regex':'^{}'.format(myVar)}})` – Andrew Kirk Jun 09 '17 at 20:57
  • someone answer my question https://stackoverflow.com/questions/49843914/how-to-return-documents-which-are-contains-the-specific-keywords-in-the-keys-fro?noredirect=1#comment86702821_49843914 – Pyd Apr 15 '18 at 16:46
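
When the pattern is built from a variable, as in one of the comments above, any regex metacharacters in the value are interpreted as part of the pattern. A minimal sketch of one way to guard against that is to pass the value through `re.escape` first. The helper `prefix_filter` and the variable `my_var` are hypothetical names, and the snippet only builds the filter dictionary rather than contacting a server. It assumes that the backslash escapes produced by Python are also understood by the server's regex engine.

```python
import re


def prefix_filter(field, prefix, ignore_case=False):
    """Hypothetical helper: match values of `field` that start with `prefix`."""
    condition = {"$regex": "^" + re.escape(prefix)}
    if ignore_case:
        condition["$options"] = "i"
    return {field: condition}


my_var = "File.1"  # the dot should be matched literally, not as "any character"
query = prefix_filter("files", my_var)
pattern = query["files"]["$regex"]

# Checked locally with Python's re; the escaped dot no longer matches "X".
print(bool(re.search(pattern, "File.1 backup")))  # True
print(bool(re.search(pattern, "FileX1 backup")))  # False
```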
16

To avoid the double compilation you can use the bson regex wrapper that comes with PyMongo:

>>> from bson.regex import Regex
>>> regx = Regex('^foo')
>>> db.users.find_one({"files": regx})

Regex just stores the string without trying to compile it, so find_one can then detect the argument as a 'Regex' type and form the appropriate Mongo query.

I feel this way is slightly more Pythonic than the other top answer, e.g.:

>>> db.collectionname.find({'files':{'$regex':'^File'}})

It's worth reading up on the bson Regex documentation if you plan to use regex queries because there are some caveats.

Keeely
  • If you need to match against an array using `$in`, then `$regex` will not work for you. `bson.regex.Regex` will do the trick! – odedfos Jul 04 '18 at 13:27
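
The `$in` case mentioned in the comment can be sketched as below. The filter places pattern objects directly inside the `$in` list. Compiled `re` patterns are used here so the snippet runs with the standard library alone, on the assumption that PyMongo encodes them as BSON regular expressions in the same way as `bson.regex.Regex` instances. The function `matches_in` is a hypothetical local emulation, not a PyMongo call.

```python
import re

# Filter with regex objects inside $in; a document matches if any element
# of "files" matches any of the listed patterns.
query = {"files": {"$in": [re.compile("^File"), re.compile("^Report")]}}


def matches_in(document, filter_):
    """Hypothetical local emulation of $in with regex members."""
    for field, condition in filter_.items():
        values = document.get(field, [])
        patterns = condition["$in"]
        if not any(p.search(v) for p in patterns for v in values):
            return False
    return True


doc = {"files": ["File 1", "File 2"], "rootFolder": "/Location/Of/Files"}
other = {"files": ["Notes.txt"], "rootFolder": "/Elsewhere"}

print(matches_in(doc, query))    # True
print(matches_in(other, query))  # False
```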
9

The re-based solution doesn't use the index at all. You should use a query like:

db.collectionname.find({'files':{'$regex':'^File'}})

(I cannot comment on their answers, so I am replying here.)

Jeff