0

We are working with django + mongo(version 3.2) + django-rest-framework-mongoengine for one of our projects. For every string field with unique=True, UniqueValidator(of django-rest-framework-mongoengine) hits a query of {field_name: ^value$} form. This query is taking significantly more time as compared to {field_name: "value"} query. On checking the execution plan(in executionStats mode) there is huge difference between number of keysExamined in both cases. Just want to understand the reason behind it. Any help is appreciated.

Edit: Please consider value as the exact string eg: {field_name: "coffee"} vs {field_name: /^coffee$/}. I'm not able to understand why is there so much difference in performance. Regex search is not needed it's just that mongoengine is making a regex for checking exact match

bazinga012
  • 713
  • 1
  • 5
  • 10
  • Also quoting the manual for [`$regex`](https://webcache.googleusercontent.com/search?q=cache:zURZidCBhTUJ:https://docs.mongodb.com/manual/reference/operator/query/regex+&cd=1&hl=en&ct=clnk&gl=au) ( using cached copy right now for some reason!) *"Additionally, while `/^a/`, `/^a.*/`, and `/^a.*$/` match equivalent strings, they have different performance characteristics. All of these expressions use an index if an appropriate index exists; however, `/^a.*/`, and `/^a.*$/` are slower. `/^a/` can stop scanning after matching the prefix."* – Neil Lunn Apr 15 '19 at 09:25
  • 1
    In short `^value` is fine since it's the **prefix** and `value` is an **equality**. But `^value$` will force a scan, which is **bad**. – Neil Lunn Apr 15 '19 at 09:26
  • I guess `{field_name: ^value$}` is a regex `{field_name: /^value$/}`. If the value is something like "ca.*fe" the first query will return only objects with field_name equal to "ca.*fe", and the later will also include documents with field_name equal to "caffe", "catastrofe", and everything in between. Also bear in mind that validating uniqueness with find query is prone to race conditions. It is recommended to rely on unique indexes instead. – Alex Blex Apr 15 '19 at 09:26
  • Please consider value as the exact string eg: {field_name: "coffee"} vs {field_name: /^coffee$/}. I'm not able to understand why is there so much difference in performance. Regex search is not needed it's just that mongoengine is making a regex for checking exact match. – bazinga012 Apr 15 '19 at 09:34
  • Nonetheless, it's actually doing a `$regex` search. If you want **exact** searches then you actually can send raw queries. But the point is that with such a regular expression with a `$` at the end, then query performance will not be great, just as described in the manual section quoted. – Neil Lunn Apr 15 '19 at 09:39

0 Answers0