Let's say I have a class User which inherits from the Document class (I am using Mongoengine). I want to retrieve all users who signed up after some timestamp. Here is the method I am using:

@classmethod
def get_users(cls, start_timestamp):
    # Return all users whose ts field is at or after start_timestamp.
    return cls.objects(ts__gte=start_timestamp)

1000 documents are returned in 3 seconds. This is extremely slow. I have run similar queries in SQL in a couple of milliseconds. I am new to MongoDB and NoSQL in general, so I guess I am doing something terribly wrong.

I suspect the retrieval is slow because it is done in several batches. I read somewhere that for PyMongo the batch size is 101, but I do not know if that is the same for Mongoengine.

Can I change the batch size so that I get all documents at once? I will know approximately how much data will be retrieved in total.

Any other suggestions are very welcome.

Thank you!

giliev

1 Answer

As you suggest, there is no way that this query should take 3 seconds to run. However, the performance of the pymongo driver is not going to be the issue. Some things to consider:

  • Make sure that the ts field is included in the indexes for the user collection.
  • Mongoengine does some aggressive dereferencing, so if the 1000 returned user documents have one or more ReferenceField fields, each of those results in additional queries. There are ways to avoid this.
  • Mongoengine provides a direct interface to the pymongo method for the MongoDB aggregation framework. This is by far the most efficient way to query MongoDB.
  • MongoDB recently released an official Python ODM, pymodm, in part to provide better default performance than Mongoengine.
Steve Rossiter
  • I have already improved the performance by using the direct interface to pymongo. It was really surprising how much the speed improved, going from 3 seconds to 300 milliseconds. Is there something wrong with the way I am using Mongoengine in the code snippet in the question? I do not see any value in using Mongoengine if I need to write pymongo instead. – giliev Sep 05 '16 at 09:24
  • [This answer](http://stackoverflow.com/questions/35257305/mongoengine-is-very-slow-on-large-documents-comapred-to-native-pymongo-usage/35274930#35274930) may describe part of the problem. – Steve Rossiter Sep 05 '16 at 09:29
  • Thank you for the great answers! – giliev Sep 05 '16 at 10:31