
I am running a simple Django app on top of MongoDB and recently upgraded to PyMongo 3.0.2, but it now runs incredibly slowly. If I downgrade to PyMongo 2.8.1 or 2.7.2, it speeds back up again. This happens with both MongoDB 3 and 2.6, so I suspect something fundamental has changed in the driver. Per the changelog, PyMongo 3 is supposed to be considerably faster, and I can't find any obvious change that would cause a slowdown. I have found no related issues on SO or Google. This is on Django 1.6.4 and Python 2.7.5.

It's hard to put up a single code example of this, but we are using a single MongoDB instance (no sharding, no remote hosts), and in each of our methods that uses the mongo_client, we call close() at the end of the method. Is there some new connection re-opening behavior that might slow down the client, if we continuously close the connection? Example method below:

    from bson.objectid import ObjectId
    from pymongo import MongoClient
    mongo_client = MongoClient()
    collection = mongo_client[self._db_prefix + 'assessment']['Assessment']
    if collection.find({'itemIds': str(item_id)}).count() != 0:
        raise errors.IllegalState('this Item is being used in one or more Assessments')
    collection = mongo_client[self._db_prefix + 'assessment']['Item']
    item_map = collection.find_one({'_id': ObjectId(item_id.get_identifier())})
    if item_map is None:
        raise errors.NotFound()
    objects.Item(item_map, db_prefix=self._db_prefix, runtime=self._runtime)._delete()
    delete_result = collection.delete_one({'_id': ObjectId(item_id.get_identifier())})
    if delete_result.deleted_count == 0:
        raise errors.NotFound()
    mongo_client.close()

Update 1:

As suggested, I created a dedicated load test with the timeit library. Using PyMongo 3.0.2:

    timeit.timeit('MongoClient()["test_blah"]["blah"].insert_one({"foo":"bar"})', number=10000, setup="from pymongo import MongoClient")

Actually throws an error:

      File "~/Documents/virtual_environments/assessments/lib/python2.7/site-packages/pymongo/pool.py", line 58, in _raise_connection_failure
        raise AutoReconnect(msg)
    AutoReconnect: localhost:27017: [Errno 49] Can't assign requested address

I then downgrade to PyMongo 2.8.1:

    pip install pymongo==2.8.1

And run the same command in a python shell:

    timeit.timeit('MongoClient()["test_blah"]["blah"].insert({"foo":"bar"})', number=10000, setup="from pymongo import MongoClient")
    8.372910976409912

This time it actually finishes... So it seems like the new insert_one method does something different, where it isn't closing connections?

Update 2 (with solution):

Bernie's answer helped point us in the right direction, as well as this SO question. Besides needing to use a single MongoClient(), our problem was that we were closing the connection at the end of each method. Example timeits below (both PyMongo 3.0.2):

    >>> timeit.timeit('client["test_blah"]["blah"].insert_one({"foo":"bar"}); client.close()', number=10, setup="from pymongo import MongoClient; client=MongoClient()")
    4.520946025848389
    >>> timeit.timeit('client["test_blah"]["blah"].insert_one({"foo":"bar"})', number=10, setup="from pymongo import MongoClient; client=MongoClient()")
    0.004940986633300781

Manually closing the client is a performance killer: roughly 900x slower in the timings above (4.52 s vs. 0.0049 s for 10 inserts). Perhaps this is caused by the slow shutdown of the monitor thread that Bernie mentioned?
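For anyone wanting to apply the same fix, here is a minimal sketch of one way to share a single client across methods instead of opening and closing one per call. The names `get_client`, `factory`, `_client`, and `_client_lock` are hypothetical and not from our code base; the helper takes the client class as an argument so it can be exercised without a running mongod.

```python
import threading

# Process-wide client, created lazily on first use (hypothetical helper).
_client = None
_client_lock = threading.Lock()


def get_client(factory):
    """Return the shared client, calling factory() only the first time."""
    global _client
    with _client_lock:
        if _client is None:
            _client = factory()
        return _client


# Assumed usage inside a method (requires pymongo and a running mongod):
#   from pymongo import MongoClient
#   collection = get_client(MongoClient)["test_blah"]["blah"]
#   collection.insert_one({"foo": "bar"})
# Note that there is no client.close() at the end of the method.
```

The lock only guards creation, so that two threads hitting the helper at the same time do not each build their own client.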

user
  • Well, what version does your `mongod` have? Do you test matching drivers and version? And did you check the [driver compatibility matrix](http://docs.mongodb.org/ecosystem/drivers/python/)? – Markus W Mahlberg May 19 '15 at 19:33
  • Also, how do you measure the speed? – Markus W Mahlberg May 19 '15 at 19:39
  • mongod now shows v3.0.3. Yes, we checked the driver compatibility matrix, pymongo 2.8 and 3.0.2 should work with Mongo 3.0.3... – user May 19 '15 at 19:48
  • Running Django unittests, it's fairly obvious that the tests take longer to run (or even basic Python unittests). An order of magnitude longer. – user May 19 '15 at 19:49
  • Unit Tests are a very bad indicator. It might well be that load up times of the driver increased. You should run dedicated load tests to determine performance. – Markus W Mahlberg May 19 '15 at 20:32
  • Okay, have run some with timeit and will update in the original question – user May 19 '15 at 20:51

1 Answer


I think the problem you are seeing is due to MongoClient spawning a background monitoring thread. This is new in PyMongo 3.0 and matches the behavior of MongoReplicaSetClient in PyMongo 2.x. You should be able to speed things up a lot by only spawning one instance of MongoClient (this is the preferred way to use MongoClient).

    >>> import timeit
    >>> timeit.timeit('client["test_blah"]["blah"].insert_one({"foo":"bar"})', number=10000, setup="from pymongo import MongoClient; client = MongoClient()")
    2.2610740661621094
    >>> import pymongo
    >>> pymongo.version
    '3.0.2'

    >>> timeit.timeit('client["test_blah"]["blah"].insert({"foo":"bar"})', number=10000, setup="from pymongo import MongoClient; client = MongoClient()")
    2.3010458946228027
    >>> import pymongo
    >>> pymongo.version
    '2.8.1'

I also think that it's taking too long for the monitor thread to shut down and will be looking into a fix for that.
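If you want to measure the difference in your own environment, a rough harness along these lines may help. It is only a sketch: `compare`, `factory`, and `operation` are names made up for illustration, and the function assumes nothing about the client beyond a `close()` method.

```python
import timeit


def compare(factory, operation, number=100):
    """Time operation(client) with a fresh client per call vs. one shared client.

    Returns a (per_call_seconds, shared_seconds) tuple.
    """
    def fresh():
        # Anti-pattern being measured: create, use, and close on every call.
        client = factory()
        try:
            operation(client)
        finally:
            client.close()

    shared_client = factory()
    try:
        per_call = timeit.timeit(fresh, number=number)
        shared = timeit.timeit(lambda: operation(shared_client), number=number)
    finally:
        shared_client.close()
    return per_call, shared


# Assumed usage (requires pymongo and a running mongod):
#   from pymongo import MongoClient
#   insert = lambda c: c["test_blah"]["blah"].insert_one({"foo": "bar"})
#   print(compare(MongoClient, insert, number=10))
```

Keep `number` small for the per-call case, since creating that many clients in a tight loop can itself run into connection errors.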

Bernie Hackett
  • Thanks Bernie! This helped point us to the right direction -- partially what you said, but also we were calling client.close() at the end of each method...removing that sped everything up a lot! I will give you the answer and put in a more detailed explanation in the original question... – user May 20 '15 at 13:16
  • Calling client.close() is definitely a perf killer. It destroys the connection pool and shuts down (seemingly slowly currently) the topology monitor. Your application should create one instance of MongoClient and use it throughout to take advantage of the built in connection pooling. Creating and destroying a MongoClient for every request is very expensive and becomes even more expensive if you add in authentication and TLS. See also - http://api.mongodb.org/python/current/faq.html#how-does-connection-pooling-work-in-pymongo – Bernie Hackett May 20 '15 at 22:39