
I have a list of collections, and I would like to export all the data from MongoDB (essentially a database dump, but only for certain collections).

I'm using pymongo; is there a fast dump function, say a bulk_find()?

Alternatively, I could use this solution, which is sub-optimal:

[d for d in db.collection.find()]

for each collection

vgoklani
  • Not pymongo specific, but have you looked at [`mongoexport`](https://docs.mongodb.com/manual/reference/program/mongoexport/index.html)? – JohnnyHK Apr 23 '18 at 20:43
  • Or [`mongodump`](https://docs.mongodb.com/manual/reference/program/mongodump/index.html) if you don't need a human readable format. – clcto Apr 23 '18 at 20:46
  • thanks for the suggestions. I was looking for the analogue to insert_many, which is based off a bulk find. – vgoklani Apr 23 '18 at 20:51
  • 1
    You can call [`batch_size`](http://api.mongodb.com/python/current/api/pymongo/cursor.html#pymongo.cursor.Cursor.batch_size) on the cursor returned from `find` to effectively perform bulk reads of whatever size you want. – JohnnyHK Apr 23 '18 at 23:00
  • 1
    You probably should really read that `batch_size` option as the most MongoDB can include in a single "batch" is 16MB. This is because ANY response is still effectively a BSON document, and subject to the same restrictions. Various drivers have a `.toArray()` method which simply implements an iterator much like you are using to extract as a list. But pymongo has a much more simple way to extract as a list. Anything else is a "stream", and that's still going to need an iterator to feed it. – Neil Lunn Apr 24 '18 at 05:10
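The `batch_size` suggestion in these comments can be sketched as below. The network batching happens inside the driver (the server returns up to `batch_size` documents per round trip, capped at 16MB per batch), so the client side is plain iteration; the `chunked` helper only groups that iteration into lists. Connection details and names are hypothetical:

```python
from itertools import islice


def chunked(cursor, size):
    """Yield lists of up to `size` items from any iterable/cursor.

    This is client-side grouping only; pymongo's cursor.batch_size()
    is what controls how many documents arrive per server round trip.
    """
    it = iter(cursor)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # Hypothetical deployment details -- adjust to your setup.
    from pymongo import MongoClient

    coll = MongoClient("mongodb://localhost:27017")["mydb"]["events"]
    # Ask the server for 1000 documents per round trip.
    cursor = coll.find().batch_size(1000)
    for batch in chunked(cursor, 1000):
        process = len(batch)  # handle each bulk read here
```

And, per the last comment, the simple pymongo way to extract a whole cursor as a list is just `list(coll.find())`, equivalent to the list comprehension in the question.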

0 Answers