
I am updating ~600,000 documents in a MongoDB collection using some PyMongo code that looks like this:

from bson import ObjectId

# Legacy PyMongo bulk API: queue one $set update per DataFrame row,
# then send everything with a single execute() call.
bulk = coll.initialize_ordered_bulk_op()

for index, row in df.iterrows():
    bulk.find({'_id': ObjectId(row['id'])}).update(
        {'$set': {'X': row['X'].split(',')}})

bulk.execute()

After some further investigation I thought this might fail for >100,000 documents and that I would have to do something like what is suggested here.

However, it works fine on all documents. I am just curious to know what I have misunderstood.

Thanks in advance.

1 Answer


As mentioned in the documentation here:

Each group of operations can have at most 1000 operations. If a group exceeds this limit, MongoDB will divide the group into smaller groups of 1000 or less. For example, if the bulk operations list consists of 2000 insert operations, MongoDB creates 2 groups, each with 1000 operations.

So you are not actually executing 600,000 operations at the same time; MongoDB takes care of splitting them into smaller groups for you.
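If it helps to see that behaviour from the driver side, here is a minimal sketch of the same update written against PyMongo's newer bulk_write API (assuming the same coll, df, and id/X columns as in the question). The driver and server split the large request into smaller batches automatically, so no manual chunking of the 600,000 operations is needed:

from bson import ObjectId
from pymongo import UpdateOne

# Build one UpdateOne request per DataFrame row (same update as above).
requests = [
    UpdateOne({'_id': ObjectId(row['id'])},
              {'$set': {'X': row['X'].split(',')}})
    for _, row in df.iterrows()
]

# ordered=True preserves the per-row order; the driver batches the
# requests internally before sending them to the server.
result = coll.bulk_write(requests, ordered=True)
print(result.modified_count)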
