I have a MongoDB database with 10-15 million entries. For each of them, I have to populate a field that initially does not exist. The application crashed due to an unexpected server shutdown, so what is the best way to update the remaining entries?
Should I query with `field: {$exists: false}`
and update only those documents, or should I walk through the entire collection, check each document for the field, and update it if the field is missing? My take on this is that since you can't associate an index with the existence of a field, `$exists` does basically the same amount of work. Which one would be faster, and why?
Note that the value this field is going to have depends on the other fields of the document, so I can't do a single `multi: true` update.
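To make the two options concrete, here is a minimal sketch of the `$exists: false` approach, assuming PyMongo-style collection methods and entirely hypothetical names (`score` for the new field, `a`/`b` for the fields it is derived from):

```python
def compute_score(doc):
    # Hypothetical derivation rule; the real one depends on your schema.
    return doc.get("a", 0) + doc.get("b", 0)

def resume_backfill(coll):
    # Only fetch documents still missing the field, so a restart after
    # a crash picks up exactly where the previous run stopped. Because
    # the new value depends on other fields, each document is updated
    # individually rather than with a single multi-document update.
    for doc in coll.find({"score": {"$exists": False}}):
        coll.update_one({"_id": doc["_id"]},
                        {"$set": {"score": compute_score(doc)}})
```

With a real deployment this would be called as something like `resume_backfill(MongoClient().mydb.items)`; the function itself only assumes `find` and `update_one`.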
Solution: As @DhruvPathak and @Sammaye suggest, whilst indexes are associated with the data, not with the fields themselves (so you can't have an index linked purely to the existence of a field), `$exists` can still take advantage of an index on that field for the documents where the field exists, and this greatly increases speed.
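As a fragment illustrating the index side (not runnable on its own: `coll` is assumed to be a PyMongo collection and `score` a hypothetical field name) — note that a regular, non-sparse index also indexes documents that lack the field, whereas a sparse index only helps `$exists: true` queries:

```python
coll.create_index("score")  # regular (non-sparse) index on the new field
coll.find({"score": {"$exists": False}})  # filter the planner can serve with it
```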
Additional: Although it's a bit of a side quest, I now know why the application crashed: the server timed out the cursor because it stayed open for too long (given the size of the collection). This can be avoided by setting `batch_size`,
as explained here.
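A sketch of the batch-size fix, again assuming PyMongo and hypothetical names (`score` for the new field): `batch_size` caps how many documents each getMore returns, so the driver goes back to the server frequently enough that the cursor is not reaped as idle while the client is busy updating:

```python
def backfill_in_batches(coll, derive, batch_size=500):
    # Smaller batches mean more frequent getMore round trips, which keeps
    # the server-side cursor alive while each batch is processed. `derive`
    # computes the new field's value from the document's other fields.
    updated = 0
    cursor = coll.find({"score": {"$exists": False}}, batch_size=batch_size)
    for doc in cursor:
        coll.update_one({"_id": doc["_id"]},
                        {"$set": {"score": derive(doc)}})
        updated += 1
    return updated
```

An alternative (or complementary) knob in PyMongo is `no_cursor_timeout=True` on `find`, which disables the idle timeout entirely, but then you are responsible for closing the cursor yourself.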