I've got some code that examines every object in a Mongo collection (iterating over the result of a find() with no parameters), and makes changes to some of them. It seems that this isn't a safe thing to do: my changes are saved, but then when I continue iterating through the cursor, a subset of the changed objects (10-15%) show up a second time. I wasn't changing the document ID or anything that there's an index on.
I figure I could avoid this problem by grabbing all the document IDs ahead of time (converting the cursor to an array), but these are large collections, so I'd really like to avoid that.
I noticed that the result of find() doesn't seem to have any defined order by default, so I tried putting an explicit sort on the cursor, {"_id":1}. This seems to have fixed the problem: now nothing shows up twice no matter what I modify. But I don't know whether that's a good or reliable approach. As far as I can tell from the documentation, adding a sort does not make it pre-query all the IDs. If that's true, that's nice, but then I don't know why it would fix the problem.
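For reference, here is a minimal sketch of the sorted version of the loop. The database name, collection name, and the "legacy" and "migrated" fields are made up for illustration (my real code is more involved), and it assumes a MongoDB instance on the default local port. It only shows the pattern I'm asking about; it isn't meant as evidence that the pattern is safe.

```scala
import com.mongodb.casbah.Imports._

object SortedScan extends App {
  // Hypothetical database and collection names.
  val coll = MongoClient()("mydb")("things")

  // Explicit sort on _id instead of the default (undefined) order.
  val cursor = coll.find().sort(MongoDBObject("_id" -> 1))

  for (doc <- cursor) {
    // Hypothetical condition: only some documents get modified.
    if (doc.containsField("legacy")) {
      // Update by _id; neither _id nor any indexed field is changed.
      coll.update(
        MongoDBObject("_id" -> doc.get("_id")),
        $set("migrated" -> true)
      )
    }
  }
}
```

Without the `.sort(...)` call, this is the loop where some of the updated documents come back a second time.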
Is it just a bad idea to use cursors while changing stuff?
I'm using Scala/Casbah, if that matters.