My scenario is that I have a collection of many documents to be processed, one document at a time. Processing a single document takes a relatively long time, and processing the whole collection will take many hours. Therefore, I will have multiple simultaneous 'workers' processing the same collection. Each needs to do something like:
(A) get the next unprocessed document,
(B) process it,
(C) mark the document as processed, and continue.
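To make the requirement concrete, here is a minimal sketch of the (A)-(B)-(C) loop. It uses an in-memory list as a stand-in for the collection and a lock to make the "claim" step atomic; all names here are illustrative, and the lock is only a placeholder for whatever atomic claim mechanism the database would need to provide (which is exactly what I am asking about):

```python
import threading

# In-memory stand-in for the collection; each "document" has status flags.
documents = [{"_id": i, "claimed": False, "processed": False} for i in range(20)]
claim_lock = threading.Lock()
processed_by = {}  # _id -> worker name, to check no document is handled twice

def claim_next():
    """(A) Atomically find and claim the next unprocessed document.
    The lock stands in for whatever atomicity the database must supply."""
    with claim_lock:
        for doc in documents:
            if not doc["claimed"]:
                doc["claimed"] = True
                return doc
    return None  # nothing left to process

def worker(name):
    while True:
        doc = claim_next()
        if doc is None:
            break
        # (B) process the document (placeholder for the real long-running work)
        processed_by[doc["_id"]] = name
        # (C) mark it processed and continue with the next one
        doc["processed"] = True

threads = [threading.Thread(target=worker, args=(f"w{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place, every document is claimed by exactly one worker; without it, two workers could read the same unclaimed document before either marks it. The question is how to get that same exactly-once claim against a real collection shared by separate processes.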
How do I ensure that the simultaneous processes do not pick up the same documents? I do not know in advance what the key values will be, so I can't say something like process_A should start at 1 and process_B at a million. I would also like to add as many processes as manageable, so it is not practical to have one go forwards and another go backwards.
I ask about MongoDB because that is what I am using. I imagine the same question could be asked about a SQL database.
I implore anyone who wants to help not to focus on changing the scenario, which, for external reasons, is a given.
Thank you.