I have a document with many objects, and several aggregate processors that run on them. Let's say the objects document is named Objects. For one processor, I created another document called ProcessedObjects, in which each instance contains a single field, "processedObjectPtr", which is a reference to an object in Objects.

I would like to run the following basic loop:

for all objects that haven't been processed yet: 
    1 - process object
    2 - add object to processed object list

The part I don't know how to do in MongoDB is getting the list of objects that haven't been processed yet. In theory I could mark the object itself as processed by adding another field to it, but once I have many processors that will get ugly, which is why I would prefer to keep the 'processed object list' in a separate document.

Is there an elegant way to do this, or will I have to add the processed metadata to the actual objects? I am using mongoengine, but any answer will do.

Thanks!

Noam
  • You should try the Maps to achieve this. See the example here: http://stackoverflow.com/questions/8772936/get-data-from-collection-b-not-in-collection-a-in-a-mongodb-shell-query – Binish Mookken Jan 15 '14 at 13:55
  • I ended up adding a 'processedFlags' list to my objects, so each processor could mark (via a flag) that it already processed this object, and also query which objects do not have the flag. – Noam Jan 20 '14 at 10:22
  • @Noam: Since you've solved your own question, I think you can also post this as an answer ;-) – Stennie Jan 22 '14 at 07:40

1 Answer

I ended up adding a 'processedFlags' list to my objects, so each processor could mark (via a flag) that it already processed this object, and also query which objects do not have the flag.
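For anyone landing here, this is roughly what that looks like. It is a minimal sketch, not my exact code: the helper names (`unprocessed_filter`, `mark_processed_update`, `process_pending`) are made up for illustration, and `collection` is assumed to be a pymongo-style collection exposing `find` and `update_one`. The filter relies on `$ne` against an array field matching objects whose list does not contain the value, including objects that have no 'processedFlags' field yet.

```python
from typing import Any, Callable, Dict

# Field name from the answer above; everything else here is hypothetical.
FLAGS_FIELD = "processedFlags"


def unprocessed_filter(processor_name: str) -> Dict[str, Any]:
    """Query filter: objects whose flag list does not contain processor_name.

    On an array field, $ne also matches documents where the field is missing.
    """
    return {FLAGS_FIELD: {"$ne": processor_name}}


def mark_processed_update(processor_name: str) -> Dict[str, Any]:
    """Update document: add the flag once ($addToSet avoids duplicates)."""
    return {"$addToSet": {FLAGS_FIELD: processor_name}}


def process_pending(
    collection: Any,
    processor_name: str,
    process: Callable[[Dict[str, Any]], None],
) -> int:
    """Run `process` on every unflagged object, then flag it.

    `collection` is assumed to behave like a pymongo Collection.
    Returns the number of objects handled in this pass.
    """
    handled = 0
    for doc in collection.find(unprocessed_filter(processor_name)):
        process(doc)
        collection.update_one(
            {"_id": doc["_id"]}, mark_processed_update(processor_name)
        )
        handled += 1
    return handled
```

With mongoengine, and assuming 'processedFlags' is declared as a `ListField(StringField())` on Objects, the equivalent calls should be along the lines of `Objects.objects(processedFlags__ne="my_processor")` for the query and `obj.update(add_to_set__processedFlags="my_processor")` for the marking step. Each processor only needs its own flag name, so adding a processor does not change the schema.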

Noam