I have a Python script that collects data every day and inserts it into a MongoDB collection (~10M documents). Sometimes the job fails and I am left with partial data, which is not useful to me. I would like to insert the data into a staging collection first, and then copy or move all documents from the staging collection into the final collection only when the job finishes and the data is complete. I cannot seem to find a straightforward way to do this as a "bulk" operation, but it seems there should be one.
In SQL it would be something like this:
INSERT INTO final_table
SELECT *
FROM staging_table
I thought that db.collection.copyTo() would work for this but it seems it makes the destination collection a clone of the source collection.
Additionally, I know from this question ("mongodb move documents from one collection to another collection") that I can do something like the following:
var documentsToMove = db.collectionA.find({});
documentsToMove.forEach(function(doc) {
    db.collectionB.insert(doc);
});
But it seems like there should be a more efficient way.
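For comparison, the per-document loop can at least be batched so that each round trip carries many documents. This is only a minimal sketch in Python, assuming the collection objects behave like pymongo collections (`find()` yields documents, `insert_many()` accepts a list). The function name `copy_in_batches` and the default batch size are hypothetical, chosen for illustration.

```python
from itertools import islice


def copy_in_batches(source, target, batch_size=1000):
    """Append every document from `source` to `target` in batches.

    Both arguments are assumed to behave like pymongo collections.
    Returns the number of documents copied. `source` is left untouched.
    """
    cursor = iter(source.find())
    copied = 0
    while True:
        # Pull at most `batch_size` documents off the cursor.
        batch = list(islice(cursor, batch_size))
        if not batch:
            break
        # ordered=False lets the server continue past individual failures
        # within a batch instead of stopping at the first one.
        target.insert_many(batch, ordered=False)
        copied += len(batch)
    return copied
```

With pymongo this might be called as `copy_in_batches(db["staging"], db["final"])`, where `db` is a database handle and the collection names are placeholders. The documents still travel through the client, so this reduces round trips but is not a server-side copy.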
So, how can I take all documents from one collection and insert them into another collection in the most efficient manner?
NOTE: The final collection already has data in it. The new documents should be added to this data. For example, if my staging collection has 2 documents and my final collection has 10 documents, I would have 12 documents in my final collection after I move the staging data over.
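To make the append semantics above concrete, here is a rough sketch of what a server-side equivalent of the SQL `INSERT INTO ... SELECT` might look like using the `$merge` aggregation stage. It assumes MongoDB 4.2 or later, that both collections live in the same database, and that `_id` values do not collide between them. The function name `append_staging_to_final` is hypothetical, and `staging` is assumed to behave like a pymongo collection.

```python
def append_staging_to_final(staging, final_name):
    """Run a $merge pipeline on `staging` that adds its documents to the
    collection named `final_name`, leaving existing documents in place."""
    pipeline = [
        {
            "$merge": {
                "into": final_name,
                "on": "_id",
                # If an _id already exists in the target, keep the old document.
                "whenMatched": "keepExisting",
                # Otherwise insert the staging document as a new one.
                "whenNotMatched": "insert",
            }
        }
    ]
    # $merge writes on the server and returns no documents;
    # consuming the cursor just makes sure the pipeline has run.
    list(staging.aggregate(pipeline))
    return pipeline
```

Under these assumptions, a staging collection with 2 documents merged into a final collection with 10 documents would leave 12 documents in the final collection. The staging collection is not emptied by the pipeline and would have to be cleared separately once the merge has succeeded.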