5

I have a Python script that collects data every day and inserts it into a MongoDB collection (~10M documents). Sometimes the job fails and I am left with partial data, which is not useful to me. I would like to insert the data into a staging collection first, and then copy or move all documents from the staging collection into the final collection only when the job finishes and the data is complete. I cannot find a straightforward way to do this as a "bulk" operation, but it seems there should be one.

In SQL it would be something like this:

INSERT INTO final_table
SELECT *
FROM staging_table

I thought that db.collection.copyTo() would work for this, but it seems to make the destination collection a clone of the source collection.

Additionally, I know from the question "mongodb move documents from one collection to another collection" that I can do something like the following:

// Copy every document from collectionA into collectionB, one insert per document
var documentsToMove = db.collectionA.find({});
documentsToMove.forEach(function(doc) {
    db.collectionB.insert(doc);
});

But it seems like there should be a more efficient way.

So, how can I take all documents from one collection and insert them into another collection in the most efficient manner?

NOTE: the final collection already has data in it. The new documents that I want to move over would be added to this data, e.g., if my staging collection has 2 documents and my final collection has 10 documents, I would have 12 documents in my final collection after I move the staging data over.

Andrew Marshall

2 Answers

0

You can use db.cloneCollection(); see the MongoDB cloneCollection documentation.

Eric
0

If you no longer need the staging collection, you can simply rename it.

Switch to the admin db, then run:
db.runCommand({renameCollection: "staging.CollectionA", to: "targetdb.CollectionB"})
KDP
  • I want to move the documents from one collection into an existing collection that already has data in it. I.e. I want to append one collection to another. I have updated the question to clarify this. – Andrew Marshall Jul 08 '15 at 17:18
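
For the append case described in this comment, one possible approach is to keep the cursor loop from the question but send the documents in batches rather than one insert per document. The following is only a minimal sketch, assuming a mongo shell whose collections support insertMany (3.2 or later); the function name appendCollection and the batchSize parameter are hypothetical, chosen for illustration, and are not part of MongoDB.

```javascript
// Sketch: append every document from `source` to `target` in batches.
// `appendCollection` and `batchSize` are illustrative names, not MongoDB APIs.
function appendCollection(source, target, batchSize) {
    var batch = [];
    var moved = 0;

    // Send the buffered documents in a single insertMany call.
    function flush() {
        if (batch.length === 0) {
            return;
        }
        // ordered: false asks the server to attempt the remaining documents
        // even if one of them fails (for example, on a duplicate _id).
        target.insertMany(batch, { ordered: false });
        moved += batch.length;
        batch = [];
    }

    source.find({}).forEach(function (doc) {
        batch.push(doc);
        if (batch.length >= batchSize) {
            flush();
        }
    });
    flush(); // send the final, possibly partial, batch

    return moved;
}
```

In the shell this might be called as appendCollection(db.staging, db.final, 1000), with the staging collection dropped only after the call returns without error. Documents keep their _id values, so a staging _id that already exists in the final collection would raise a duplicate key error. On MongoDB 4.2 or later, an aggregation on the staging collection ending in a $merge stage may achieve the same append entirely on the server.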