3

What is the proper way to duplicate a collection in MongoDB on the same server using C#? MongoVUE has a 'Duplicate collection' option; is there something similar for C#?

nvoigt
wuyts
  • possible duplicate of [How can I copy collection to another database in MongoDB?](http://stackoverflow.com/questions/11554762/how-can-i-copy-collection-to-another-database-in-mongodb) – Philipp Oct 21 '13 at 12:16
  • That post is about copying to another database. I want to copy my collection within the same database (just a duplicate of the collection) – wuyts Oct 21 '13 at 12:20
  • check the accepted answer of this one http://stackoverflow.com/questions/8933307/clone-a-collection-in-mongodb – joao Oct 21 '13 at 12:24
  • @wuyts the same procedure applies when you copy a collection within the same database. – Philipp Oct 21 '13 at 12:26
  • @joao that is for actions in shell, not in c# – wuyts Oct 21 '13 at 12:27

2 Answers

6

There isn't a built-in way to copy collections with the C# driver, but you can still do it quite simply:

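// Read every document from the source and insert it into the destination.
// Note that all documents pass through the client application.
var source = db.GetCollection("test");
var dest = db.GetCollection("testcopy");
dest.InsertBatch(source.FindAll());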

Note, however, that this won't copy any indexes from the source collection. The shell's copyTo method has the same limitation so it's likely implemented similarly.
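
Recreating the indexes on the copy could look roughly like the sketch below. It assumes the 2.x driver API (IMongoDatabase and IMongoCollection<BsonDocument>) rather than the legacy API used above, and the IndexCopier class and CopyIndexes method are hypothetical names introduced for illustration. It only carries over the key, name, unique, and sparse settings; other options (TTL, partial filters, text indexes, collation) would need extra handling.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

public static class IndexCopier
{
    // Hypothetical helper: recreates the source collection's indexes on the copy.
    public static void CopyIndexes(IMongoDatabase db, string sourceName, string destName)
    {
        var source = db.GetCollection<BsonDocument>(sourceName);
        var dest = db.GetCollection<BsonDocument>(destName);

        // Each index is described by a document such as
        // { v: 2, key: { field: 1 }, name: "field_1", unique: true }.
        foreach (var spec in source.Indexes.List().ToList())
        {
            var name = spec["name"].AsString;

            // The _id index is created automatically on every collection.
            if (name == "_id_") continue;

            var keys = new BsonDocumentIndexKeysDefinition<BsonDocument>(spec["key"].AsBsonDocument);
            var options = new CreateIndexOptions
            {
                Name = name,
                Unique = spec.GetValue("unique", false).ToBoolean(),
                Sparse = spec.GetValue("sparse", false).ToBoolean()
            };

            dest.Indexes.CreateOne(new CreateIndexModel<BsonDocument>(keys, options));
        }
    }
}
```

Calling CopyIndexes(db, "test", "testcopy") after the documents have been copied would then rebuild the indexes on the duplicate.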

JohnnyHK
  • MongoVUE's duplicate collection also does not copy any indexes, so it's very likely that they do it this way too. – leojg Oct 21 '13 at 17:13
  • What do you mean by 'won't copy any indexes'? – wuyts Oct 22 '13 at 12:32
  • @wuyts Any indexes that have been created on the source collection won't exist on the copy. You'd have to recreate those on the copy either manually or using separate code. – JohnnyHK Oct 22 '13 at 12:49
5

I had the exact same problem, but while the accepted answer works, I also needed to make it as fast as possible.

The fastest way to copy a collection is apparently to use an aggregation with an $out pipeline stage. This way, you don't have to download all the documents and then re-upload them; they are copied entirely inside the database.

This is trivial to execute inside the mongo shell:

db.originalColl.aggregate([ { $match: {} }, { $out: "resultColl"} ]);

However, I had a lot of trouble running this from C#. Since eval has now been deprecated, you can't just put the above in a string to be executed on the server. Instead, you need to construct a BSON document that represents the above command.

Here's how I made it work:

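// Build the aggregate command as a BSON document: { aggregate, pipeline, cursor }.
var aggDoc = new Dictionary<string,object>
{
    {"aggregate" , "originalCollection"},
    {"pipeline", new []
        {
            new Dictionary<string, object> { { "$match" , new BsonDocument() }},
            new Dictionary<string, object> { { "$out" , "resultCollection"}}
        }
    },
    // MongoDB 3.6 and later reject an aggregate command without a cursor option.
    {"cursor", new BsonDocument()}
};

var doc = new BsonDocument(aggDoc);
var command = new BsonDocumentCommand<BsonDocument>(doc);
db.RunCommand(command);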

This turns out to be very fast (about 3 minutes to copy 5M documents), and no data is transferred between the database and the application running the above code. One drawback is that it writes to a temporary collection, so resultCollection will be empty (or nonexistent) until the operation completes. If you have a progress bar based on the size of resultCollection, it will no longer work.
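
The same $match + $out pipeline can probably also be expressed through the 2.x driver's fluent aggregation API instead of a hand-built command document. The following is a minimal sketch under that assumption, not necessarily what the code above does internally; the CollectionCopier class and CopyWithOut method are hypothetical names introduced for illustration.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

public static class CollectionCopier
{
    // Hypothetical helper: copies all documents server-side via $match + $out.
    public static void CopyWithOut(IMongoDatabase db, string sourceName, string destName)
    {
        var source = db.GetCollection<BsonDocument>(sourceName);

        // An empty filter matches every document; Out runs the pipeline and
        // writes the result to destName, replacing it if it already exists.
        source.Aggregate()
              .Match(new BsonDocument())
              .Out(destName);
    }
}
```

For the collections used above, the call would be CollectionCopier.CopyWithOut(db, "originalCollection", "resultCollection"). As with the command version, indexes on the source are not carried over to the result.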

DukeOf1Cat
  • I can add that I was able to construct a progress bar by looking for newly created collections, finding the temp collection that mongodb creates, and monitoring the size of that. However, if you have many of these operations going on in parallel, there is (to my knowledge) no way of seeing which temporary collection corresponds to this particular aggregation. I "solved" this with locking around the part that creates and finds the temp collection. It's sufficiently fast, since the actual aggregations are run concurrently. – DukeOf1Cat Nov 27 '18 at 13:13