73

I've been playing with MongoDB recently (it's AMAZINGLY FAST) using the C# driver on GitHub. Everything is working just fine in my little single-threaded console app that I'm testing with. I'm able to add 1,000,000 documents (yes, a million) in under 8 seconds running single-threaded. I only get this performance if I open the connection outside the scope of the for loop. In other words, I'm keeping one connection open across all inserts rather than connecting for each insert. Obviously that's contrived.

I thought I'd crank it up a notch to see how it works with multiple threads. I'm doing this because I need to simulate a website with multiple concurrent requests. I'm spinning up between 15 and 50 threads, still inserting a total of 150,000 documents in all cases. If I just let the threads run, each creating a new connection for each insert operation, the performance grinds to a halt.

Obviously I need to find a way to share, lock, or pool the connection. Therein lies the question. What's the best practice in terms of connecting to MongoDB? Should the connection be kept open for the life of the app (there is substantial latency opening and closing the TCP connection for each operation)?

Does anyone have any real world or production experience with MongoDB, and specifically the underlying connection?

Here is my threading sample using a static connection that's locked for insert operations. Please offer suggestions that would maximize performance and reliability in a web context!

private static Mongo _mongo;

private static void RunMongoThreaded()
{
    _mongo = new Mongo();
    _mongo.Connect();

    var threadFinishEvents = new List<EventWaitHandle>();

    for(var i = 0; i < 50; i++)
    {
        var threadFinish = new EventWaitHandle(false, EventResetMode.ManualReset);
        threadFinishEvents.Add(threadFinish);

        var thread = new Thread(delegate()
            {
                 RunMongoThread();
                 threadFinish.Set();
            });

        thread.Start();
    }

    WaitHandle.WaitAll(threadFinishEvents.ToArray());
    _mongo.Disconnect();
}

private static void RunMongoThread()
{
    for (var i = 0; i < 3000; i++)
    {
        var db = _mongo.getDB("Sample");
        var collection = db.GetCollection("Users");
        var user = GetUser(i);
        var document = new Document();
        document["FirstName"] = user.FirstName;
        document["LastName"] = user.LastName;

        lock (_mongo) // Lock the connection - not ideal for threading, but safe and seemingly fast
        {
            collection.Insert(document);
        }
    }
}
shA.t
Tyler Brinks
  • 3
    What did you decide on in the end? Facing the same issue... – Andrew Bullock Oct 12 '10 at 20:00
  • 4
    The good news is that I didn't have to decide. The mongodb-csharp and NoRM drivers both added support for connection pooling. Both libraries have well-designed, thread-safe mechanisms for pooling connections against a mongod or mongos process. Both are also adding replica set support in the near future. – Tyler Brinks Oct 13 '10 at 02:37
  • @TylerBrinks can you show an example of how you were able to insert 1m documents in under 8 sec? I'm unable to reach that speed on a single thread. – hackp0int Jul 09 '13 at 05:58

6 Answers

146

Most answers here are outdated and no longer applicable, as the .NET driver has matured and had numerous features added.

Looking at the documentation of the new 2.0 driver found here: http://mongodb.github.io/mongo-csharp-driver/2.0/reference/driver/connecting/

The .NET driver is now thread-safe and handles connection pooling. According to the documentation:

It is recommended to store a MongoClient instance in a global place, either as a static variable or in an IoC container with a singleton lifetime.
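
As a rough illustration of that recommendation, here is a minimal sketch of a process-wide MongoClient with the 2.x driver. It assumes the MongoDB.Driver NuGet package and a mongod listening on localhost:27017; the MongoConnection class name, the connection string, and the sample field values are my own placeholders, not something from the documentation.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

public static class MongoConnection
{
    // Assumption: a local mongod on the default port. Adjust for your setup.
    private const string ConnectionString = "mongodb://localhost:27017";

    // One MongoClient for the whole process. The driver manages the
    // connection pool behind it, so callers never open or close sockets.
    public static readonly MongoClient Client = new MongoClient(ConnectionString);

    // Database and collection handles are lightweight and safe to request per call.
    public static IMongoCollection<BsonDocument> Users =>
        Client.GetDatabase("Sample").GetCollection<BsonDocument>("Users");
}

public static class Program
{
    public static void Main()
    {
        // Placeholder data, for illustration only.
        var document = new BsonDocument
        {
            { "FirstName", "Ada" },
            { "LastName", "Lovelace" }
        };

        // Any thread can call this concurrently; no lock around the client is needed.
        MongoConnection.Users.InsertOne(document);
    }
}
```

With this shape, the lock around Insert in the question's sample is no longer necessary, because the client hands each operation a pooled connection.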

runxc1 Bret Ferrier
9

The thing to remember about a static connection is that it's shared among all your threads. What you want is one connection per thread.

Joel Coehoorn
  • You may have missed the part where I stated that one connection per thread is noticeably slow. I don't think that's the best answer for a high traffic website. – Tyler Brinks Feb 03 '10 at 18:05
  • 5
    For your sample, where you are grouping things, one per thread is the best you can do. A static, shared connection _will_ create deadlocks like you're seeing. Your alternative is to do connection pooling. That's something that the sql server provider has built-in but for mongo you'll have to build yourself, and it's not trivial to get right. – Joel Coehoorn Feb 03 '10 at 21:28
  • 1
    Looking at this again today, it's also possible that you have too many threads. Ideally, you want a shared, thread-safe queue for your work items, and only a handful of threads (the exact number varies depending on your system, but the biggest factor is the number of processor cores). Each thread pulls items from the queue. This would reduce the number of connections so they are no longer the bottleneck. – Joel Coehoorn Oct 27 '11 at 17:12
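
The queue idea in the last comment could be sketched roughly as follows. This is only an illustration using BlockingCollection from the .NET base class library; the WorkQueueSketch name is hypothetical, and the real insert is replaced by a counter so the sample runs without a database.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class WorkQueueSketch
{
    // Pushes itemCount work items through workerCount threads and
    // returns how many items were processed.
    public static int Run(int itemCount, int workerCount)
    {
        var processed = 0;

        using (var queue = new BlockingCollection<int>())
        {
            var workers = new Task[workerCount];
            for (var w = 0; w < workerCount; w++)
            {
                workers[w] = Task.Factory.StartNew(() =>
                {
                    // In a real app each worker would hold one connection here,
                    // so the connection count equals the worker count.
                    foreach (var item in queue.GetConsumingEnumerable())
                    {
                        // Placeholder for the actual insert of work item `item`.
                        Interlocked.Increment(ref processed);
                    }
                }, TaskCreationOptions.LongRunning);
            }

            // Producer: enqueue every work item, then signal that no more are coming.
            for (var i = 0; i < itemCount; i++)
            {
                queue.Add(i);
            }
            queue.CompleteAdding();

            // Workers exit their loops once the queue is drained.
            Task.WaitAll(workers);
        }

        return processed;
    }

    public static void Main()
    {
        // One worker per core, as the comment suggests; tune for your system.
        var workerCount = Environment.ProcessorCount;
        Console.WriteLine(Run(150000, workerCount));
    }
}
```

The point of the design is that the number of threads, and therefore connections, stays fixed no matter how many work items arrive.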
6

When using mongodb-csharp you treat it like you would an ADO connection. When you create a Mongo object it borrows a connection from the pool, which it owns until it is disposed. So after the using block the connection is returned to the pool. Creating Mongo objects is cheap and fast.

Example

for (var i = 0; i < 100; i++)
{
    using (var mongo1 = new Mongo())
    using (var mongo2 = new Mongo())
    {
        mongo1.Connect();
        mongo2.Connect();
    }
}

Database Log
Wed Jun 02 20:54:21 connection accepted from 127.0.0.1:58214 #1
Wed Jun 02 20:54:21 connection accepted from 127.0.0.1:58215 #2
Wed Jun 02 20:54:21 MessagingPort recv() errno:0 No error 127.0.0.1:58214
Wed Jun 02 20:54:21 end connection 127.0.0.1:58214
Wed Jun 02 20:54:21 MessagingPort recv() errno:0 No error 127.0.0.1:58215
Wed Jun 02 20:54:21 end connection 127.0.0.1:58215

Notice it only opened 2 connections.

I put this together using the mongodb-csharp forum: http://groups.google.com/group/mongodb-csharp/browse_thread/thread/867fa78d726b1d4

Donny V.
1

Somewhat tangential, but still of interest, is CSMongo, a C# driver for MongoDB created by the developer of jLinq. Here's a sample:

//create a database instance
using (MongoDatabase database = new MongoDatabase(connectionString)) {

    //create a new document to add
    MongoDocument document = new MongoDocument(new {
        name = "Hugo",
        age = 30,
        admin = false
    });

    //create entire objects with anonymous types
    document += new {
        admin = true,
        website = "http://www.hugoware.net",
        settings = new {
            color = "orange",
            highlight = "yellow",
            background = "abstract.jpg"
        }
    };

    //remove fields entirely
    document -= "languages";
    document -= new[] { "website", "settings.highlight" };

    //or even attach other documents
    MongoDocument stuff = new MongoDocument(new {
        computers = new [] { 
            "Dell XPS", 
            "Sony VAIO", 
            "Macbook Pro" 
            }
        });
    document += stuff;

    //insert the document immediately
    database.Insert("users", document);

}
David Robbins
0

Connection pooling should be your answer.

The feature is being developed (please see http://jira.mongodb.org/browse/CSHARP-9 for more detail).

Right now, for a web application, the best practice is to connect at BeginRequest and release the connection at EndRequest. But I think that is too expensive to do on every request without a connection pool. So I decided to keep a global Mongo object and use it as a shared resource for all threads. (If you get the latest C# driver from GitHub right now, it also improves concurrent performance a bit.)

I don't know the disadvantages of using a global Mongo object, so let's wait for another expert to comment on this.

But I think I can live with it until the connection pool feature has been completed.

ensecoz
  • Do you do the same with connections to SQL Server/MySQL? I think the best practice with connection pooling is still "open late, close early", and it costs almost nothing to open/close a connection many times during a request. – Tien Do Dec 22 '11 at 04:19
0

I am using the csharp-mongodb driver and its connection pool doesn't help me :( I make about 10-20 requests to MongoDB per web request (150 users online on average), and I can't even monitor statistics or connect to MongoDB from the shell; it throws an exception.

I have created a repository which opens and disposes a connection per request. I rely on two things: 1) the driver has a connection pool; 2) after my research (I posted some questions in the user groups about this), I understood that creating a Mongo object and opening a connection is not a heavy operation.

But today my production went down :( Maybe I have to keep one open connection per request...

Here is the link to the user group thread: http://groups.google.com/group/mongodb-user/browse_thread/thread/3d4a4e6c5eb48be3#

Antony Blazer
  • If you Dispose your connections you are actually fighting with the connection pool -- the pool cannot recycle a disposed connection, and must go through the overhead of establishing a brand new connection for every request. Just use your connection, and close it when you're done with it. – Curt Apr 07 '21 at 01:48