Why does MongoDB perform better with multi-threaded client compared to a single threaded client?

Question

We recently benchmarked Oracle 10g and MongoDB with YCSB ( https://github.com/brianfrankcooper/YCSB/wiki ), when we tried to increase the number of threads for 1,000,000 datasets, Oracle's performance remained constant after 4 threads however MongoDB kept on performing better and better till 8 threads and after that only reads were better, writes and updates (operations/sec) remained constant.

We ran this benchmark on 2 CPU Xeon quad core (total 8 cores) + 8 GB RAM on LAN.

What we observed was that MongoDB performed better with multi-threaded client comparing to single-threaded client, my question is: when MongoDB can perform better with more load why can't it do the same with less load (say just a couple of threads) by utilizing the multiple cores?

score 8 · Accepted Answer · answered Jul 01 '11 at 06:06

8

It is logically very simple to process a request on a single core. Just have code that receives the request, and deals with it.

It is not nearly so simple to process a single request on 2 cores, because doing so requires you to break up the request into components, farm out the work, synchronize the answers, and then build up a single response. And if you do this work, while you can reduce wallclock time (how much time the clock on the wall sees pass), you're invariably going to make the request take more CPU time (total CPU resources consumed).

In a system like MongoDB where you expect to have a lot of different clients making requests, there is no need to try to parallelize the handling of a single request, and every reason not to.

The bigger question is why Oracle didn't increase concurrency after 4 CPUs. There are any number of possible reasons, but one reasonable guess is that you encountered some sort of locking which is needed to guarantee consistency. (MongoDB does not offer you consistency, and so avoids this type of bottleneck.)

answered Jul 01 '11 at 06:06

btilly

43,296
3
59
88

Ya, that's what I suspected about Oracle, a lock to maintain the consistency. – vikasing Jul 01 '11 at 09:55
Was thinking from the perspective of optimizing the multiple cores, for example when there are only 2 parallel clients mongoDB should be able to optimize all the 8 cores on 2 cpus to give better results. In present case (2 parallel clients) other 6 cores will sit idle and other 2 cores will not give me the performance which 8 cores together would have given me, its like under utilization of CPUs at low loads. – vikasing Jul 01 '11 at 10:10
Oracle is MVCC, it doesn't lock. – TTT Jul 01 '11 at 10:59
2

@TTT: You are dead wrong if you think that Oracle doesn't lock. Oracle has gone to great lengths to have many fine-grained short-lived locks to maximize concurrency. But under the hood it is absolutely using locking to guarantee consistency, and I have personally seen Oracle databases fall over under load without warning many times due to obscure locking issues. (And when you resolve one locking issue, you'll encounter another.) – btilly Jul 01 '11 at 14:49
@vikasing: What you are saying is theoretically possible. But doing it would require a lot of effort, the returns would not be that great, and it would make the software worse in the use case that they are trying to design for. The current design performs acceptably in the low load case, and scales under load. – btilly Jul 01 '11 at 14:51
@btilly, It does locking on memory structures (the shared cache of of previously used sql statements + execution plans) but it does not lock data to maintain consistency. Reads don't block writes. – TTT Jul 01 '11 at 16:20
You can shard you mongodb database into different databases on the same machine. For each core one shard. This way the write locks hurt less. Here you can read how to get the max out of MongoDB (non-trivial stuff): http://nosql.mypopescu.com/post/6314585322/optimizing-mongodb-lessons-learned-at-localytics – TTT Jul 01 '11 at 16:26
1

@TTT: Only partially true. In Oracle, under load, accesses to various data structures require transient locks during index accesses of tables. I learned this after an experienced DBA diagnosed and fixed a problem by moving more data into the index so that the query didn't have to access the table at all, which shortened that lock. – btilly Jul 01 '11 at 16:58

TTT · Answer 2 · 2011-07-01T11:04:28.790

3

Oracle doesn't lock data for consistency but it does write data to redo and undo files for transactions and read consistency. Oracle is a MVCC system. See http://en.wikipedia.org/wiki/Multiversion_concurrency_control .

You have to use parameterized queries to make Oracle fast, else Oracle will spend too much time parsing queries. This is especially important when a lot of small queries run simultaneously, the situation you are testing.

MongoDB does lock on writes.

edit 1:

Another big difference between Oracle and MongoDB is durability. MongoDB doesn't offer durability if you use the default configuration. It writes once every minute data to the disk. Oracle writes to disk with every commit. So Oracle does a lot more fsyncing.

edited Jul 01 '11 at 11:04

answered Jul 01 '11 at 10:58

TTT

2,365
17
16

this was not the case with YCSB as its client code uses safe writes in default configuration. – Asya Kamsky Mar 11 '13 at 23:38
At least with driver versions going back to the fall of 2012 (1.5 years ago), the default behavior of mongoDB is to write the journal before returning control to the writing client, almost identical to what Oracle does. – Buzz Moschetti Jul 03 '14 at 02:15

Why does MongoDB perform better with multi-threaded client compared to a single threaded client?

2 Answers2

Linked