I am involved in a project where the plan is to get enough RAM to store the entire database in memory. According to the manager, that is what 10gen recommended. This seems counterintuitive. Is that really the way you want to use MongoDB?
-
If you can, that's recommended for all databases. – cirrus Feb 04 '14 at 02:49
-
To be clear, the recommendation is to have enough RAM to store your [*working set*](http://docs.mongodb.org/manual/faq/storage/#what-is-the-working-set) of data & indexes in RAM. Typically your working set is smaller than the total size of your database. Disk I/O is slow compared to reading from memory, so you want to minimise this. For a nice visual see this blog post: [––thursday #4: blockdev](http://www.kchodorow.com/blog/2012/04/05/thursday-4-blockdev/). Note the small red lines at the top showing the relative access time from cache (ns) or RAM (ns), and compare with that for a HD (ms). – Stennie Feb 04 '14 at 03:36
-
Yes it is. MongoDB always tries to fit the whole database into memory to enhance performance. Only when there's not enough memory does MongoDB begin to swap with disk. And "page faults" is one of the key metrics for measuring whether you have enough memory. So, by default it's already working this way; you don't need to worry about design because it's transparent to the application. Just keep in mind that enough memory makes it run faster. – yaoxing Feb 04 '14 at 04:12
-
@yaoxing no, it will only try to fit the working set into RAM, not the entire database; that is a common misconception – Sammaye Feb 04 '14 at 08:10
2 Answers
It is not counterintuitive... I find it quite intuitive, actually.
In "How much faster is the memory usually than the disk?" you can read:
> (...) memory is only about 6 times faster when you're doing sequential access (350 Mvalues/sec for memory compared with 58 Mvalues/sec for disk); but it's about 100,000 times faster when you're doing random access.
So if you can fit all your data in RAM, that is quite good, because reads are going to be really fast.
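To put the quoted sequential figures in perspective, here is a minimal sketch of the arithmetic. The two throughput constants come from the quote above; the one-billion-value scan size is a made-up number for illustration only, and the quote gives no absolute figures for random access, so that case is left out.

```python
# Sequential throughput figures from the quote above,
# in millions of values per second.
MEMORY_SEQ_MVALUES_PER_SEC = 350
DISK_SEQ_MVALUES_PER_SEC = 58


def scan_seconds(num_values, mvalues_per_sec):
    """Seconds needed to read num_values sequentially at the given throughput."""
    return num_values / (mvalues_per_sec * 1_000_000)


def sequential_speedup():
    """How many times faster memory is than disk for sequential access."""
    return MEMORY_SEQ_MVALUES_PER_SEC / DISK_SEQ_MVALUES_PER_SEC


if __name__ == "__main__":
    n = 1_000_000_000  # hypothetical scan of one billion values
    print(f"memory:  {scan_seconds(n, MEMORY_SEQ_MVALUES_PER_SEC):.1f} s")
    print(f"disk:    {scan_seconds(n, DISK_SEQ_MVALUES_PER_SEC):.1f} s")
    print(f"speedup: {sequential_speedup():.1f}x")
```

The sequential gap is modest; it is the random-access gap in the quote that makes keeping frequently read data in RAM worthwhile.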
Regarding MongoDB, from the FAQ:
> It’s certainly possible to run MongoDB on a machine with a small amount of free RAM.
>
> MongoDB automatically uses all free memory on the machine as its cache. System resource monitors show that MongoDB uses a lot of memory, but its usage is dynamic. If another process suddenly needs half the server’s RAM, MongoDB will yield cached memory to the other process.
>
> Technically, the operating system’s virtual memory subsystem manages MongoDB’s memory. This means that MongoDB will use as much free memory as it can, swapping to disk as needed. Deployments with enough memory to fit the application’s working data set in RAM will achieve the best performance.
The problem is that you usually have much more data than memory available. Then you have to go to disk, and disk I/O is slow. For database performance, avoiding full-scan queries is key (even more so when the data has to be read from disk). Therefore, if your data set does not fit in memory, you should aim at having indexes for the vast majority of your access patterns and try to fit those indexes in memory:
> If you have created indexes for your queries and your working data set fits in RAM, MongoDB serves all queries from memory.
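As a rough way to reason about this, here is a sketch that estimates whether indexes plus the frequently accessed part of the data would fit in RAM. It assumes a dictionary shaped like the output of the `dbStats` command (the `dataSize` and `indexSize` fields, in bytes). The function names, the `hot_data_fraction` guess, and the `headroom` factor are my own assumptions for illustration, not anything MongoDB provides; the real working set depends on your queries.

```python
GIB = 1024 ** 3


def estimate_working_set_bytes(stats, hot_data_fraction):
    """Rough working set: all indexes plus the 'hot' share of the data.

    hot_data_fraction is a guess you supply (hypothetical parameter);
    MongoDB does not report it directly.
    """
    if not 0.0 <= hot_data_fraction <= 1.0:
        raise ValueError("hot_data_fraction must be between 0 and 1")
    return stats["indexSize"] + stats["dataSize"] * hot_data_fraction


def fits_in_ram(stats, ram_bytes, hot_data_fraction, headroom=0.8):
    """True if the estimate fits in the share of RAM assumed usable as cache.

    headroom (assumed 80% here) leaves room for the OS and other processes.
    """
    estimate = estimate_working_set_bytes(stats, hot_data_fraction)
    return estimate <= ram_bytes * headroom


# Made-up numbers: a 300 GiB database with 20 GiB of indexes.
sample_stats = {"dataSize": 300 * GIB, "indexSize": 20 * GIB}

if __name__ == "__main__":
    # Only ~10% of the data is hot: 20 + 30 = 50 GiB against 64 GiB of RAM.
    print(fits_in_ram(sample_stats, 64 * GIB, hot_data_fraction=0.1))
    # Whole database: 320 GiB will not fit in 64 GiB.
    print(fits_in_ram(sample_stats, 64 * GIB, hot_data_fraction=1.0))
```

With a live server you could feed it the result of pymongo's `db.command("dbstats")` instead of the sample dictionary.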
-
Fitting the whole database in RAM is counterintuitive: RAM can be quite expensive and databases quite large (I mean, what if your database is 300GB?). Instead you want to, as MongoDB does, fit the working set into RAM. – Sammaye Feb 04 '14 at 08:11
-
You guys are only arguing the obvious. Of course a child would know memory access is faster than a hard drive. However, memory is also many times more expensive than a hard drive. Imagine you have terabytes of data in any database, relational or NoSQL: loading the entire content into memory is just not what you would call a scalable solution. I certainly understand fitting the entire index, but not the database content. Fitting the working set may be OK, in the sense that memory is used like a database cache. In a reasonable database system, some cache misses will have to be tolerated. AFAIK, that's commonplace. – bhomass Feb 05 '14 at 06:45
It all depends on the size of your database. I am guessing that you told them your database was actually quite small; otherwise I cannot see how someone at 10gen would give such advice. Not even @Stennie gives such advice (he is at 10gen, by the way).
Even if your database is small, I don't see how the manager could recommend that. MongoDB does not do memory management of its own; as such, it does not "pin" data into pages the way memcached or other memory-based databases do.
This means that the paging of mongod's data can be quite unpredictable, i.e. you will spend more time trying to keep things in RAM than paging in data. This is why it is better to just make sure your working set fits and that it can be loaded quickly; such things depend on your hardware and queries.
@Stennie's comment pretty much sums up the stance you should be taking with MongoDB.
