
Disclaimer: This is probably a research question as I cannot find what I am looking for, and it is rather specific.

Problem: I have a custom search application that needs to read between 100K and 10M files that are between 0.01 MB and about 10.0 MB each. Each file contains one array that could be loaded directly via mmap. I am looking for a solution to prefetch files into RAM before they are needed and, if system memory is full, evict files that have already been processed.
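
For reference, a minimal sketch of the per-file access pattern I have in mind; the element type (float) and the file name are placeholders only:

```c
/* Minimal sketch: map one file and treat its contents as an array.
 * The element type (float) and the file path are placeholders. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *path = "data/000001.arr";            /* placeholder */
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the whole file read-only; pages are faulted in on demand. */
    float *arr = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (arr == MAP_FAILED) { perror("mmap"); return 1; }
    close(fd);                                       /* mapping stays valid */

    size_t n = st.st_size / sizeof(float);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)                   /* consume the array */
        sum += arr[i];
    printf("%zu elements, sum = %f\n", n, sum);

    munmap(arr, st.st_size);
    return 0;
}
```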

I know this sounds a lot like a combination of OS memory management and something like memcached. What I am actually looking for is something like memcached that doesn't return strings or values for a key, but rather the address of the start of a chosen array. In addition (this is a different topic), I would like the shared memory to be managed so that, on NUMA machines, the distance between the CPU core and the RAM it reads from is as short as possible.
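
On the NUMA side, explicit placement in the style of libnuma is the kind of control I mean; a minimal sketch (link with -lnuma; the buffer size and usage are purely illustrative):

```c
/* Sketch of NUMA-local placement with libnuma (compile with -lnuma).
 * Purely illustrative; not part of the existing application. */
#define _GNU_SOURCE
#include <numa.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this system\n");
        return 1;
    }

    int cpu  = sched_getcpu();               /* CPU this thread runs on */
    int node = numa_node_of_cpu(cpu);        /* its local memory node   */

    /* Allocate a buffer whose pages live on the local node, so the
     * core that consumes the data reads from nearby RAM. */
    size_t size = 1 << 20;                   /* 1 MB, arbitrary */
    void *buf = numa_alloc_onnode(size, node);
    if (buf == NULL) {
        fprintf(stderr, "numa_alloc_onnode failed\n");
        return 1;
    }

    memset(buf, 0, size);                    /* touch the pages */
    printf("cpu %d -> node %d, buffer at %p\n", cpu, node, buf);

    numa_free(buf, size);
    return 0;
}
```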

My question is: "does a tool/library like this already exist?"


3 Answers


Your question is related to [this one](http://stackoverflow.com/questions/8056984/speeding-up-file-i-o-mmap-vs-read).

I'm not sure you need to find a library. You just need to understand how to efficiently use system calls.

I believe the readahead system call could help you.
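
For example, a prefetch helper along these lines could ask the kernel to pull the next few files into the page cache before the consumer touches them (the file names are placeholders; error handling is minimal). Note that readahead() blocks until the data is in the page cache, so in practice you would call it ahead of time or from another thread (see the comments below).

```c
/* Sketch: pull a whole file into the page cache with readahead(2)
 * (Linux-specific) before the consumer maps or reads it. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Prefetch one whole file; returns 0 on success. */
static int prefetch_file(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror(path); return -1; }

    struct stat st;
    if (fstat(fd, &st) == 0)
        readahead(fd, 0, st.st_size);   /* populate the page cache */

    close(fd);
    return 0;
}

int main(void)
{
    /* Placeholder list of the files the consumer will need next. */
    const char *next[] = { "data/000001.arr", "data/000002.arr" };
    for (unsigned i = 0; i < sizeof next / sizeof next[0]; i++)
        prefetch_file(next[i]);
    return 0;
}
```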

Basile Starynkevitch
  • Your comments are pointing me in the right direction. The primary difference between this question and [the other](http://stackoverflow.com/questions/8056984/speeding-up-file-i-o-mmap-vs-read) is that here the number of files is high but each file is relatively small, while in the other case the opposite is true. What I really need is non-blocking I/O (at least from the point of view of the consumer) and to keep the kernel from paging out files that have not yet been read by the consumer. Given enough memory, I would just keep all of the arrays in memory. – jeff.vanvoorst Nov 10 '11 at 14:18
  • If you can predict what files (or portion of them) will be needed in the next seconds, using `readahead` system call (perhaps in separate threads) should be helpful. – Basile Starynkevitch Nov 10 '11 at 16:21
  • I am a bit concerned that readahead is a blocking call. Also, the primary motivation for this question is looking ahead: even for one CPU core the I/O pattern appears to be important, and I would like to scale to many cores without starving them. Another (separate) issue is that most I/O profilers are buggered for single-threaded processes, and it is much worse for multithreaded ones. – jeff.vanvoorst Nov 10 '11 at 17:14
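
Following up on the comments above: since readahead() blocks until the requested pages are in the page cache, one way to keep the consumer itself non-blocking is to issue the calls from a dedicated prefetch thread, along these lines (the fixed file list stands in for a real work queue):

```c
/* Sketch: issue readahead(2) from a separate thread so the consumer
 * never waits on the prefetch itself.  Compile with -pthread; the fixed
 * file list stands in for a real work queue. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <sys/stat.h>
#include <unistd.h>

static const char *files[] = { "data/000001.arr", "data/000002.arr",
                               "data/000003.arr" };          /* placeholders */
static const unsigned nfiles = sizeof files / sizeof files[0];

/* Prefetch thread: walk the work list and pull each file into the page
 * cache; readahead() blocking here does not block the consumer. */
static void *prefetcher(void *arg)
{
    (void)arg;
    for (unsigned i = 0; i < nfiles; i++) {
        int fd = open(files[i], O_RDONLY);
        if (fd < 0)
            continue;
        struct stat st;
        if (fstat(fd, &st) == 0)
            readahead(fd, 0, st.st_size);
        close(fd);
    }
    return NULL;
}

int main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, prefetcher, NULL);

    /* ... the consumer mmap()s and processes files[i] here; files the
     * prefetcher has already touched come straight from the page cache ... */

    pthread_join(tid, NULL);
    return 0;
}
```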

Indeed you have many, many files (and perhaps too many of them). I hope that your filesystem is good enough, or that they are spread across many directories. Having millions of files may become a concern if the filesystem is not tuned appropriately (but I won't dare give advice on that).

I don't know whether it is your application that writes and reads that many files. Perhaps you might consider switching to a fast DBMS like PostgreSQL or MySQL, or perhaps you could use GDBM.
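
If GDBM fits, the access pattern would look roughly like this (the key format and float element type are only illustrative guesses). Note that gdbm_fetch() hands back a malloc()ed copy of the record, which the caller must free():

```c
/* Sketch: store each file's array as one GDBM record keyed by an id.
 * Compile with -lgdbm; key format and float element type are guesses. */
#include <gdbm.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    GDBM_FILE db = gdbm_open("arrays.gdbm", 0, GDBM_WRCREAT, 0644, NULL);
    if (db == NULL) { fprintf(stderr, "gdbm_open failed\n"); return 1; }

    /* Store one array under the key "000001". */
    float values[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    char id[] = "000001";
    datum key     = { id, (int)(sizeof id - 1) };
    datum content = { (char *)values, (int)sizeof values };
    gdbm_store(db, key, content, GDBM_REPLACE);

    /* Fetch it back; gdbm_fetch() returns a malloc()ed copy of the record. */
    datum out = gdbm_fetch(db, key);
    if (out.dptr != NULL) {
        float *arr = (float *)out.dptr;
        printf("first element: %f\n", arr[0]);
        free(out.dptr);
    }

    gdbm_close(db);
    return 0;
}
```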

Basile Starynkevitch
  • We have tried using MySQL in the past, but it takes approximately ten times longer than reading the files from a Linux filesystem. Also, the application is more like the search engine described below. The read speed is what counts; the writes are done periodically, don't overwrite any previous data, and may be slow. – jeff.vanvoorst Nov 18 '11 at 16:17
  • GDBM might also be useful. I believe it can be faster than MySQL, and probably even faster than a file system with millions of files. – Basile Starynkevitch Nov 18 '11 at 16:57

I once did this for a search-engine kind of application. It used an LRU chain that was also addressable (via a hash table) by file-id and, IIRC, by memory address. On every access, the hot items were repositioned to the head of the LRU chain. When memory got tight (mmap can fail ...) the tail of the LRU chain was unmapped.

The pitfall of this scheme is that the program can get blocked on page faults. And since it was single-threaded, it was really blocked. Altering this to a multithreaded architecture would involve protecting the hash and LRU structures with locks and semaphores.
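
A rough single-threaded sketch of that kind of structure (fixed-size hash, no locking; names and sizes are invented for illustration):

```c
/* Sketch of the scheme described above: a hash table keyed by file-id
 * points into a doubly linked LRU chain of mmap()ed files; when mmap()
 * fails, mappings are dropped from the cold tail.  Fixed-size hash,
 * no locking (single-threaded only), invented names. */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define NBUCKET 4096

struct entry {
    unsigned id;                   /* file-id */
    void    *addr;                 /* start address of the mapped array */
    size_t   len;
    struct entry *hnext;           /* hash-chain link */
    struct entry *prev, *next;     /* LRU chain (head = most recent) */
};

static struct entry *bucket[NBUCKET];
static struct entry *head, *tail;

static void lru_unlink(struct entry *e)
{
    if (e->prev) e->prev->next = e->next; else head = e->next;
    if (e->next) e->next->prev = e->prev; else tail = e->prev;
}

static void lru_push_front(struct entry *e)
{
    e->prev = NULL; e->next = head;
    if (head) head->prev = e; else tail = e;
    head = e;
}

/* Drop the least recently used mapping. */
static void evict_tail(void)
{
    struct entry *e = tail;
    if (!e) return;
    lru_unlink(e);
    struct entry **p = &bucket[e->id % NBUCKET];   /* remove from hash */
    while (*p != e) p = &(*p)->hnext;
    *p = e->hnext;
    munmap(e->addr, e->len);
    free(e);
}

/* Return the start address of the array for this file-id, mapping it on
 * demand and evicting cold mappings when mmap() reports memory pressure. */
void *get_array(unsigned id, const char *path, size_t *len_out)
{
    for (struct entry *e = bucket[id % NBUCKET]; e; e = e->hnext)
        if (e->id == id) {                 /* hit: move to the LRU head */
            lru_unlink(e);
            lru_push_front(e);
            *len_out = e->len;
            return e->addr;
        }

    int fd = open(path, O_RDONLY);         /* miss: map the file */
    if (fd < 0) return NULL;
    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return NULL; }

    void *addr = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    while (addr == MAP_FAILED && tail) {   /* memory tight: shed cold files */
        evict_tail();
        addr = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    }
    close(fd);
    if (addr == MAP_FAILED) return NULL;

    struct entry *e = calloc(1, sizeof *e);
    if (!e) { munmap(addr, st.st_size); return NULL; }
    e->id = id; e->addr = addr; e->len = st.st_size;
    e->hnext = bucket[id % NBUCKET];
    bucket[id % NBUCKET] = e;
    lru_push_front(e);
    *len_out = e->len;
    return addr;
}
```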

After that, I realised that I was doing double buffering: the OS itself has a perfect LRU disk-buffer mechanism, which is probably smarter than mine. Just open()ing or mmap()ing every single file on every request is only one system call away, and (given recent activity) just as fast as, or even faster than, the buffering layer.

wrt DBMS: using a DBMS is a clean design, but you have the overhead of at least 3 system calls just to get the first block of data. And it will certainly (always) block. But it lends itself reasonably well to a multi-threaded design, and relieves you of the pain of locks and buffer management.

wildplasser
  • I've been thinking along these lines. We may just buy a dedicated machine with enough RAM to hold the entire dataset. In my hands, there was no difference between using C++ fstreams, read(), or mmap. As long as the OS perceives that there is sufficient file buffer space, the reads are quite fast. – jeff.vanvoorst Nov 18 '11 at 16:14