I am writing a program to read and write a file at the same time. More specifically, all write operations are appending new data to the end of the file and all read operations are reading random positions of the file.

I am thinking of creating a memory-mapped file (using mmap) to get efficient reads while writing via append (mode a in open). However, I don't think this will work because the memory-mapped file cannot change in size*, unless I munmap and then mmap it again.

While "munmap and then mmap the file again" works, it has many downsides. Not only do I need to perform 2 syscalls after every write (or before every read), which hurts performance, but the base address returned by the next mmap call could also differ from the previous one. Since I plan to have other in-memory data structures storing pointers to specific offsets in this memory-mapped file, that would be very inconvenient.

Are there more elegant and efficient ways of doing this? The program will be mostly running on Linux (but solutions with portability to other POSIX systems are preferred). I have read through the following posts, but none of them seems to give a definitive answer.

How to portably extend a file accessed using mmap()

Can the OS automatically grow an mmap backed file?

Fast resize of a mmap file

My intuition is to use mmap to "reserve" a region large enough to accommodate the growth of the file, say a few hundred GiB (a very reasonable upper bound in my use case), and then somehow reflect changes in file size in this mapped memory without invalidating it with munmap. However, I am aware that accessing data beyond the real end of the file could result in a bus error, and the documentation isn't clear about whether changes in file size will be reflected in an existing mapping.

*I am not 100% sure about this, but I couldn't find any source describing an elegant way to change the size of a memory-mapped file.

lewisxy
  • There is an `mremap()` to remap with a new size, but the function is Linux-specific. The `_GNU_SOURCE` feature test macro needs to be defined. See [mremap(2)](https://man7.org/linux/man-pages/man2/mremap.2.html). – Ian Abbott Oct 29 '22 at 07:54

2 Answers

1

the memory-mapped file cannot change in size

Yes it can. Just use ftruncate to grow the file.

It's hard to change the size of the mapping, but that's separate, and you can have multiple partial mappings. So the trick is to map the file in discrete fixed-size segments.

It's generally preferable not to require the whole file to be mapped all the time, because it limits you to files that fit in memory. But, if you want to keep random pointers into the file, then keeping an LRU cache of segments is probably not possible.
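The fixed-size segment idea above could look roughly like the following. This is only a sketch under stated assumptions: `SEG_SIZE`, `MAX_SEGS`, `struct seg_table`, `seg_ptr` and `seg_demo` are hypothetical names introduced here, segments are mapped on first use and never unmapped (so pointers stay valid, at the cost of address space), and only reads are handled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical parameters: 64 MiB segments, at most 1024 of them. */
#define SEG_SIZE ((size_t)64 << 20)
#define MAX_SEGS 1024

struct seg_table {
    int fd;
    void *seg[MAX_SEGS]; /* NULL until the segment is mapped */
};

/* Return a pointer to file offset `off`, mapping its segment on first use.
 * Returns NULL on failure. The caller must not read past the end of the
 * file: pages wholly beyond EOF raise SIGBUS when touched. */
static char *seg_ptr(struct seg_table *t, off_t off)
{
    size_t i = (size_t)off / SEG_SIZE;
    if (off < 0 || i >= MAX_SEGS)
        return NULL;
    if (t->seg[i] == NULL) {
        void *p = mmap(NULL, SEG_SIZE, PROT_READ, MAP_SHARED,
                       t->fd, (off_t)(i * SEG_SIZE));
        if (p == MAP_FAILED)
            return NULL;
        t->seg[i] = p;
    }
    return (char *)t->seg[i] + (size_t)off % SEG_SIZE;
}

/* Small self-check on a temporary file; returns 0 on success. */
static int seg_demo(void)
{
    char path[] = "/tmp/seg_demoXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (write(fd, "hello", 5) != 5)
        return -1;

    struct seg_table t = { .fd = fd }; /* all seg[] entries start as NULL */
    const char *p = seg_ptr(&t, 1);
    assert(p != NULL);
    int ok = (*p == 'e') && (seg_ptr(&t, 4) == p + 3);

    munmap(t.seg[0], SEG_SIZE);
    close(fd);
    return ok ? 0 : -1;
}
```

Calling `seg_demo()` from `main` exercises the lookup on a 5-byte file. A pointer into one segment is only valid up to that segment's end, so a record that straddles a segment boundary would need extra handling.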

Useless
  • I guess I wasn't super clear in the original question. There is no issue for me to grow the size of the file. The issue is to maintain the memory mapping up to date while growing the size of the file. – lewisxy Oct 29 '22 at 19:11
  • "It's generally preferable not to require the whole file to be mapped all the time, because it limits you to files that fit in memory." Is that true? I doubt mapping a large file actually takes a large amount of physical memory. It should only take virtual memory, which is almost free. – lewisxy Oct 29 '22 at 19:14
  • In practice the page cache will do implicitly exactly the same thing I'm suggesting you do explicitly, so long as the file fits in your address space, but: that still doesn't help you with growing the mapping. And if you're really randomly accessing the whole file you'll really thrash that page cache. And if you're not randomly accessing the whole file, you don't need to keep pointers into it in the first place. – Useless Oct 30 '22 at 12:50
  • I did find a way to achieve what I want (see my answer), although I could not find explicit documentation for some of the behaviors this approach relies on (it works fine on recent versions of the Linux kernel and on macOS, though) – lewisxy Nov 01 '22 at 23:26
  • _It's hard to change the size of the mapping,_ - on Linux you can `mmap` over an existing mapping, with full or partial overlap. – Maxim Egorushkin Nov 01 '22 at 23:27
0

After some experimentation, I found a way to make it work.

First, mmap the file with PROT_NONE and a large enough size. On 64-bit systems, it can be as large as 1L << 46 (64 TiB). This does NOT consume physical memory* (at least on Linux); it only consumes address space (virtual memory) of the process.

    void* ptr = mmap(NULL, (1L << 40), PROT_NONE, MAP_SHARED, fd, 0);

Then, use mprotect to give read (and/or write) permission to the part of the mapping that lies within the file length. Note that the size should be rounded up to a multiple of the page size (which can be obtained with sysconf(_SC_PAGESIZE), usually 4096).

    mprotect(ptr, aligned_size, PROT_READ | PROT_WRITE);
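As a minimal sketch of how `aligned_size` might be computed, the file length can be rounded up to the next page boundary. The helper name `round_up_to_page` is hypothetical; it assumes the length plus one page does not overflow `size_t`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Round n up to the next multiple of the system page size.
 * Assumes n + page size does not overflow size_t. */
static size_t round_up_to_page(size_t n)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0); /* sysconf returns -1 on error */
    size_t p = (size_t)page;
    return (n + p - 1) / p * p;
}
```

With a 4096-byte page, a 5000-byte file would give an `aligned_size` of 8192.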

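However, accessing a page of the mapping that lies entirely beyond the end of the file triggers a bus error (SIGBUS), even if it has PROT_READ permission, as documented in the mmap manual. If the file size is not a multiple of the page size, the remaining bytes of the last partial page read as zeros, and modifications to them are not written out to the file.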

Then you can use either the file descriptor fd or the mapped memory to read and write the file. Remember to use fsync or msync to persist the data after writing to it. A mapped page with PROT_READ permission should show the latest file content after you write to the file**. Pages newly made accessible with mprotect will also show the latest content.
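Putting the steps above together, an end-to-end sketch might look like this. Everything named here (`RESERVE_SIZE`, `struct growable_map`, `gm_open`, `gm_append`, `gm_demo`) is hypothetical, and the code relies on the same undocumented assumption as the rest of this answer: that data written through the file descriptor becomes visible through an existing MAP_SHARED mapping.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define RESERVE_SIZE ((size_t)1 << 30) /* hypothetical: reserve 1 GiB */

struct growable_map {
    int fd;
    char *base;      /* stays the same for the life of the mapping */
    size_t exposed;  /* readable bytes, always a page multiple */
    size_t file_len; /* bytes appended so far */
};

/* Reserve address space over the (initially empty) file. */
static int gm_open(struct growable_map *m, int fd)
{
    void *p = mmap(NULL, RESERVE_SIZE, PROT_NONE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        return -1;
    m->fd = fd;
    m->base = p;
    m->exposed = 0;
    m->file_len = 0;
    return 0;
}

/* Append via the descriptor, then expose any newly covered pages. */
static int gm_append(struct growable_map *m, const void *buf, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    if (len > RESERVE_SIZE - m->file_len)
        return -1; /* reservation exhausted */
    if (pwrite(m->fd, buf, len, (off_t)m->file_len) != (ssize_t)len)
        return -1; /* short writes are treated as errors in this sketch */
    m->file_len += len;

    size_t want = (m->file_len + page - 1) / page * page;
    if (want > m->exposed) {
        if (mprotect(m->base + m->exposed, want - m->exposed, PROT_READ) != 0)
            return -1;
        m->exposed = want;
    }
    return 0;
}

/* Self-check on a temporary file; returns 0 on success. */
static int gm_demo(void)
{
    char path[] = "/tmp/gm_demoXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    struct growable_map m;
    if (gm_open(&m, fd) != 0 || gm_append(&m, "hello ", 6) != 0)
        return -1;
    const char *first = m.base; /* pointer taken before the file grows */

    size_t big_len = 2 * (size_t)sysconf(_SC_PAGESIZE);
    char *big = malloc(big_len);
    if (big == NULL)
        return -1;
    memset(big, 'x', big_len);
    int rc = gm_append(&m, big, big_len); /* crosses page boundaries */
    free(big);
    if (rc != 0)
        return -1;

    assert(memcmp(first, "hello ", 6) == 0); /* old pointer still valid */
    int ok = m.base[m.file_len - 1] == 'x';  /* new data is visible */

    munmap(m.base, RESERVE_SIZE);
    close(fd);
    return ok ? 0 : -1;
}
```

Calling `gm_demo()` from `main` runs the whole sequence. Only `PROT_READ` is granted because writes go through the descriptor; durability would still need fsync, which is left out here.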

Depending on the application, you might want to use ftruncate to make the file size aligned to system page size for the best performance. You might also want to use madvise with MADV_SEQUENTIAL to improve performance when reading those pages.

*This behavior is not mentioned in the mmap manual. However, since PROT_NONE implies those pages are not accessible in any way, it is trivial for an OS implementation not to allocate any physical memory for them at all.

**This behavior, where a memory region mapped before a file write shows the new data once the write is completed (fsync or msync), is also not mentioned in the manual (or at least I did not see it). But it seems to be the case at least on recent Linux kernels (4.x onward).

lewisxy