mmap has many uses, but I primarily use it in two ways:
- mmap to read an entire file
- mmap to create shared memory.
In the case of shared memory, if I want it backed by a file, I use ftruncate() to make sure the size of the file exactly matches the size of the region I wish to mmap.
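For concreteness, this is roughly the pattern I mean. It is a minimal sketch rather than production code: the helper name map_shared_file and its error handling are my own assumptions for illustration.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or open) `path`, size it to exactly `len`
 * bytes with ftruncate(), and map it MAP_SHARED. Returns NULL on failure. */
void *map_shared_file(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1)
        return NULL;

    /* Size the file first so every page of the mapping is backed by it. */
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}
```

A caller would then write through the returned pointer and release it with munmap(p, len) when done.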
I'm curious about what happens if the file size does not match. The man page does not give sufficient detail here.
I think mmap has to handle this when reading a file, because the length of a file does not have to be a multiple of the page size and you might only want to read a portion of the file.
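The read case can be checked with a small experiment. As far as I understand POSIX, the partial page at the end of the file is zero-filled, so the bytes between the end of the file and the end of the last page should read as zero. The sketch below tests that; the helper name count_nonzero_tail is an assumption of mine, not an existing API.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: write `n` bytes of 'x' to `path` (n must be > 0 and
 * smaller than one page), map the first page read-only, and return the
 * number of nonzero bytes found between the end of the file and the end of
 * that page. Returns -1 on error. */
long count_nonzero_tail(const char *path, size_t n)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || n == 0 || n >= (size_t)page)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;
    for (size_t i = 0; i < n; i++) {
        if (write(fd, "x", 1) != 1) {
            close(fd);
            return -1;
        }
    }

    /* The file is shorter than the mapping, but still within its last page. */
    unsigned char *p = mmap(NULL, (size_t)page, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    long nonzero = 0;
    for (size_t i = n; i < (size_t)page; i++)
        if (p[i] != 0)
            nonzero++;

    munmap(p, (size_t)page);
    return nonzero;
}
```

If the tail really is zero-filled, count_nonzero_tail(path, 100) should return 0.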
What about in the case of writing it?
Can the OS create the file backed storage lazily?
For example, if I mmap 1 GB onto an empty file, will the OS extend the file as necessary when I attempt to write to it?
My intuition (which is likely wrong) is that a mapping should behave consistently with file read() and write() calls, so a read beyond the end of a file will fail but a write beyond the end of a file will extend it as required.
In fact, I would also expect the page to be marked as dirty and only actually written to the physical file when msync() or munmap() is called, so the file would only be extended at that point.
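To make the read()/write() behaviour I am comparing against concrete, here is a small sketch. The helper name size_after_pwrite is hypothetical. Strictly speaking, a read() at end of file does not fail; it returns 0. A pwrite() past the end does extend the file.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: on a fresh empty file, read() at end of file and then
 * pwrite() one byte at `offset`. Stores the read() result in *read_result
 * and returns the resulting file size, or -1 on error. */
off_t size_after_pwrite(const char *path, off_t offset, ssize_t *read_result)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    char c;
    *read_result = read(fd, &c, 1); /* at EOF this returns 0, not an error */

    /* Writing past the end extends the file; the gap reads back as zeros. */
    if (pwrite(fd, "x", 1, offset) != 1) {
        close(fd);
        return -1;
    }

    struct stat st;
    off_t size = (fstat(fd, &st) == 0) ? st.st_size : -1;
    close(fd);
    return size;
}
```

Writing one byte at offset 4096 should leave a file of 4097 bytes. This is the "extend on write" behaviour I was expecting a mapping to mirror.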
This question and answer, however, suggest that a bus error would occur.
Is that always the case or is it true for some combinations of flags only? What is going on under the hood here?
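One way to test this empirically is to do the store in a child process and see which signal, if any, kills it. The sketch below covers only one combination (MAP_SHARED, PROT_READ | PROT_WRITE, regular file), and the helper name signal_from_store_past_eof is my own. As I read POSIX, references to whole pages beyond the end of the object should deliver SIGBUS, so that is what I would expect it to report.

```c
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: in a child process, map one page of a freshly created
 * empty file MAP_SHARED and store to it without calling ftruncate().
 * Returns the signal that killed the child, 0 if the store succeeded,
 * or -1 on a setup error. */
int signal_from_store_past_eof(const char *path)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
        if (fd == -1)
            _exit(2);
        volatile char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            _exit(2);
        p[0] = 'x'; /* this page lies entirely beyond end of file */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}
```

Comparing the return value against SIGBUS shows whether the store faulted. Note that mmap() itself succeeds here, and only the access fails.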
This is not something I'm trying to do but a hypothetical use case might be a userspace implementation of swap which grows on demand. For example, if you were trying to save disk space.
Another way of posing the question is: Is it actually necessary to call ftruncate() before using mmap()?
I'm interested in Linux specifically but curious about POSIX correctness as well.