The page size is 4096 bytes. Assume that you want a buffer twice that size, that is, 8192 bytes.
If you use mmap, you will map 8192 bytes without doing anything else (no actual data is read from the disk at this point).
Then when you access the first byte, a page fault will occur and you will do one I/O to read the first page from the disk. After reading this page, you will get the first byte as an answer.
Then when you access the 4097-th byte, a new page fault will occur and you will do an extra I/O to read the second page from the disk to get this byte.
However, if you use read, you will only have to do one I/O to read 8192 bytes and then return the two bytes that you want.
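To make the comparison concrete, here is a minimal sketch in C of the two access patterns described above. The function names peek_with_mmap and peek_with_read are hypothetical, and the file is assumed to be at least 8192 bytes long. The comments describe the model assumed in this question (one fault per page touched); how many disk I/Os actually happen depends on the kernel's page cache and read-ahead, which this sketch does not measure.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define PAGE_SIZE 4096            /* page size assumed in the question */
#define BUF_SIZE  (2 * PAGE_SIZE) /* 8192 bytes, i.e. two pages */

/* Hypothetical helper: fetch byte 1 and byte 4097 through a mapping.
 * Returns 0 on success, -1 on error. */
int peek_with_mmap(const char *path, unsigned char out[2])
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return -1; }

    /* Creating the mapping does not by itself read file data. */
    unsigned char *p = mmap(NULL, BUF_SIZE, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) { perror("mmap"); return -1; }

    out[0] = p[0];         /* first touch of page 0: may page-fault */
    out[1] = p[PAGE_SIZE]; /* first touch of page 1: may page-fault again */

    munmap(p, BUF_SIZE);
    return 0;
}

/* Hypothetical helper: fetch the same two bytes by copying all 8192
 * bytes into a user buffer with read(). Returns 0 on success, -1 on error. */
int peek_with_read(const char *path, unsigned char out[2])
{
    assert(path != NULL && out != NULL);

    unsigned char buf[BUF_SIZE];
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return -1; }

    /* read() may return fewer bytes than requested, so loop until full. */
    size_t got = 0;
    while (got < BUF_SIZE) {
        ssize_t n = read(fd, buf + got, BUF_SIZE - got);
        if (n <= 0) { close(fd); return -1; } /* error or file too short */
        got += (size_t)n;
    }
    close(fd);

    out[0] = buf[0];
    out[1] = buf[PAGE_SIZE];
    return 0;
}
```

Both helpers return the same two bytes for the same file; the difference is only in when the data is brought into memory, at each first access to a page for mmap versus up front inside the read call.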
This is a very small example, but what if the buffer size is a few KB or MB? It looks like mmap with a page size of 4096 bytes will generate a lot of I/Os that can be avoided if you just use the POSIX read call instead, which makes me wonder: why use mmap in the first place?