
I'm looking to mmap a region of a file, make changes, and then either msync(MS_INVALIDATE) them so that my in memory changes are lost, or, should I like the changes, msync(MS_SYNC) to write them back to the underlying storage.

The msync manpage says the following on MS_INVALIDATE:

When MS_INVALIDATE is specified, msync() shall invalidate all cached copies of mapped data that are inconsistent with the permanent storage locations such that subsequent references shall obtain data that was consistent with the permanent storage locations sometime between the call to msync() and the first subsequent memory reference to the data.

This gives the impression that MS_INVALIDATE should revert the memory to the state held in storage. However, my changes seem to have already made their way to the device by the time I make the msync call.

  1. Can I delay or prevent data from being written back to the device automatically?
  2. Is it possible to operate with manual write back only?

Notes

  • The underlying storage can be several TB in size.
Matt Joiner
  • I am not sure that I understand what you want to achieve. The whole point of `mmap` is that it is asynchronous and allows you not to worry when there is the best point to write out buffers. `msync` is just there to give you some sort of super-sequence point. If you just want to buffer parts of a file perhaps use `read/write/fseek`, no? – Jens Gustedt Dec 08 '11 at 07:46
  • @JensGustedt: I want to lazily fetch the file as required, it's potentially several TB. If I read/write, I'll have to maintain a cache, dirty markers, as well as copy and allocate lots of stuff around. If I can do it with mmap, I'll avoid all of that. – Matt Joiner Dec 08 '11 at 09:16
  • Sure, `mmap` is made for exactly that purpose: you just do your modifications and the system schedules them to land on disk, eventually. Do you want some kind of rollback or versioning? – Jens Gustedt Dec 08 '11 at 09:26
  • @JensGustedt: Yes, rollback and commit are the intended goals. Just missing the commit part. – Matt Joiner Dec 08 '11 at 09:55

1 Answer


If the goal is just to access a large file and make some local modifications, map it with MAP_PRIVATE and throw the mapping away once you decide you want the original version back. Writes to a MAP_PRIVATE mapping are copy-on-write and are never carried through to the underlying file.

On a modern system the performance overhead should be negligible:

  • for your modified copy, the system only needs separate physical pages for the pages you actually changed
  • when you map the file again (and again), the unmodified pages should still be in the page cache, so no device I/O should be necessary
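As a minimal sketch of the rollback idea above (not code from the answer): the helper names `map_private` and `rollback_private` are hypothetical, and the sketch assumes the offset is a multiple of the page size and that the caller keeps the file descriptor open.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of `fd` starting at file offset `off`
 * as a private, copy-on-write mapping. `off` must be page-aligned.
 * Returns MAP_FAILED on error, like mmap() itself. */
static void *map_private(int fd, off_t off, size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, off);
}

/* Hypothetical helper: "roll back" by dropping the mapping (and with it all
 * privately modified pages) and mapping the same file range again.
 * Returns the new mapping, or MAP_FAILED on error. */
static void *rollback_private(void *addr, int fd, off_t off, size_t len)
{
    if (munmap(addr, len) != 0)
        return MAP_FAILED;
    return map_private(fd, off, len);
}
```

Stores through the pointer returned by `map_private` stay in the process's private pages. The new mapping may land at a different address, so the caller should replace its pointer with the value returned by `rollback_private`.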
Jens Gustedt
  • Yes this works, my only question now is how to get the dirty pages in my private mapping to the storage when I decide to "commit". This guy's answer nails my current thoughts: http://stackoverflow.com/a/4474496/149482 – Matt Joiner Dec 08 '11 at 09:54