I have a C++ application that allocates a huge memory buffer: 1 GB minimum, but often 2 or 4 GB, and sometimes even more.

I allocate it using operator new, which is practically the same as malloc (this is why I added the C tag).

Lately I have been thinking about a change: instead of allocating, I could create a huge file and mmap it. Varnish is supposed to do something like that.

This would help when the computer runs out of memory: the program would then work from the disk. Otherwise it would behave as if I had allocated with operator new or malloc.

However, I am not sure which mmap options I should choose. The mapping does not need to be shared, and it does not need to be synchronized to the disk unless that is necessary (e.g. when there is no memory).
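To make the flag choice concrete, here is a minimal sketch, assuming a POSIX system. `map_file` is a hypothetical helper name, not an existing API. With `MAP_SHARED`, stores reach the file (the kernel writes dirty pages back on its own schedule). With `MAP_PRIVATE`, stores land in private copy-on-write pages and, per the mmap man page, are not carried through to the underlying file.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: create or open `path`, resize it to `size` bytes,
// and map it read/write. Returns nullptr on any failure.
//   shared == true  -> MAP_SHARED:  stores are written back to the file.
//   shared == false -> MAP_PRIVATE: stores stay in private copy-on-write
//                      pages and never reach the file.
void* map_file(const char* path, std::size_t size, bool shared) {
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1) return nullptr;

    // The file must be at least as large as the mapping.
    if (ftruncate(fd, static_cast<off_t>(size)) == -1) {
        close(fd);
        return nullptr;
    }

    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   shared ? MAP_SHARED : MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    return p == MAP_FAILED ? nullptr : p;
}
```

A call such as `map_file("/path/to/buffer.bin", size, true)` would stand in for the `operator new` allocation, with `munmap(p, size)` as the matching release. The path here is only a placeholder.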

I know about madvise; the access pattern will be random.
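For reference, the random-access hint could look like the sketch below. `advise_random` is a hypothetical wrapper name; `madvise` and `MADV_RANDOM` exist on both Linux and macOS. The call is only advice to the kernel, so how much it changes read-ahead is up to the system.

```cpp
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper: tell the kernel that [addr, addr + len) will be
// accessed in random order, so read-ahead is unlikely to help.
// `addr` must be page-aligned, e.g. a pointer returned by mmap.
// Returns true if the kernel accepted the hint.
bool advise_random(void* addr, std::size_t len) {
    return madvise(addr, len, MADV_RANDOM) == 0;
}
```

It would be called once, right after the mapping is created, on the whole mapped range.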

I want the code to be portable between Linux and macOS, but a Linux-only solution is OK too.

As a bonus, I want to be able to synchronize the mapped memory to the file, but only when I decide.
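The explicit synchronization step would be `msync`. Below is a minimal sketch with a hypothetical wrapper name, `sync_mapping`. Two caveats apply: it only writes data to the file for a `MAP_SHARED` mapping (changes in a `MAP_PRIVATE` mapping are never written to the file), and with `MAP_SHARED` the kernel may also write dirty pages back on its own at other times, so "only when I decide" is not something this call can guarantee.

```cpp
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper: flush the dirty pages of a MAP_SHARED file mapping
// to its file and block until the write-back has completed (MS_SYNC).
// `addr` must be page-aligned. Returns true on success.
bool sync_mapping(void* addr, std::size_t len) {
    return msync(addr, len, MS_SYNC) == 0;
}
```

Using `MS_ASYNC` instead of `MS_SYNC` would schedule the write-back without waiting for it.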

tadman
Nick
  • 1
    when you malloc you do not really allocate the memory – 0___________ Aug 29 '23 at 20:45
  • Strange that you say that. What does malloc do then? It adjusts brk and gives you an address, or makes an anonymous mmap and gives you an address. Sure, I agree. – Nick Aug 29 '23 at 20:50
  • 1
    It will allocate when you access that memory – 0___________ Aug 29 '23 at 20:52
  • Sure. Most of the time, I use all the memory I allocate – Nick Aug 29 '23 at 20:53
  • 3
    `mmap()` still has to use memory. The only difference is that the memory is backed by the swap partition instead of a named file. – Barmar Aug 29 '23 at 21:27
  • 3
    Remember, the memory you care about in user applications is *virtual* memory, not RAM. The VM implementation takes care of moving data between disk and RAM as needed. `mmap()` doesn't make any new memory available. – Barmar Aug 29 '23 at 21:28
  • @Barmar This is exactly what I want. But I am not sure which options I should use for the mmap call so that I get the same performance as with malloc. – Nick Aug 29 '23 at 21:29
  • The only way `mmap()` could help is if you've filled up your swap area. That should never happen on a properly configured system. – Barmar Aug 29 '23 at 21:29
  • 1
    You don't need to use mmap at all. You're barking up the wrong tree, this isn't something you need to worry about. – Barmar Aug 29 '23 at 21:30
  • @Barmar yes. This is also what I want. But also, when the program crashes, I will have a copy of my memory as well, and hopefully I will be able to recover some of the data; not guaranteed, but possibly. There are other considerations too, but they are not that important. Will MAP_PRIVATE be enough? – Nick Aug 29 '23 at 21:33
  • 2
    If the program crashes, the memory will be saved in a `core` file. – Barmar Aug 29 '23 at 21:34
  • Except it won't, since these are usually disabled :) Using mmap we may get a core for free. (Except the map is still not synchronized, but let's say that's OK.) – Nick Aug 29 '23 at 21:35
  • If you care about core dumps, turn that on, it's trivial to enable. Having a memory mapped allocator is *extremely* weird, it's not something people normally do, and I'm not sure it helps one bit here. Just allocate memory normally. What failure conditions are you concerned about? On modern hardware I'm not sure 4GB is a "huge" allocation. Chrome regularly uses 4x that for no particular reason. – tadman Aug 29 '23 at 21:53
  • Allocation is user-configurable. For some workloads I do 2 x 15 GB, in order to use a server with 32 GB RAM. Theoretically, one can do more. – Nick Aug 29 '23 at 21:57
  • "same performance as malloc" + "data gets synchronized to disk" + "but only under conditions" - I don't think there is a combination of `mmap` flags that would do that. If you want a file mapping, there's `MAP_SHARED`, that's about it. – teapot418 Aug 29 '23 at 22:07
  • @teapot418 what about MAP_PRIVATE with a file? The map does not need to be shared between processes. – Nick Aug 29 '23 at 22:22
  • Wouldn't malloc() use mmap()? See [this](https://stackoverflow.com/questions/33128587/why-does-malloc-rely-on-mmap-starting-from-a-certain-threshold) question. – dimich Aug 29 '23 at 23:22
  • [MAP_PRIVATE - updates are not carried through to the underlying file.](https://man7.org/linux/man-pages/man2/mmap.2.html) – teapot418 Aug 30 '23 at 00:06
  • @teapot418 so basically, if I use MAP_SHARED, performance will not be as good as malloc. And if I use MAP_PRIVATE, the file will not carry the changes, unless I really run out of memory and the system uses the file as swap. So it really does not have any benefit, because it will not do what I want. Thanks. – Nick Aug 30 '23 at 04:16
  • 1
    On Linux you could do a shared mapping with a file from a `tmpfs` mount. Should perform like ram+swap and you get to extract the file afterwards if your program falls over. But it is a fairly weird contraption. – teapot418 Aug 30 '23 at 08:25
  • @teapot418 this sounds very interesting, but it sure is weird. I'll post an answer to my own question tomorrow, or close the question, as soon as I finish my research. Thanks – Nick Aug 30 '23 at 08:28
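
The `tmpfs` idea from teapot418's comment could be sketched roughly as follows. This is Linux-only and rests on assumptions: `is_on_tmpfs` and `map_tmpfs_buffer` are hypothetical helper names, and the directory passed in (often `/dev/shm`, though that depends on the system) is assumed to be a tmpfs mount. The sketch checks the filesystem type with `statfs` before creating a `MAP_SHARED` file mapping there.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <linux/magic.h>  // TMPFS_MAGIC (Linux only)
#include <string>
#include <sys/mman.h>
#include <sys/vfs.h>      // statfs (Linux only)
#include <unistd.h>

// Hypothetical helper: true if `path` exists and lives on a tmpfs mount.
bool is_on_tmpfs(const char* path) {
    struct statfs st;
    if (statfs(path, &st) != 0) return false;
    return st.f_type == TMPFS_MAGIC;
}

// Hypothetical helper: create `dir/name` with `size` bytes and map it
// MAP_SHARED. Returns nullptr if `dir` is not on tmpfs or a call fails.
void* map_tmpfs_buffer(const char* dir, const char* name, std::size_t size) {
    if (!is_on_tmpfs(dir)) return nullptr;

    const std::string path = std::string(dir) + "/" + name;
    int fd = open(path.c_str(), O_RDWR | O_CREAT, 0600);
    if (fd == -1) return nullptr;

    if (ftruncate(fd, static_cast<off_t>(size)) == -1) {
        close(fd);
        return nullptr;
    }

    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping outlives the descriptor
    return p == MAP_FAILED ? nullptr : p;
}
```

The file stays in the tmpfs directory after the process exits or crashes, which is what allows the contents to be extracted afterwards. It also keeps occupying memory or swap until it is deleted.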

0 Answers