
I need a large dynamic array. I don't know the maximum size it can reach, but I can set a large upper bound, like 1 gigabyte.

The dynamic array implementations I know, when they reach their max capacity, allocate a new larger buffer, copy the data to it and deallocate the old buffer. I would like to avoid that, so I'm thinking about reserving a large chunk of virtual memory and only map the virtual memory pages onto physical memory when they are needed. Besides efficiency, a nice feature of this method is that the address of the items is guaranteed to never change.

I'm thinking about a logic similar to this:

// the memory used by the dynamic array
item_t* buffer = reserve_virtual_memory( 1gigabyte );

size_t len = 0; // how many items the dynamic array contains
size_t pages = 0; // how many virtual memory pages are in use

// computes how many memory pages are needed to store `len` items
size_t needed_pages( size_t len ) {
  // round up; written this way so that len == 0 yields 0 pages
  // instead of underflowing the unsigned `sizeof(item_t)*len - 1` form
  return ( sizeof(item_t)*len + page_size - 1 ) / page_size;
}

item_t* new_item() {
  len += 1;
  if( needed_pages(len) != pages ) {
    ASSERT( needed_pages(len) == pages+1 );
    // the new page starts `pages*page_size` *bytes* into the buffer,
    // so cast to char* to avoid item_t-sized pointer arithmetic
    map_memory_page( (char*)buffer + pages*page_size );
    pages += 1;
  }
  return &buffer[len-1]; // stable address of the newly added item
}

void pop_item() {
  ASSERT( len > 0 );
  len -= 1;
  if( needed_pages(len) != pages ) {
    ASSERT( needed_pages(len) == pages-1 );
    pages -= 1;
    // release the page that is no longer needed (again a byte offset)
    release_memory_page( (char*)buffer + pages*page_size );
  }
}

I should be able to implement this logic on Linux, using mmap and madvise.

I'm wondering:

  1. Are there any drawbacks to using this design for a large dynamic array?

  2. Is this a common solution? Does it have a name? Are there any libraries that already implement it?

  3. Can it be implemented on every platform, or at least most of them? Including virtual machines such as WebAssembly?

Helloer
  • Is this C++ or C? – coderx64 May 03 '21 at 03:21
  • Well... Either? Or maybe neither? It's more of a question about Operating Systems' API, than about programming languages. But those APIs usually have a C interface compatible with C++. – Helloer May 03 '21 at 03:24
  • AFAIK, if you allocate memory (with `malloc` or `operator new`), the allocated bytes won't be mapped into physical memory until they are accessed. You can do this even with `std::vector`, but only with a custom allocator. – Daniel Langr May 03 '21 at 03:28
  • `boost::interprocess` has containers such as a shared-memory vector, array, etc., that handle all the logic of storing something in shared memory. Alternatively, you can write your own custom allocator and use `offset_ptr` or similar. All platforms support this allocation scheme of storing stuff in shm (Chrome does it as well to share data between tabs/windows). Not sure about WebAssembly though, but I've done it on a Raspberry Pi 3 and 4, Ubuntu, Mint, MacOS, Windows, etc. – Brandon May 03 '21 at 03:29
  • @DanielLangr is that the guaranteed behavior on every important platform? Including WebAssembly? A problem of that approach is that memory is not released when no longer in use, but I guess it's not a major issue and it could work as a fallback implementation. – Helloer May 03 '21 at 03:31
  • @Helloer No, it's not, the C++ standard doesn't care about such things as memory pages and their mapping into physical memory. – Daniel Langr May 03 '21 at 03:40
  • @Brandon I didn't know about `boost::interprocess`; I'm glancing at the documentation. It seems to use the same approach as `std::vector`: allocate a new buffer, copy the data, delete the old buffer. The implementation of managed memory segments offers `grow` and `shrink_to_fit` operations. – Helloer May 03 '21 at 03:44
  • If it's C++ why not use `std::deque`? Has essentially all the properties you want. – dabo42 May 03 '21 at 03:48
  • @DanielLangr: of course the C++ standard doesn't say anything about that. C++ programs can even be compiled for and ran on systems without virtual memory. But I'm wondering whether, in practice, all the "relevant" target platforms implement `malloc` with allocate-on-write. I assume the answer is yes for normal OSes, but I'm not sure about WASM and other virtual machines. – Helloer May 03 '21 at 03:50
  • @dabo42: items in an `std::deque` aren't contiguous and I need that. Otherwise it would be a perfect solution. – Helloer May 03 '21 at 03:50
  • To me it's unclear which problem you are trying to solve. On most systems virtual pages aren't mapped to physical pages until they are used. So it seems many systems already do what you ask for. – Support Ukraine May 03 '21 at 05:34
  • I expect `std::vector::reserve()` is the function you are looking for. It allocates a large underlying buffer, but does not change the "visible" size of the vector. As long as nothing is written to the allocated memory, the OS will most likely not map all the virtual memory pages to physical memory. – olm May 03 '21 at 09:24
  • The question really is "is this any better than just allocating an array of the upper-bound size". As @DanielLangr said the answer may well be "no" because at least on Linux `malloc` falls back to `mmap` for large sizes anyway and the OS won't actually map pages until you touch them. – dabo42 May 03 '21 at 16:27
  • Tested on Linux Mint, Ubuntu, Debian, Raspbian, and MacOS.. allocating with a custom allocator and malloc does NOT map the pages. So calling `std::vector::reserve` does the allocation, however, the memory isn't actually used until you write to it via `std::vector::push_back` or `std::vector::resize`. I allocated 4GB of memory with reserve, then wrote to 1MB, 100MB, 1GB of it. When I queried how much resident memory was used, it showed only 1MB, 100MB, and 1GB respectively. If I wrote to all 4GB of it, it shows 4GB being used. This holds true for the default allocator as well. – Brandon May 04 '21 at 13:57
  • Test: https://pastebin.com/D5KFep7R I used code from this thread to determine memory usage: https://stackoverflow.com/questions/63166/how-to-determine-cpu-and-memory-consumption-from-inside-a-process – Brandon May 04 '21 at 14:13

1 Answer

  1. One potential drawback is that you could more easily exhaust the address space of your process. In your example you reserve 1 GiB of virtual memory (even if the array is empty) which would be half of the 2 GiB private address space for a process on a typical 32-bit Windows install. That would only leave 1 GiB of memory for other data. This won't be an issue for most systems nowadays since they are typically 64-bit systems but it could be an issue if you still want to support older or smaller systems.
  2. I came across your question after searching the web for this idea for about 30 minutes, so I would say that this doesn't appear to be a common solution.

    Demand paging and memory overcommitment seem somewhat related to this idea though because they allow more memory to be allocated than what is actually resident in physical memory. Both of these are used commonly.
Zerf