I need a large dynamic array. I don't know the maximum size it can reach, but I can set a large upper bound, like 1 gigabyte.
The dynamic array implementations I know of, when they reach their maximum capacity, allocate a new, larger buffer, copy the data into it, and deallocate the old buffer. I would like to avoid that, so I'm thinking about reserving a large chunk of virtual memory and only mapping the virtual memory pages onto physical memory as they are needed. Besides efficiency, a nice feature of this method is that the addresses of the items are guaranteed to never change.
I'm thinking about logic similar to this:
// the memory used by the dynamic array: reserve 1 GiB of address space up front
item_t* buffer = reserve_virtual_memory( (size_t)1 << 30 );
size_t len = 0;   // how many items the dynamic array contains
size_t pages = 0; // how many virtual memory pages are in use

// computes how many memory pages are needed to store `len` items
size_t needed_pages( size_t len ) {
    // round up to whole pages; returns 0 when the array is empty
    return ( sizeof(item_t)*len + page_size - 1 ) / page_size;
}

item_t* new_item() {
    len += 1;
    if( needed_pages(len) != pages ) {
        ASSERT( needed_pages(len) == pages+1 );
        // `pages` is the index of the first unmapped page; map it,
        // offsetting in bytes rather than in items
        map_memory_page( (char*)buffer + pages*page_size );
        pages += 1;
    }
    return &buffer[len-1];
}

void pop_item() {
    len -= 1;
    if( needed_pages(len) != pages ) {
        ASSERT( needed_pages(len) == pages-1 );
        pages -= 1;
        // release the page that just became unused (index `pages` after the decrement)
        release_memory_page( (char*)buffer + pages*page_size );
    }
}
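To illustrate the property I care about, the intended usage would look roughly like this (just a sketch, assuming item_t is some plain value type and using the names from the pseudocode above):

item_t* first = new_item();        // maps the first page on demand
for( size_t i = 0; i < 1000000; i++ )
    *new_item() = (item_t){0};     // grows page by page, never reallocates
// `first` still points to the same address: the buffer never moves
pop_item();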
I should be able to implement this logic on Linux, using mmap and madvise.
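Roughly, I imagine the three helpers mapping onto those calls like this (an untested sketch; relying on MAP_NORESERVE plus demand paging for the "map" step, and MADV_DONTNEED for the release, are assumptions on my part):

#include <sys/mman.h>
#include <stddef.h>

static size_t page_size = 4096; // in practice: sysconf(_SC_PAGESIZE)

void* reserve_virtual_memory( size_t max_size ) {
    // reserve address space only; MAP_NORESERVE avoids committing swap,
    // and physical pages are only allocated when first touched
    void* p = mmap( NULL, max_size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0 );
    return p == MAP_FAILED ? NULL : p;
}

void map_memory_page( void* addr ) {
    // on Linux nothing is needed here: the first write to the page
    // faults it in and the kernel attaches a physical frame on demand
    (void)addr;
}

void release_memory_page( void* addr ) {
    // tell the kernel the page's contents can be discarded, so the
    // physical frame can be reclaimed; the address range stays reserved
    madvise( addr, page_size, MADV_DONTNEED );
}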
I'm wondering:
Are there any drawbacks to using this design for a large dynamic array?
Is this a common solution? Does it have a name? Are there any libraries that already implement it?
Can it be implemented on every platform, or at least most of them, including virtual machines such as WebAssembly?