You can use the managed_mapped_file
to transparently allocate from a memory mapped file.
This means that, for all practical purposes, you often don't need to subdivide your memory areas. It's all virtual memory anyway, so paging takes care of loading the right bits at the required times.
Obviously, if there's a lot of fragmentation, or if accesses keep "jumping around", then paging might become a performance bottleneck. In that case, consider subdividing into pools and allocating from those.
Edit: Just noticed Boost IPC has support for this under Segregated storage node allocators and Adaptive pool node allocators. There are also notes about the implementation of these storage pools here.
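For a flavour of what that looks like, here's a rough, untested sketch using bip::node_allocator (the segregated-storage pool) together with a managed_mapped_file; bip::adaptive_pool is a drop-in alternative. The file name and sizes are made up purely for illustration:

#include <boost/interprocess/managed_mapped_file.hpp>
#include <boost/interprocess/allocators/node_allocator.hpp>
#include <boost/interprocess/allocators/adaptive_pool.hpp>
#include <boost/interprocess/containers/list.hpp>

namespace bip = boost::interprocess;

// pool allocators bound to the mapped file's segment manager
template <typename T> using pool_alloc     = bip::node_allocator<T, bip::managed_mapped_file::segment_manager>;
template <typename T> using adaptive_alloc = bip::adaptive_pool<T,  bip::managed_mapped_file::segment_manager>; // drop-in alternative

int main()
{
    bip::managed_mapped_file seg(bip::open_or_create, "./pooled.db", 10ul << 20); // 10 MiB, just for the demo

    // node-based containers are where these pools pay off: every node comes
    // from a pool chunk instead of an individual segment-manager allocation
    using pooled_list = bip::list<int, pool_alloc<int> >;
    auto* lst = seg.find_or_construct<pooled_list>("POOLED")(pool_alloc<int>(seg.get_segment_manager()));

    for (int i = 0; i < 100; ++i)
        lst->push_back(i);
}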
Here's a simple starting point that creates a 50Gb file and stuffs some data in it:
#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <algorithm>
#include <utility>
#include <cstdint>
#include <cstdlib>
#include <boost/container/flat_map.hpp>
#include <boost/container/flat_set.hpp>
#include <boost/interprocess/managed_mapped_file.hpp>
#include <boost/container/scoped_allocator.hpp>
#include <boost/interprocess/containers/string.hpp>
#include <boost/interprocess/containers/vector.hpp>
#include <boost/interprocess/sync/named_mutex.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>
namespace bip = boost::interprocess;
using mutex_type = bip::named_mutex;
struct X
{
    char     buf[100];
    double   rate;
    uint32_t samples[1024];
};
// allocator and containers that allocate from the mapped file's segment manager
template <typename T> using shared_alloc  = bip::allocator<T, bip::managed_mapped_file::segment_manager>;
template <typename T> using shared_vector = boost::container::vector<T, shared_alloc<T> >;

template <typename K, typename V, typename P = std::pair<K,V>, typename Cmp = std::less<K> >
using shared_map = boost::container::flat_map<K, V, Cmp, shared_alloc<P> >;

using shared_string = bip::basic_string<char, std::char_traits<char>, shared_alloc<char> >;
using dataset_t     = shared_map<shared_string, shared_vector<X> >;
// remove the named mutex at start-up and shutdown so a stale lock
// left by a crashed run doesn't block us
struct mutex_remove
{
    mutex_remove() { mutex_type::remove("7FD6D7E8-320B-11DC-82CF-39598D556B0E"); }
    ~mutex_remove(){ mutex_type::remove("7FD6D7E8-320B-11DC-82CF-39598D556B0E"); }
} remover;

static mutex_type mutex(bip::open_or_create, "7FD6D7E8-320B-11DC-82CF-39598D556B0E");
static dataset_t& shared_instance()
{
    bip::scoped_lock<mutex_type> lock(mutex);
    static bip::managed_mapped_file seg(bip::open_or_create, "./demo.db", 50ul << 30); // "50Gb ought to be enough for anyone"

    // created on the first run, found again on every subsequent run/process
    static dataset_t* _instance = seg.find_or_construct<dataset_t>
        ("DATA")
        (
            std::less<shared_string>(),
            dataset_t::allocator_type(seg.get_segment_manager())
        );

    static auto capacity = seg.get_free_memory();
    std::cerr << "Free space: " << (capacity >> 30) << "g\n";

    return *_instance;
}
int main()
{
    auto& db = shared_instance();

    bip::scoped_lock<mutex_type> lock(mutex);
    auto alloc = db.get_allocator().get_segment_manager();

    std::cout << db.size() << '\n';

    for (int i = 0; i < 1000; ++i)
    {
        std::string key_ = "item" + std::to_string(i);

        shared_string key(alloc);
        key.assign(key_.begin(), key_.end());

        auto value = shared_vector<X>(alloc);
        value.resize(size_t(rand() % (1ul << 9))); // up to 511 X's per entry

        db.insert(std::make_pair(key, value));
    }
}
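Reading the data back in a later run (or another process) is just the same shared_instance() call, because find_or_construct then finds the existing "DATA" object instead of creating it. As an untested sketch, assuming the same headers and definitions as above, swap main() for something like:

int main()
{
    auto& db = shared_instance(); // re-opens ./demo.db and finds the existing "DATA" map

    bip::scoped_lock<mutex_type> lock(mutex);
    for (auto const& entry : db)
        std::cout << entry.first.c_str() << ": " << entry.second.size() << " X's\n";
}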
Note that it writes a sparse file of 50G. The actual size committed depends on the random sizes used, so each run differs a bit. My run resulted in roughly 1.1G:
$ du -shc --apparent-size demo.db
50G demo.db
$ du -shc demo.db
1,1G demo.db
Hope this helps