
I am writing a C++17 application and I need to manage an STL or boost::collections equivalent data structure in shared memory.

I'm not sure of the simplest syntax (which avoids passing allocators all over the place) to create and update the shared data structure.

I have been searching for some time, but other than a trivial String->String map, examples focused on custom data structures or POD structs are hard to come by. (I suspect that the allocator associated with POD structs would be fairly easy, as these can be allocated from contiguous memory, and therefore could use the simple char allocator - equivalent to Shared::Alloc<char> below).

From what I understand, the key to managing collections of data structures in shared memory is the correct choice of a stateful allocator and the ability to share that allocator with the container's nested children.

For example, say I have a map<Shared::String, vector<Shared::String>> in shared memory; somehow the magic of the scoped_allocator_adaptor would make this work.
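
To make the idea concrete, here is a minimal sketch that uses only the standard library (no shared memory involved). `TagAlloc` is a hypothetical allocator invented for this illustration: it only carries an integer tag, standing in for the segment manager pointer a shared-memory allocator would hold. The sketch assumes that boost::container::scoped_allocator_adaptor behaves analogously to the std::scoped_allocator_adaptor used here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <scoped_allocator>
#include <string>
#include <vector>

// Hypothetical stateful allocator (illustration only): the tag stands in for
// the segment-manager pointer that a shared-memory allocator would carry.
template <typename T>
struct TagAlloc {
    using value_type = T;
    int tag = 0;

    TagAlloc() = default;
    explicit TagAlloc(int t) : tag(t) {}
    template <typename U>
    TagAlloc(TagAlloc<U> const& other) : tag(other.tag) {}

    T* allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>{}.deallocate(p, n); }
};

template <typename T, typename U>
bool operator==(TagAlloc<T> const& a, TagAlloc<U> const& b) { return a.tag == b.tag; }
template <typename T, typename U>
bool operator!=(TagAlloc<T> const& a, TagAlloc<U> const& b) { return a.tag != b.tag; }

using String = std::basic_string<char, std::char_traits<char>, TagAlloc<char>>;

// With the adaptor, the vector hands its own allocator to each new element.
inline int scoped_tag() {
    using Scoped = std::scoped_allocator_adaptor<TagAlloc<String>>;
    TagAlloc<String> base(42);
    Scoped alloc(base);
    std::vector<String, Scoped> names(alloc);
    names.emplace_back("no allocator passed here");
    return names.front().get_allocator().tag;
}

// Without the adaptor, the nested string default-constructs its allocator.
inline int plain_tag() {
    TagAlloc<String> base(42);
    std::vector<String, TagAlloc<String>> names(base);
    names.emplace_back("no allocator passed here");
    return names.front().get_allocator().tag;
}

inline void demo() {
    assert(scoped_tag() == 42);  // allocator propagated to the nested string
    assert(plain_tag() == 0);    // nested string fell back to TagAlloc<char>{}
}
```

Calling demo() from main() exercises both cases. The only difference between the two functions is the allocator type of the outer vector, which is the "magic" referred to above.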

Beyond the simple example of a map<Shared::String, vector<Shared::String>> above, I would really like to manage a map<Shared::String, vector<UserStruct>> where UserStruct can be either a POD struct or a struct containing a String or a list of strings.

I started out with the following as a useful starting point from another answer I found in SO:

namespace bip = boost::interprocess;

namespace Shared {
    using Segment = bip::managed_shared_memory;

    template <typename T>
        using Alloc   = bip::allocator<T, Segment::segment_manager>;
    using Scoped  = boost::container::scoped_allocator_adaptor<Alloc<char>>;

    using String  = boost::container::basic_string<char, std::char_traits<char>, Scoped>;
    using KeyType = String;
}

It looks like the Shared::Scoped allocator adaptor is key to propagating the allocator from a top-level container to its children. I'm not sure if this is different when applied to the Boost containers vs the standard containers.

An example, and explanation on how to construct these objects in a way that would allow me to propagate the scoped_allocator_adaptor to my POD or custom struct is what I am looking for.

sehe
johnco3

1 Answer


Shooting for the stars, are we :) Painless allocator propagation is the holy grail.

> It looks like the Shared::Scoped allocator adaptor is key to propagating the allocator from a top-level container to its children.

Indeed

> I'm not sure if this is different when applied to the Boost containers vs the standard containers.

In my understanding, modern C++ standard libraries should support the same, but in practice my experience has been that it often worked only with the Boost Container containers. (YMMV, and standard library implementations may/will catch up.)

What To Do

I think you will want to understand the uses_allocator protocol: https://en.cppreference.com/w/cpp/memory/uses_allocator
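
As a rough illustration of the detection half of that protocol, here is a minimal standard-library sketch. The types `Plain` and `Aware` are hypothetical names made up for this example. It only shows how the `std::uses_allocator` trait reacts to a nested `allocator_type`; it does not construct anything.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>

// No nested allocator_type: the trait reports "not allocator-aware".
struct Plain {
    int value;
};

// A nested allocator_type is what the primary uses_allocator template looks for.
struct Aware {
    using allocator_type = std::allocator<char>;
    std::string data;
};

inline constexpr bool plain_detected = std::uses_allocator_v<Plain, std::allocator<char>>;
inline constexpr bool aware_detected = std::uses_allocator_v<Aware, std::allocator<char>>;

// The candidate allocator must also be convertible to allocator_type;
// std::allocator<int> converts to std::allocator<char>, int does not.
inline constexpr bool rebound_ok = std::uses_allocator_v<Aware, std::allocator<int>>;
inline constexpr bool int_ok     = std::uses_allocator_v<Aware, int>;

static_assert(!plain_detected);
static_assert(aware_detected);
static_assert(rebound_ok);
static_assert(!int_ok);

inline void demo() {
    assert(aware_detected && !plain_detected);
}
```

When the trait is true, an allocator-aware container adaptor will then try to pass its allocator to the element's constructor, either as a trailing argument or after a leading std::allocator_arg tag.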


This really answers all of your questions, I suppose. I'll try to come up with a quick sample if I can.

Demo

So far I have got the following two approaches working:

struct MyStruct {
    String data;

    using allocator_type = Alloc<char>;

    MyStruct(MyStruct const& rhs, allocator_type = {}) : data(rhs.data) {}
    template <typename I, typename = std::enable_if_t<not std::is_same_v<MyStruct, I>, void> >
    MyStruct(I&& init, allocator_type a)
     : data(std::forward<I>(init), a)
    { }
};

This allows:

Shared::Segment mf(bip::open_or_create, "test.bin", 10<<20);

auto& db = *mf.find_or_construct<Shared::Database>("db")(mf.get_segment_manager());

db.emplace_back("one");
db.emplace_back("two");
db.emplace_back("three");

The slightly more complicated/versatile (?) approach also works:

    MyStruct(std::allocator_arg_t, allocator_type, MyStruct const& rhs) : data(rhs.data) {}

    template <
        typename I,
        typename A = Alloc<char>,
        typename = std::enable_if_t<not std::is_same_v<MyStruct, I>, void> >
    MyStruct(std::allocator_arg_t, A alloc, I&& init)
     : data(std::forward<I>(init), alloc.get_segment_manager())
    { }

It appears that for the current use-case, the inner typedef allocator_type is enough to signal that MyStruct supports allocator-construction, making the specialization of uses_allocator<MyStruct, ...> redundant.
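
The two constructor conventions can also be tried without Boost or shared memory. The following is a minimal sketch under stated assumptions: `TagAlloc`, `Trailing` and `Leading` are hypothetical names invented for this illustration, and std::scoped_allocator_adaptor stands in for the Boost adaptor. Neither struct specializes uses_allocator; both rely on the nested allocator_type only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <scoped_allocator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stateful allocator carrying a tag (illustration only).
template <typename T>
struct TagAlloc {
    using value_type = T;
    int tag = 0;

    TagAlloc() = default;
    explicit TagAlloc(int t) : tag(t) {}
    template <typename U>
    TagAlloc(TagAlloc<U> const& other) : tag(other.tag) {}

    T* allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>{}.deallocate(p, n); }
};

template <typename T, typename U>
bool operator==(TagAlloc<T> const& a, TagAlloc<U> const& b) { return a.tag == b.tag; }
template <typename T, typename U>
bool operator!=(TagAlloc<T> const& a, TagAlloc<U> const& b) { return a.tag != b.tag; }

using String = std::basic_string<char, std::char_traits<char>, TagAlloc<char>>;

// Convention 1: the allocator is the LAST constructor argument.
struct Trailing {
    using allocator_type = TagAlloc<char>;
    String data;

    Trailing(char const* init, allocator_type const& a) : data(init, a) {}
    Trailing(Trailing const& rhs, allocator_type const& a) : data(rhs.data, a) {}
    Trailing(Trailing&& rhs, allocator_type const& a) : data(std::move(rhs.data), a) {}
};

// Convention 2: std::allocator_arg_t tag first, then the allocator.
struct Leading {
    using allocator_type = TagAlloc<char>;
    String data;

    Leading(std::allocator_arg_t, allocator_type const& a, char const* init) : data(init, a) {}
    Leading(std::allocator_arg_t, allocator_type const& a, Leading const& rhs) : data(rhs.data, a) {}
    Leading(std::allocator_arg_t, allocator_type const& a, Leading&& rhs) : data(std::move(rhs.data), a) {}
};

// Emplaces one element without naming an allocator and reports the tag that
// the element's string member received.
template <typename T>
int tag_after_emplace() {
    using Scoped = std::scoped_allocator_adaptor<TagAlloc<T>>;
    TagAlloc<T> base(42);
    Scoped alloc(base);
    std::vector<T, Scoped> items(alloc);
    items.emplace_back("one");
    return items.front().data.get_allocator().tag;
}

inline void demo() {
    assert(tag_after_emplace<Trailing>() == 42);
    assert(tag_after_emplace<Leading>() == 42);
}
```

The copy and move overloads that take an allocator are there because the vector may relocate elements through the same uses-allocator path when it grows.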

Full Listing

Live On Coliru

#include <boost/interprocess/containers/vector.hpp>
#include <boost/interprocess/containers/string.hpp>
#include <boost/interprocess/managed_mapped_file.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/container/scoped_allocator.hpp>
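#include <iostream>
#include <memory>      // std::allocator_arg_t, std::uses_allocator
#include <type_traits> // std::enable_if_t, std::is_same_v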

namespace bip = boost::interprocess;

namespace Shared {
    using Segment = bip::managed_mapped_file;
    using SMgr = Segment::segment_manager;

    template <typename T> using Alloc = boost::container::scoped_allocator_adaptor<
            bip::allocator<T, SMgr>
        >;

    template <typename T> using Vec = boost::container::vector<T, Alloc<T> >;

    using String = bip::basic_string<char, std::char_traits<char>, Alloc<char> >;

    struct MyStruct {
        String data;

        using allocator_type = Alloc<char>;

#if 1 // one approach
        MyStruct(std::allocator_arg_t, allocator_type, MyStruct const& rhs) : data(rhs.data) {}

        template <
            typename I,
            typename A = Alloc<char>,
            typename = std::enable_if_t<not std::is_same_v<MyStruct, I>, void> >
        MyStruct(std::allocator_arg_t, A alloc, I&& init)
         : data(std::forward<I>(init), alloc.get_segment_manager())
        { }
#else // the simpler(?) approach
        MyStruct(MyStruct const& rhs, allocator_type = {}) : data(rhs.data) {}
        template <typename I, typename = std::enable_if_t<not std::is_same_v<MyStruct, I>, void> >
        MyStruct(I&& init, allocator_type a)
         : data(std::forward<I>(init), a)
        { }
#endif
    };

    using Database = Vec<MyStruct>;
}

namespace std {
    // this appears optional for the current use case
    template <typename T> struct uses_allocator<Shared::MyStruct, T> : std::true_type {};
}

int main() {
    Shared::Segment mf(bip::open_or_create, "test.bin", 10<<20);

    auto& db = *mf.find_or_construct<Shared::Database>("db")(mf.get_segment_manager());

    db.emplace_back("one");
    db.emplace_back("two");
    db.emplace_back("three");

    std::cout << "db has " << db.size() << " elements:";

    for (auto& el : db) {
        std::cout << " " << el.data;
    }

    std::cout << std::endl;
}

Invoking it three times:

db has 3 elements: one two three
db has 6 elements: one two three one two three
db has 9 elements: one two three one two three one two three

Update: More Complicated

In response to the comments, let's make it more complicated in a few ways:

  • The struct constructor will take various arguments initializing various members, some of which will use an allocator.
  • We want to store it in a Map, and some of the use-patterns involving map are pesky with scoped allocator support (emplacement, map[k]=v update-assignment with default-construction requirements)
  • std::initializer_list<> will not be deduced in generic forwarding wrappers :(
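
The last point can be reproduced in isolation. This is a minimal sketch in which `make` is a hypothetical forwarding wrapper, similar in spirit to emplace(); the `Bytes` alias mirrors the one used in the demo code of this answer.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <utility>
#include <vector>

// Hypothetical generic forwarding wrapper (illustration only).
template <typename T, typename... Args>
T make(Args&&... args) {
    return T(std::forward<Args>(args)...);
}

using Bytes = std::initializer_list<std::uint8_t>;

inline std::vector<std::uint8_t> make_bytes() {
    // make<std::vector<std::uint8_t>>({1, 2, 3});
    //   ^ does not compile: a braced list has no type, so Args cannot be deduced.

    // Naming the initializer_list type gives the argument a deducible type.
    return make<std::vector<std::uint8_t>>(Bytes{1, 2, 3});
}

inline void demo() {
    auto const v = make_bytes();
    assert(v.size() == 3);
    assert(v[1] == 2);
}
```

This is why the calls further down spell out `Bytes {1,2}` instead of passing a bare `{1,2}`.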

Defining the struct:

struct MyPodStruct {
    using allocator_type = ScopedAlloc<char>;

    int a = 0; // simplify default constructor using NSDMI
    int b = 0;
    Vec<uint8_t> data;

    explicit MyPodStruct(allocator_type alloc) : data(alloc) {}
    //MyPodStruct(MyPodStruct const&) = default;
    //MyPodStruct(MyPodStruct&&) = default;
    //MyPodStruct& operator=(MyPodStruct const&) = default;
    //MyPodStruct& operator=(MyPodStruct&&) = default;

    MyPodStruct(std::allocator_arg_t, allocator_type, MyPodStruct&& rhs) : MyPodStruct(std::move(rhs)) {}
    MyPodStruct(std::allocator_arg_t, allocator_type, MyPodStruct const& rhs) : MyPodStruct(rhs) {}

    template <typename I, typename A = Alloc<char>>
        MyPodStruct(std::allocator_arg_t, A alloc, int a, int b, I&& init)
         : MyPodStruct(a, b, Vec<uint8_t>(std::forward<I>(init), alloc)) { }

  private:
    explicit MyPodStruct(int a, int b, Vec<uint8_t> data) : a(a), b(b), data(std::move(data)) {}
};    

It addresses "default construction" (under the uses-allocator regime), and the various constructors that take multiple arguments. Note that SFINAE is no longer required to disambiguate the uses-allocator copy-constructor, because the number of arguments differs.

Now, using it is more involved than above. Specifically, since there are multiple constructor arguments to be forwarded, we need another bit of "construction protocol": std::piecewise_construct_t.
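
For readers who have not met piecewise construction before, here is a minimal sketch with a plain std::map and no custom allocators. `Record` is a hypothetical stand-in for the struct above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical value type with a multi-argument constructor and no default
// constructor, so map[k] = v would not even compile for it.
struct Record {
    int a;
    int b;
    std::vector<int> data;

    Record(int a, int b, std::vector<int> data) : a(a), b(b), data(std::move(data)) {}
};

inline std::map<std::string, Record> build() {
    std::map<std::string, Record> db;

    // The tag tells pair's constructor to unpack one tuple into the key and
    // the other into the mapped value, constructing both in place.
    db.emplace(std::piecewise_construct,
               std::forward_as_tuple("one"),                          // key arguments
               std::forward_as_tuple(1, 2, std::vector<int>{7, 8}));  // Record arguments
    return db;
}

inline void demo() {
    auto const db = build();
    assert(db.at("one").a == 1);
    assert(db.at("one").b == 2);
    assert(db.at("one").data.size() == 2);
}
```

With a scoped allocator in play, the same two tuples are what the adaptor extends with the allocator before constructing the key and the value.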

The inline comments talk about QoL/QoI concerns and pitfalls:

int main() {
    using Shared::MyPodStruct;
    Shared::Segment mf(bip::open_or_create, "test.bin", 10<<10); // smaller for Coliru
    auto mgr = mf.get_segment_manager();

    auto& db = *mf.find_or_construct<Shared::Database>("complex")(mgr);

    // Issues with brace-enclosed initializer list
    using Bytes = std::initializer_list<uint8_t>;

    // More magic: piecewise construction protocol :)
    static constexpr std::piecewise_construct_t pw{};
    using std::forward_as_tuple;
    db.emplace(pw, forward_as_tuple("one"), forward_as_tuple(1,2, Bytes {1,2}));
    db.emplace(pw, forward_as_tuple("two"), forward_as_tuple(2,3, Bytes {4}));
    db.emplace(pw, forward_as_tuple("three"), forward_as_tuple(3,4, Bytes {5,8}));

    std::cout << "\n=== Before updates\n" << db << std::endl;

    // Clumsy:
    db[Shared::String("one", mgr)] = MyPodStruct{std::allocator_arg, mgr, 1,20, Bytes {7,8,9}};

    // As efficient or better, and less clumsy:
    auto insert_or_update = [&db](auto&& key, auto&&... initializers) -> MyPodStruct& {
        // Be careful not to move twice: https://en.cppreference.com/w/cpp/container/map/emplace
        // > The element may be constructed even if there already is an element
        // > with the key in the container, in which case the newly constructed
        // > element will be destroyed immediately.
        if (auto insertion = db.emplace(pw, forward_as_tuple(key), std::tie(initializers...)); insertion.second) {
            return insertion.first->second;
        } else {
            return insertion.first->second = MyPodStruct(
                std::allocator_arg, 
                db.get_allocator(),
                std::forward<decltype(initializers)>(initializers)...); // forwarding ok here
        }
    };

    insert_or_update("two", 2,30, Bytes{});
    insert_or_update("nine", 9,100, Bytes{5,6});

    // partial updates:
    db.at(Shared::String("nine", mgr)).data.push_back(42);

    // For more efficient key lookups in the case of unlikely insertion, use
    // heterogeneous comparer, see https://stackoverflow.com/a/27330042/85371

    std::cout << "\n=== After updates\n" << db << std::endl;
}

Which prints (Live On Coliru):

=== Before updates
db has 3 elements: {one: 1,2, [1,2,]} {three: 3,4, [5,8,]} {two: 2,3, [4,]}

=== After updates
db has 4 elements: {nine: 9,100, [5,6,42,]} {one: 1,20, [7,8,9,]} {three: 3,4, [5,8,]} {two: 2,30, []}

Full Listing

For conservation: Live On Coliru

#include <boost/interprocess/containers/map.hpp>
#include <boost/interprocess/containers/string.hpp>
#include <boost/interprocess/containers/vector.hpp>
#include <boost/interprocess/managed_mapped_file.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/container/scoped_allocator.hpp>
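#include <iostream>
#include <cstdint>          // uint8_t
#include <initializer_list>
#include <tuple>            // std::forward_as_tuple, std::tie
#include <utility>          // std::piecewise_construct_t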

namespace bip = boost::interprocess;

namespace Shared {
    using Segment = bip::managed_mapped_file;
    using SMgr = Segment::segment_manager;

    template <typename T> using Alloc = bip::allocator<T, SMgr>;
    template <typename T> using ScopedAlloc = boost::container::scoped_allocator_adaptor<Alloc<T> >;

    using String = bip::basic_string<char, std::char_traits<char>, Alloc<char> >;

    using boost::interprocess::map;

    template <typename T> using Vec = 
        boost::container::vector<T, ScopedAlloc<T>>;

    template <typename K, typename T> using Map = 
        map<K, T, std::less<K>, ScopedAlloc<typename map<K, T>::value_type>>;

    struct MyPodStruct {
        using allocator_type = ScopedAlloc<char>;

        int a = 0; // simplify default constructor using NSDMI
        int b = 0;
        Vec<uint8_t> data;

        explicit MyPodStruct(allocator_type alloc) : data(alloc) {}
        //MyPodStruct(MyPodStruct const&) = default;
        //MyPodStruct(MyPodStruct&&) = default;
        //MyPodStruct& operator=(MyPodStruct const&) = default;
        //MyPodStruct& operator=(MyPodStruct&&) = default;

        MyPodStruct(std::allocator_arg_t, allocator_type, MyPodStruct&& rhs) : MyPodStruct(std::move(rhs)) {}
        MyPodStruct(std::allocator_arg_t, allocator_type, MyPodStruct const& rhs) : MyPodStruct(rhs) {}

        template <typename I, typename A = Alloc<char>>
            MyPodStruct(std::allocator_arg_t, A alloc, int a, int b, I&& init)
             : MyPodStruct(a, b, Vec<uint8_t>(std::forward<I>(init), alloc)) { }

      private:
        explicit MyPodStruct(int a, int b, Vec<uint8_t> data) : a(a), b(b), data(std::move(data)) {}
    };    

    using Database = Map<String, MyPodStruct>;

    static inline std::ostream& operator<<(std::ostream& os, Database const& db) {
        os << "db has " << db.size() << " elements:";

        for (auto& [k,v] : db) {
            os << " {" << k << ": " << v.a << "," << v.b << ", [";
            for (unsigned i : v.data)
                os << i << ",";
            os << "]}";
        }

        return os;
    }
}

int main() {
    using Shared::MyPodStruct;
    Shared::Segment mf(bip::open_or_create, "test.bin", 10<<10); // smaller for Coliru
    auto mgr = mf.get_segment_manager();

    auto& db = *mf.find_or_construct<Shared::Database>("complex")(mgr);

    // Issues with brace-enclosed initializer list
    using Bytes = std::initializer_list<uint8_t>;

    // More magic: piecewise construction protocol :)
    static constexpr std::piecewise_construct_t pw{};
    using std::forward_as_tuple;
    db.emplace(pw, forward_as_tuple("one"), forward_as_tuple(1,2, Bytes {1,2}));
    db.emplace(pw, forward_as_tuple("two"), forward_as_tuple(2,3, Bytes {4}));
    db.emplace(pw, forward_as_tuple("three"), forward_as_tuple(3,4, Bytes {5,8}));

    std::cout << "\n=== Before updates\n" << db << std::endl;

    // Clumsy:
    db[Shared::String("one", mgr)] = MyPodStruct{std::allocator_arg, mgr, 1,20, Bytes {7,8,9}};

    // As efficient or better, and less clumsy:
    auto insert_or_update = [&db](auto&& key, auto&&... initializers) -> MyPodStruct& {
        // Be careful not to move twice: https://en.cppreference.com/w/cpp/container/map/emplace
        // > The element may be constructed even if there already is an element
        // > with the key in the container, in which case the newly constructed
        // > element will be destroyed immediately.
        if (auto insertion = db.emplace(pw, forward_as_tuple(key), std::tie(initializers...)); insertion.second) {
            return insertion.first->second;
        } else {
            return insertion.first->second = MyPodStruct(
                std::allocator_arg, 
                db.get_allocator(),
                std::forward<decltype(initializers)>(initializers)...); // forwarding ok here
        }
    };

    insert_or_update("two", 2,30, Bytes{});
    insert_or_update("nine", 9,100, Bytes{5,6});

    // partial updates:
    db.at(Shared::String("nine", mgr)).data.push_back(42);

    // For more efficient key lookups in the case of unlikely insertion, use
    // heterogeneous comparer, see https://stackoverflow.com/a/27330042/85371

    std::cout << "\n=== After updates\n" << db << std::endl;
}
sehe
  • Yes, an example would be amazing, thanks. The uses_allocator protocol is difficult to understand for a non-metaprogrammer like myself. So far, your examples seem to be the only good ones out there; you probably recognized the Shared namespace :). – johnco3 Aug 10 '19 at 22:46
  • I did recognize it, indeed; It never ceases to amaze me that so many brilliant programmers BUILD the library to have these awesome features, yet no-one seems to know/bother to show how to use them. – sehe Aug 10 '19 at 22:56
  • Added a demo [Live On Coliru](http://coliru.stacked-crooked.com/a/1af1ed2b26ccbc38). One warning: making it appear seamless is nice, but there is no silver bullet. The segment allocator has [allocation overhead](https://stackoverflow.com/a/21287572/85371), and you will do well to 1. plan for resizing segments, 2. optimize your allocations (use e.g. `small_vector` or `static_vector` instead of `basic_string<>` if you can, combine allocations by e.g. pooling). It's easy to end up writing a worse Java when you don't mind your step :) – sehe Aug 10 '19 at 23:15
  • fantastic example, don't know how you make the time, anyway in an attempt to come up to speed I modified the example with some comments & questions and an attempt to create a POD value in a map which fails for some reason. Perhaps you might get a minute to look into it. http://coliru.stacked-crooked.com/a/88685cab17f6b1fb – johnco3 Aug 11 '19 at 00:22
  • I think this fixes the points (It's ironic that it no longer fits the question as the struct doesn't use any allocator :)): [Live On Coliru](http://coliru.stacked-crooked.com/a/68e11db0f32d8f24). Note the comment [about efficiency](https://stackoverflow.com/a/27330042/85371), I like the observations you've been making in the comments. They're really instrumental to understanding this (yes, emplacement is required for uses-allocator construction, for example!). Note that some of the popular map access patterns like `m[k] = v;` are just not that well suited anymore. See the code suggestions. – sehe Aug 11 '19 at 01:12
  • this is great, it's 99% of what I need, the other 1% is that my POD struct contains a std::vector or probably better a boost::container::vector >; I kind of suspected that a POD struct didn't have an allocator (O stands for old after all :)) Again, I modified your updated source and added comments - http://coliru.stacked-crooked.com/a/64a2b5479a0ea0c2 (BTW how do you make the shortened name Live On Coliru link above?) In either case, why not edit your answer with some of the updates from the POD and I'll accept it, great work my friend. – johnco3 Aug 11 '19 at 02:07
  • One thing I do not understand in the above answer was the line 'typename = std::enable_if_t<not std::is_same_v<MyStruct, I>, void> >'. My meta programming skills are definitely sub-par, but I think this seems to be a specialization of the 'MyStruct(std::allocator_arg_t, allocator_type, MyStruct const& rhs) : data(rhs.data) {}' which kicks in when the 'init' trailing parameter is not a MyStruct type and then perfect forwarding optimization kicks in as the last typename = void (as opposed to undefined) - is this correct? If so - how would one construct a MyStruct using this allocator? – johnco3 Aug 11 '19 at 04:05
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/197787/discussion-between-sehe-and-johnco3). – sehe Aug 11 '19 at 11:09
  • Expanded the answer with the more complex use cases. [Demo On Coliru](http://coliru.stacked-crooked.com/a/8390fd2dbbbc60bc) – sehe Aug 11 '19 at 12:25
  • many thanks for resolving the question perfectly! That really helped me and I have learned a lot. – johnco3 Aug 11 '19 at 17:24
  • @johnco3 Cheers. You have learned a lot, and I know it took me some years to understand things at this level of "applicability". That said, at this point I think the resulting code is pretty contrived, and it might just become more sensible to write a set of accessor functions that do the necessary dance, perhaps without the magic of piecewise and/or uses-allocator construction. It's good to know how they work, but there's considerable complexity involved and there's always a trade-off in terms of maintenance. Good luck! – sehe Aug 11 '19 at 17:55
  • sorry to be a pain, not sure why, but there seems to be a problem running on windows with boost 1.70. The only insertion of the 3 types you showed that works successfully was the 'clumsy:' variant. I tried single stepping through the code but it rapidly deteriorates into very complicated template code. Exception thrown at 0x00007FFE0222A839 in biptester.exe: Microsoft C++ exception: boost::interprocess::interprocess_exception at memory location 0x0000006A746FD9A0. Exception thrown: read access violation. *boost::forward**(...) returned 0x1234. – johnco3 Aug 12 '19 at 16:07
  • Are you using it with persistent files? Did it have enough capacity? MSVC runtime library is known to do more allocations – sehe Aug 12 '19 at 22:21
  • i passed 65536 to the shared:: segment constructor, very strange. I tried with both a managed_mapped file and managed_shared_memory - as I am aware that coliru only works with the former. – johnco3 Aug 13 '19 at 06:12
  • Any chance you could look into the 'windows read access violation', I'm way out of my depth trying to fix it, it doesn't appear to be a size related issue. I created a github project [here](https://github.com/johnco3/BIPStruct/issues/1) with details showing where the code breaks in boost's allocator. I also tried using the latest boost 1.71 but unsurprisingly, that did not work. I wonder if you could take a quick peek at it if you have time. – johnco3 Aug 21 '19 at 18:27
  • I might. But the best way would definitely be: 0. We timebox this :) I can probably afford to look an hour, maybe more if a solution seems attainable/interesting 1. you can provide a box that has MSVC (perhaps an Azure dev box?) and precise steps to repro from scratch 2. we wait till over the weekend because I'm travelling and bandwidth is limited – sehe Aug 21 '19 at 21:48
  • Thanks! I created an x64 Visual Studio solution (that requires a local Boost folder) in github today (I was away last week - thus the inactivity) - I've been adding commits as I discover/learn more about allocators/piecewise construction (mainly for my curiosity/learning) but I do have a real application that will use this as an inter-process DB. Meanwhile, I'll try to find a decent coliru equivalent that uses the MSVC compiler and has access to boost libs. – johnco3 Aug 21 '19 at 22:06
  • Rextester used to have it. I believe Godbolt Compiler Explorer added running capabilities recently. But environments like these will make it hard to actually debug what is happening. – sehe Aug 21 '19 at 22:19