5

I have a pointer-like struct that goes in the place of a pointer. The difference with a pointer is that it has extra information that the (also special) allocator can use to deallocate the memory.

This pointer-like structure works well for all basic uses. I can allocate and deallocate memory, dereferrence, increment,->, etc.

Now I want to use this pointers to be managed by a STL-like container. Early on, I realized that STL vector basically cannot handle non-raw pointers. T* is too hard coded, and the standard basically rules out anything that is not a pointer.

Inspired by Boost.Interprocess' offset_ptr<T> I decided to use Boost.Container vector, which is very customizable and in principle can manage anything, the allocator passed to the boost::container::vector can handle anything that is pointer-like.

Now the class boost::container::vector<T, myallocator_with_special_pointer<T>> can do anything... except resize()!!

Looking at the code in boost/container/vector.hpp it seems that the process of resizing (which is basically and allocation, followed by a copy (or move) and deallocation) involves raw pointers.

The offending line is:

  [line 2729:] T * const new_buf = container_detail::to_raw_pointer
     (allocator_traits_type::allocate(this->m_holder.alloc(), new_cap, this->m_holder.m_start));

Which is later followed by

  [line 3022:] this->m_holder.start(new_start);  // new_start is the same as new_buf above. 
  // member ::start(pointer&) will need to convert a raw pointer to the pointer typedef.

Both lines absolutely kill the possibility of using anything that is not a raw_pointer. Even if I have a conversion operator to a raw pointer, other information about the special pointer will be lost.

It seems pretty silly that this small detail forbids the use of non-raw pointers. Given all the effort for the container to be general (e.g. defining the pointer typedef), why this portion of the code uses T* just for resizing?

In other words, why Boost Container doesn't use this line instead

  [alternative] pointer const new_buf = 
     allocator_traits_type::allocate(this->m_holder.alloc(), new_cap, this->m_holder.m_start);

Is there a workaround or an alternative way to use Boost Container vector to handle non-raw pointers?

Boost.Container says in its manual page http://www.boost.org/doc/libs/1_64_0/doc/html/container/history_and_reasons.html#container.history_and_reasons.Why_boost_container

Boost.Container is a product of a long development effort that started in 2004 with the experimental Shmem library, which pioneered the use of standard containers in shared memory. Shmem included modified SGI STL container code tweaked to support non-raw allocator::pointer types and stateful allocators. Once reviewed, Shmem was accepted as Boost.Interprocess and this library continued to refine and improve those containers.

The current implementation (in the context of resize) goes against this design goal.


I asked a less specific question here, about other traits of the allocators: Is it still possible to customize STL vector's "reference" type?


For reference the allocator that specifies the special pointer (which is propagated to the container) is something like this,

template<class T>
struct allocator{
    using value_type = T;
    using pointer = array_ptr<T>; // simulates T*
    using const_pointer = array_ptr<T const>; // simulates T const*
    using void_pointer = array_ptr<void>; // simulates void*
    using const_void_pointer = array_ptr<void const>; // simulates void const*
    some_managed_shared_memory& msm_;
    allocator(some_managed_shared_memory& msm) : msm_(msm){}
    array_ptr<T> allocate(mpi3::size_t n){
        auto ret = msm_.allocate(n*sizeof(T));
        return static_cast<array_ptr<T>>(ret);
    }
    void deallocate(array_ptr<T> ptr, mpi3::size_t = 0){
        msm_.deallocate(ptr);
    }
};

Full working code http://coliru.stacked-crooked.com/a/f43b6096f9464cbf

#include<iostream>
#include <boost/container/vector.hpp>

template<typename T>
struct array_ptr;

template<>
struct array_ptr<void> {
    using T = void;
    T* p;
    int i; //some additional information

//    T& operator*() const { return *p; }
    T* operator->() const { return p; }

//    operator T*() const { return p; }
    template<class TT>
    operator array_ptr<TT>() const{return array_ptr<TT>((TT*)p, i);}
    operator bool() const{return p;}
    array_ptr(){}
    array_ptr(std::nullptr_t) : p(nullptr){}
    array_ptr(T* ptr, int _i) : p(ptr), i(_i){}
    template<class Other>
    array_ptr(array_ptr<Other> other) : p(other.p), i(other.i){}
};

template<>
struct array_ptr<void const> {
    using T = void const;
    T* p;
    int i; //some additional information

//    T& operator*() const { return *p; }
    T* operator->() const { return p; }

    operator T*() const { return p; }
    array_ptr(){}
    array_ptr(std::nullptr_t) : p(nullptr){}
    array_ptr(T* ptr, int _i) : p(ptr), i(_i){}
    template<class Other>
    array_ptr(array_ptr<Other> other) : p(other.p), i(other.i){}
};

template<typename T>
struct array_ptr {
    T* p;
    int i; //some additional information

    T& operator*() const { return *p; }
    T* operator->() const { return p; }
    T& operator[](std::size_t n) const{
        assert(i == 99);
        return *(p + n);
    }
    bool operator==(array_ptr const& other) const{return p == other.p and i == other.i;}
    bool operator!=(array_ptr const& other) const{return not((*this)==other);}

//    operator T*() const { return p; }
    array_ptr& operator++(){++p; return *this;}
    array_ptr& operator+=(std::ptrdiff_t n){p+=n; return *this;}
    array_ptr& operator-=(std::ptrdiff_t n){p-=n; return *this;}
    array_ptr operator+(std::size_t n) const{array_ptr ret(*this); ret+=n; return ret;}
    std::ptrdiff_t operator-(array_ptr const& other) const{return p - other.p;}
    array_ptr(){}
    array_ptr(std::nullptr_t) : p(nullptr), i(0){}

    operator bool() const{return p;}

    array_ptr(T* ptr, int _i) : p(ptr), i(_i){}
    array_ptr(T* ptr) : p(ptr), i(0){}
    array_ptr(int) : p(nullptr), i(0){}
    array_ptr(array_ptr<void> const& other) : p(static_cast<T*>(other.p)), i(other.i){}
};

struct some_managed_shared_memory {
    array_ptr<void> allocate(size_t n) { return array_ptr<void>(::malloc(n), 99); }
    void  deallocate(array_ptr<void> ptr) { if (ptr) ::free(ptr.p); }
};

template<typename T>
struct allocator{
    using value_type = T;
    using pointer = array_ptr<T>; // simulates T*
    using const_pointer = array_ptr<T const>; // simulates T const*
    using void_pointer = array_ptr<void>; // simulates void*
    using const_void_pointer = array_ptr<void const>; // simulates void const*

    some_managed_shared_memory& msm_;
    allocator(some_managed_shared_memory& msm) : msm_(msm){}
    array_ptr<T> allocate(size_t n){
        auto ret = msm_.allocate(n*sizeof(T));
        return static_cast<array_ptr<T>>(ret);
    }
    void deallocate(array_ptr<T> ptr, std::size_t = 0){
        msm_.deallocate(ptr);
    }
};

int main() {
    some_managed_shared_memory realm;
    boost::container::vector<int, allocator<int> > v(10, realm);
    assert( v[4] == 0 );
    v[4] = 1;
    assert( v[4] == 1 );
    for(std::size_t i = 0; i != v.size(); ++i) std::cout << v[i] << std::endl;
    for(auto it = v.begin(); it != v.end(); ++it) std::cout << *it << std::endl;

    // none of these compile:
    v.push_back(8);
    assert(v.size() == 11);
    v.resize(100);
    std::cout << v[89] << std::endl; // will fail an assert because the allocator information is lost
    //v.assign({1,2,3,4,5});
}
alfC
  • 14,261
  • 4
  • 67
  • 118
  • 2
    I think this is a QoI bug that should be addressed to the maintainers of Boost Container. I think I've seen them fix glitches like this before. – sehe Jul 06 '17 at 21:42
  • @sehe That reassures me that I was not completely off. I wonder how `boost::interprocess::vector` (which I think also uses Boost Container) works around this problem. (the associated code is very difficult to read,) Perhaps it is easier to workaround this for `offset_ptr`'s. Also it doesn't look like a "simple" bug because `to_raw_pointer` is called explicitly. It maybe an incomplete half baked design. – alfC Jul 07 '17 at 00:10
  • @sehe, interestingly the "offending" line is in a function called `priv_forward_range_insert_no_capacity`, for which there are three overloads, dispatched by a third argument with the types `version_0` (which throws unconditionally in the body function), `version_1` (which seems to simply allocate and deallocate, using pointer `T*`) and `version_2` (which seems to do a more fancy move, also using pointers `T*`). I wonder if I have to define the allocator class or its traits in such a way that these functions are *not* instantiated. See my edit. – alfC Jul 07 '17 at 00:26
  • 1
    When I try to piece together a minimal example with the allocator you added, I end up with many more problems: [Live SSCCE](http://coliru.stacked-crooked.com/a/8cf5a6ac836009ab). Can you make it a self contained sample that demonstrates the issue? – sehe Jul 07 '17 at 07:17
  • @sehe Your minimal example is excellent. Let's start from what works, if I declare the vector with an initial non-zero size, element access and dereference works. `push_back` or `assign` instatiates the code of `resize` internally, so I am not surprised they didn't work either. I added the full working code (based in your MWE), as you can see I have to define specialization for `array_ptr` which cannot be dereferenced and I need to include some implicit conversions (analogous to conversion for `int*` to `void*`). Thank you for your help. – alfC Jul 07 '17 at 12:52
  • @sehe, I improved the MWE to make it work with `.begin()` and `.end()` and `operator[]`. Once again the obstacle is `.resize()`. In this simplified version it is more obvious that in line 714 of `vector.hpp` there is a default that is propogated as `T*` instead of as `pointer`. – alfC Jul 07 '17 at 13:54
  • @sehe, I was able to make it compile with `resize` and `push_back`. The code compiles and works, but it is semantically incorrect because the extra information added by the allocator is lost in the process of resizing (in the example, allocator add the information `99`, but due to the conversion this information is ultimately lost). – alfC Jul 07 '17 at 16:01
  • I reached similar conclusions in my answer that I posted in the meantime. However, some of the "data loss" reservations already apply at the pointer arithmetic operations (see my answer). This could make it an intrinsic design issue with your allocator, as opposed to something just introduced by the usage patterns in Boost Container only. – sehe Jul 07 '17 at 16:33
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/148625/discussion-between-alfc-and-sehe). – alfC Jul 07 '17 at 16:51

1 Answers1

2

I looked into things.

The TL;DR seems to be: non-raw pointers are supported, but they need a implicit conversion from raw in some operations. Whether or not this is by design, I don't know, but it doesn't seem to contradict the design goal.

In fact this is very analogous to the history of allocator support: STL containers had support for custom allocators, but not for stateful allocators (meaning, non-default-constructible allocator types).

Allocator Versions

At first I tried some of the allocator versions:

using version = boost::container::version_0; // seems unsupported, really
using version = boost::container::version_1;
using version = boost::container::version_2; // does different operations

But it had no (decisive) effect. Maybe the documentation has clues.

Pointer Arithmetic

After that I looked into the specific errors. Looking at the cited line/error it dawned on me that the raw-pointer might have been an accident. Looking at the output of these:

std::cout << boost::container::container_detail::impl::version<allocator<int> >::value << "\n";

array_ptr<int> p;
auto rawp = boost::container::container_detail::to_raw_pointer(p);
std::cout << typeid(rawp).name() << "\n";

std::cout << typeid(p).name() << "\n";
std::cout << typeid(p + 5).name() << "\n";
std::cout << typeid(p - 5).name() << "\n";

Shows something like¹

1
int*
array_ptr<int>
int*
int*

¹ prettified with the help of c++filt -t

This lead me to define pointer arithmetic:

template <typename T, typename N>
array_ptr<T> operator+(array_ptr<T> const& p, N n) { return array_ptr<T>(p.p+n, p.i); }

template <typename T>
array_ptr<T>& operator++(array_ptr<T>& p) { return ++p.p, p; }

template <typename T>
array_ptr<T> operator++(array_ptr<T>& p, int) { auto q = p.p++; return array_ptr<T>(q, p.i); }

template <typename T, typename N>
array_ptr<T> operator-(array_ptr<T> const& p, N n) { return array_ptr<T>(p.p-n, p.i); }

template <typename T>
ptrdiff_t operator-(array_ptr<T> const& a, array_ptr<T> const& b) { return a.p - b.p; }

Now the output becomes

1
int*
array_ptr<int>
array_ptr<int>
array_ptr<int>

Many more use cases compile successfully with these definitions. Assuming that the "annotation" data inside the array_pointer is valid after increment, it should not lose any allocator information

The Real Culprit

With that out of the way, some things still don't compile. Specifically, in some spots the allocator's pointer type is constructed back from a raw-pointer. This fails because there's no suitable "default" conversion constructor. If you declare the constructors with the data value optional, everything compiles, but you could argue that this loses information as there is a path from

 array_pointer<T> p;
 auto* rawp = to_raw_pointer(p);
 array_pointer<T> clone(rawp); // oops lost the extra info in p

OBSERVATION

Note that, as you apparently realized (judging from the commented operators), adding the default constructor argument removes the need for the arithmetic operations (except pre-increment).

However, adding them makes sure that the lossy conversion path is taken less often, which could be important to your use case.

DEMO TIME

Live On Coliru

#if COMPILATION_INSTRUCTIONS
clang++ -std=c++14 -Wall -Wfatal-errors $0 -o $0x.x && $0x.x $@ && rm -f $0x.x; exit
#endif

#define DEFAULT_DATA = 0
#define DEFINE_ARITHMETIC_OPERATIONS

#include <iostream>
#include <boost/container/vector.hpp>
#include <typeinfo>

template<typename T>
struct array_ptr {
    T* p;
    int i; //some additional information

    T& operator*() const { return *p; }
    T* operator->() const { return p; }

    operator T*() const { return p; }

    array_ptr(){}
    //array_ptr(std::nullptr_t) : p(nullptr), i(0){}
    array_ptr(T* ptr, int _i DEFAULT_DATA) : p(ptr), i(_i){}

};

template<>
struct array_ptr<void> {
    using T = void;
    T* p;
    int i; //some additional information

//    T& operator*() const { return *p; }
    T* operator->() const { return p; }

    operator T*() const { return p; }
    template<class T>
    operator array_ptr<T>() const{return array_ptr<T>((T*)p, i);}
//    array_ptr& operator++(){++p; return *this;}
    array_ptr(){}
    array_ptr(std::nullptr_t) : p(nullptr){}
    array_ptr(T* ptr, int _i DEFAULT_DATA) : p(ptr), i(_i){}
    template<class Other>
    array_ptr(array_ptr<Other> other) : p(other.p), i(other.i){}
};

template<>
struct array_ptr<void const> {
    using T = void const;
    T* p;
    int i; //some additional information

//    T& operator*() const { return *p; }
    T* operator->() const { return p; }

    operator T*() const { return p; }
//    array_ptr& operator++(){++p; return *this;}
//  template<class Other> array_ptr(array_ptr<Other> const& other) : p(other.p), i(other.i){}
    array_ptr(){}
    array_ptr(std::nullptr_t) : p(nullptr){}
    array_ptr(T* ptr, int _i DEFAULT_DATA) : p(ptr), i(_i){}
    template<class Other>
    array_ptr(array_ptr<Other> other) : p(other.p), i(other.i){}
};

struct some_managed_shared_memory {
    array_ptr<void> allocate(size_t n) { return array_ptr<void>(::malloc(n), 99); }
    void  deallocate(array_ptr<void> ptr) { if (ptr) ::free(ptr.p); }
};

template<typename T>
struct allocator{
    using version = boost::container::version_1;

    using value_type = T;
    using pointer = array_ptr<T>; // simulates T*
    using const_pointer = array_ptr<T const>; // simulates T const*
    using void_pointer = array_ptr<void>; // simulates void*
    using const_void_pointer = array_ptr<void const>; // simulates void const*

    some_managed_shared_memory& msm_;
    allocator(some_managed_shared_memory& msm) : msm_(msm){}
    array_ptr<T> allocate(size_t n){
        auto ret = msm_.allocate(n*sizeof(T));
        return static_cast<array_ptr<T>>(ret);
    }
    void deallocate(array_ptr<T> ptr, std::size_t = 0){
        msm_.deallocate(ptr);
    }
};

#ifdef DEFINE_ARITHMETIC_OPERATIONS
    template <typename T, typename N>
    array_ptr<T> operator+(array_ptr<T> const& p, N n) { return array_ptr<T>(p.p+n, p.i); }

    template <typename T>
    array_ptr<T>& operator++(array_ptr<T>& p) { return ++p.p, p; }

    template <typename T>
    array_ptr<T> operator++(array_ptr<T>& p, int) { auto q = p.p++; return array_ptr<T>(q, p.i); }

    template <typename T, typename N>
    array_ptr<T> operator-(array_ptr<T> const& p, N n) { return array_ptr<T>(p.p-n, p.i); }

    template <typename T>
    ptrdiff_t operator-(array_ptr<T> const& a, array_ptr<T> const& b) { return a.p - b.p; }
#endif


int main() {
    std::cout << boost::container::container_detail::impl::version<allocator<int> >::value << "\n";

    if (1) { // some diagnostics
        array_ptr<int> p;
        auto rawp = boost::container::container_detail::to_raw_pointer(p);
        std::cout << typeid(rawp).name() << "\n";

        std::cout << typeid(p).name() << "\n";
        std::cout << typeid(p + 5).name() << "\n";
        std::cout << typeid(p - 5).name() << "\n";
    }

    some_managed_shared_memory realm;
    boost::container::vector<int, allocator<int> > v(10, realm);
    assert( v[4] == 0 );
    v[4] = 1;
    assert( v[4] == 1 );
    for(std::size_t i = 0; i != v.size(); ++i) std::cout << v[i] << std::endl;

    // these compile:
    v.push_back(12);
    v.resize(100);
    v.assign({1,2,3,4,5});
}

Prints

1
Pi
9array_ptrIiE
9array_ptrIiE
9array_ptrIiE
0
0
0
0
1
0
0
0
0
0
sehe
  • 374,641
  • 47
  • 450
  • 633
  • yes, having the conversion operator forces one to explicitly define the arithmetic operators, otherwise there is loss of information there. I think I fixed that issue, however the problem persists, all the paths in the `resize` function do a conversion to raw pointer `T*` and then back to the `pointer`. Even in your code, if you add a check upon deallocation `void deallocate(array_ptr ptr) { if (ptr){ assert(ptr.i == 99); ::free(ptr.p); } } ` you will see that information if lost (after resize). – alfC Jul 07 '17 at 16:47
  • I think we need some sort of trick used in `interprocess::offset_ptr` http://www.boost.org/doc/libs/1_64_0/doc/html/interprocess/offset_ptr.html. In the case of `offset_ptr` it seems that the `offset` information is put in the pointer itself and then reconstructed in the assignment (not currently in our code). I tried doing that but not without generating a segmentation fault. I don't know why they made the resize code so complicated and crippling. I think they tried to optimize the code to exploit some memory move but in the process they made the container less generic. – alfC Jul 07 '17 at 16:51
  • I'm aware of the loss if `ptr.i` - I pointed it out in the answer. I assumed that `offset_ptr` would leave the pointer unconverted so the segment_manager can do the conversion. I didn't grok the resize logic to know whether it was "unnecessarily" complex (interesting, I'll look it up later). My gut feeling says that it's intrinsic complexity required to get correct exception-safety semantics. Standard library containers are deceptively simple. – sehe Jul 07 '17 at 17:53
  • Related: a formal specification of the problem and the road ahead http://quuxplusone.github.io/draft/fancy-pointers.html – alfC Sep 23 '17 at 10:03
  • 1
    Although, the situation was somewhat negative regarding Boost, this answer was a gold nugget of knowledge. Years later this allowed me to write this library, in particular with this powerful feature: https://gitlab.com/correaa/boost-multi/-/tree/master#special-memory-allocators-and-fancy-pointers – alfC May 19 '22 at 07:00
  • Cheers @alfC I do appreciate hearing the feedback and seeing the interesting libraries you're creating! – sehe May 19 '22 at 10:11