9

I'm writing a container and would like to permit the user to use custom allocators, but I can't tell if I should pass allocators around by reference or by value.

Is it guaranteed (or at least, a reasonable assumption to make) that an allocator object will not contain its memory pool directly, and hence it would be OK to copy an allocator and expect the memory pools of the allocators to be cross-compatible? Or do I always need to pass allocators by reference?

(I have found that passing by reference harms performance by a factor of > 2 because the compiler starts worrying about aliasing, so it makes a whether or not I can rely on this assumption.)

user541686
  • 205,094
  • 128
  • 528
  • 886
  • Can you please add some specifics of what you're trying to do? You obviously have to be careful with stateful allocators, but certain standard operations such as move-constructing one container from another have standard idioms. – Kerrek SB Jul 28 '12 at 18:48
  • @KerrekSB: Ah, ok. For example, I'm trying to implement a `resize` or `reserve` operation operation as a "copy-this-container-and-swap" operation. This works, if the allocator doesn't have its own memory pool. (If it does, I could swap twice, but creating a copy might still be expensive and might overflow the stack if it contains its own pool.) – user541686 Jul 28 '12 at 19:45

2 Answers2

12

In C++11 section 17.6.3.5 Allocator requirements [allocator.requirements] specifies the requirements for conforming allocators. Among the requirements are:

X                    an Allocator class for type T
...
a, a1, a2            values of type X&
...
a1 == a2             bool          returns true only if storage
                                   allocated from each can be
                                   deallocated via the other.
                                   operator== shall be reflexive,
                                   symmetric, and transitive, and
                                   shall not exit via an exception.
...
X a1(a);                           Shall not exit via an exception.
                                   post: a1 == a

I.e. when you copy an allocator, the two copies are required to be able to delete each other's pointers.

Conceivably one could put internal buffers into allocators, but copies would have to keep a list of other's buffers. Or perhaps an allocator could have an invariant that deallocation is always a no-op because the pointer always comes from an internal buffer (either from your own, or from some other copy).

But whatever the scheme, copies must be "cross-compatible".

Update

Here is a C++11 conforming allocator that does the "short string optimization". To make it C++11 conforming, I had to put the "internal" buffer external to the allocator so that copies are equal:

#include <cstddef>

template <std::size_t N>
class arena
{
    static const std::size_t alignment = 16;
    alignas(alignment) char buf_[N];
    char* ptr_;

    std::size_t 
    align_up(std::size_t n) {return n + (alignment-1) & ~(alignment-1);}

public:
    arena() : ptr_(buf_) {}
    arena(const arena&) = delete;
    arena& operator=(const arena&) = delete;

    char* allocate(std::size_t n)
    {
        n = align_up(n);
        if (buf_ + N - ptr_ >= n)
        {
            char* r = ptr_;
            ptr_ += n;
            return r;
        }
        return static_cast<char*>(::operator new(n));
    }
    void deallocate(char* p, std::size_t n)
    {
        n = align_up(n);
        if (buf_ <= p && p < buf_ + N)
        {
            if (p + n == ptr_)
                ptr_ = p;
        }
        else
            ::operator delete(p);
    }
};

template <class T, std::size_t N>
class stack_allocator
{
    arena<N>& a_;
public:
    typedef T value_type;

public:
    template <class U> struct rebind {typedef stack_allocator<U, N> other;};

    explicit stack_allocator(arena<N>& a) : a_(a) {}
    template <class U>
        stack_allocator(const stack_allocator<U, N>& a)
            : a_(a.a_) {}
    stack_allocator(const stack_allocator&) = default;
    stack_allocator& operator=(const stack_allocator&) = delete;

    T* allocate(std::size_t n)
    {
        return reinterpret_cast<T*>(a_.allocate(n*sizeof(T)));
    }
    void deallocate(T* p, std::size_t n)
    {
        a_.deallocate(reinterpret_cast<char*>(p), n*sizeof(T));
    }

    template <class T1, std::size_t N1, class U, std::size_t M>
    friend
    bool
    operator==(const stack_allocator<T1, N1>& x, const stack_allocator<U, M>& y);

    template <class U, std::size_t M> friend class stack_allocator;
};

template <class T, std::size_t N, class U, std::size_t M>
bool
operator==(const stack_allocator<T, N>& x, const stack_allocator<U, M>& y)
{
    return N == M && &x.a_ == &y.a_;
}

template <class T, std::size_t N, class U, std::size_t M>
bool
operator!=(const stack_allocator<T, N>& x, const stack_allocator<U, M>& y)
{
    return !(x == y);
}

It could be used like this:

#include <vector>

template <class T, std::size_t N> using A = stack_allocator<T, N>;
template <class T, std::size_t N> using Vector = std::vector<T, stack_allocator<T, N>>;

int main()
{
    const std::size_t N = 1024;
    arena<N> a;
    Vector<int, N> v{A<int, N>(a)};
    v.reserve(100);
    for (int i = 0; i < 100; ++i)
        v.push_back(i);
    Vector<int, N> v2 = std::move(v);
    v = v2;
}

All allocations for the above problem are drawn from the local arena which is 1 Kb in size. You should be able to pass this allocator around by value or by reference.

Howard Hinnant
  • 206,506
  • 52
  • 449
  • 577
  • +1 indeed, that answers my question directly. (And your other answers on SO are also very helpful, btw.) Thanks a bunch! :) – user541686 Jul 28 '12 at 19:48
  • @Howard: Hmm, on another note, I'm not sure "cross-compatible" is sufficiently clear for me. If I copy an allocator whose deallocation routine is a no-op (it contains its own buffer), then all of its operations would be "cross-compatible". But then if the copied allocator is destroyed by going out of scope, what happens to the items that were allocated with it? – user541686 Jul 28 '12 at 20:03
  • They would be invalidated. I believe it is incumbent upon the allocator author to somehow just make this work. I've updated my answer with one example of how an allocator author might make it work. – Howard Hinnant Jul 28 '12 at 23:00
  • Typo: `(alignment-1) & ~(alignment-1)`? – Luc Danton Jul 29 '12 at 08:42
  • 1
    @LucDanton: I did have a type-o in `align_up`. But I corrected it several hours before your comment. So I'm not positive if you're looking at an older buggy version of align_up, or if you're questioning the current environment. `{return n + (alignment-1) & ~(alignment-1);}` is the correct expression as far as I know. I've just briefly retested it, and it is behaving as I intend. – Howard Hinnant Jul 29 '12 at 16:01
  • Oh you forgot to `@` me so I didn't get notified! Interesting, thanks for the info! – user541686 Jul 29 '12 at 23:38
  • 1
    Why is calling `v.reserve(100)` still necessary? I thought it was only to avoid heap allocations. I'm surprised calling reserve() with stack allocated memory still gives a large boost. In my draughts engine, using the stack allocator + reserve() gives a 25% performance difference compared to just using the stack allocator or regular vector + reserve(). – TemplateRex Aug 01 '12 at 19:07
  • 3
    Calling `v.reserve(100)` isn't necessary. It is just a space and time optimization. If you don't call `reserve`, you will run out of stack space faster as the `arena` doesn't try to reclaim space except in the case that the deallocation is the last thing allocated. As the vector grows from space for 2 elements to 4 (for example), the space for 4 elements is allocated before the space for 2 elements is deallocated. So the former is permanently lost. – Howard Hinnant Aug 01 '12 at 19:59
  • @Howard Tnx, I understand now. In any case, std::vector + your allocator + v.reserve() has almost equal performance (~1-2% difference) to std::array, even when called a million times per second in tight inner loop. – TemplateRex Aug 02 '12 at 13:18
  • the memory cannot be reused unless deallocation is followed by allocation. For instance, p1 = sa.allocate(100); followd by p2 = sa.allocate(1); now, sa.deallocate(p1) will not actually work. – Frahm Apr 11 '13 at 07:51
  • Why the alignment is on 16 byte boundary? Also,what's the correct replacement for alignas in MSVC? – Michael IV Aug 16 '15 at 10:04
  • 1
    @MichaelIV: When I wrote this, I did so on OS X / iOS and 16 byte was the maximal alignment, and `std::max_align_t` had not yet been standardized. Today I would use `std::max_align_t` instead of 16. MSVC `alignas` workaround: I'm not a VS expert, but `__declspec(align(16))` looks promising. – Howard Hinnant Aug 16 '15 at 14:57
11

The old C++ standard makes requirements for a standard-compliant allocator: These requirements include that if you have Alloc<T> a, b, then a == b, and you can use b to deallocate things that were allocated with a. Allocators are fundamentally stateless.


In C++11, the situation got a lot more involved, as there is now support for stateful allocators. As you copy and move objects around, there are specific rules whether one container can be copied or moved from another container if the allocators differ, and how the allocators get copied or moved.

Just to answer your question first: No, you can definitely not assume that it makes sense to copy your allocator around, and your allocator may not even be copyable.

Here is 23.2.1/7 on this subject:

Unless otherwise specified, all containers defined in this clause obtain memory using an allocator (see 17.6.3.5). Copy constructors for these container types obtain an allocator by calling allocator_traits<allocator_-type>::select_on_container_copy_construction on their first parameters. Move constructors obtain an allocator by move construction from the allocator belonging to the container being moved. Such move construction of the allocator shall not exit via an exception. All other constructors for these container types take an Allocator& argument (17.6.3.5), an allocator whose value type is the same as the container’s value type. [Note: If an invocation of a constructor uses the default value of an optional allocator argument, then the Allocator type must support value initialization. —end note] A copy of this allocator is used for any memory allocation performed, by these constructors and by all member functions, during the lifetime of each container object or until the allocator is replaced. The allocator may be replaced only via assignment or swap(). Allocator replacement is performed by copy assignment, move assignment, or swapping of the allocator only if allocator_traits<allocator_type>::propagate_on_container_copy_assignment::value, allocator_traits<allocator_type>::propagate_on_container_move_assignment::value, or allocator_traits<allocator_type>::propagate_on_container_swap::value is true within the implementation of the corresponding container operation. The behavior of a call to a container’s swap function is undefined unless the objects being swapped have allocators that compare equal or allocator_traits<allocator_type>::propagate_on_container_swap::value is true. In all container types defined in this Clause, the member get_allocator() returns a copy of the allocator used to construct the container or, if that allocator has been replaced, a copy of the most recent replacement.

See also the documentation of std::allocator_traits for a synopsis.

Kerrek SB
  • 464,522
  • 92
  • 875
  • 1,084
  • Oohhhhhhh... so there's a [`scoped_allocator_adaptor`](http://en.cppreference.com/w/cpp/memory/scoped_allocator_adaptor) specifically for this purpose? I guess that means I can't just blindly copy them. :( +1 awesome answer, thanks! – user541686 Jul 28 '12 at 18:49
  • 1
    @Mehrdad: The scoped thing is for situations where you have containers of containers and you want your custom allocator to be used everywhere all the way down. – Kerrek SB Jul 28 '12 at 18:50