5

I'd like to improve the performance of a specific usage of list and map, where the number of items has a hard limit on the order of 100,000. The STL default allocator obviously isn't the best choice in this situation, since cleaning up all those thousands of small objects takes a very long time (>10 sec!). Not to mention all the other potential issues.

So to improve this I can preallocate the correct amount of memory to contain all of the list/map nodes. So far I've been able to implement a working version of a minimal allocator (letting std::allocator_traits fill in the defaults) that uses malloc/free for each node. But I'm struggling to find out how to modify this to allow for the "stateful" use of, for example, my very simple stack:

#include <cstddef>
#include <cstdlib>
#include <memory>
using namespace std;
typedef unsigned char byte; //plain byte storage; std::byte doesn't exist yet in C++11
class MemPoolStack
{
public:
    size_t Size;
    size_t Mult;
    size_t Total;
    size_t Top;
    size_t Last;
    unique_ptr<byte[]> Data;
    unique_ptr<size_t[]> Nexts;

    MemPoolStack(size_t size, size_t mult) :
        Size(size),
        Mult(mult),
        Total(size * mult),
        Top(0),
        Last(0),
        Data(new byte[Total]),
        Nexts(new size_t[Size])
    {
    }
    size_t& Next(size_t i)
    {
        return *(Nexts.get() + i);
    }
    void* Pop()
    {
        byte* p = nullptr;
        if(Top<Size)
        {
            p = Data.get() + (Top * Mult);
            bool last = (Top==Last);
            size_t next = last ? Top+1 : Next(Top);
            if(last) Next(Top) = next;
            Top = next;
            if(Top>Last) Last=Top;
        }
        else
        {
            p = nullptr;
        }
        return p;
    }
    bool Push(void* p)
    {
        ptrdiff_t diff = (byte*)p - Data.get();
        size_t index = ((size_t)diff / Mult);
        if(diff>=0 && index<Size)
        {
            Next(index) = Top;
            Top = index;
            return true;
        }
        return false;
    }
};

template <class T> struct MemPool
{
    typedef T value_type;
    MemPool() throw() {}
    template <class U> MemPool (const MemPool<U>&) throw() {}
    template <class U> struct rebind { typedef MemPool<U> other; }; //off-topic: why doesn't allocator_traits define this?
    T* allocate (size_t n) 
    {
        return static_cast<T*>(malloc(n*sizeof(T))); 
    }
    void deallocate (T* p, size_t n) 
    { 
        free(p); 
    }
};

template <class T, class U>
bool operator== (const MemPool<T>&, const MemPool<U>&) throw()
{return true;}

template <class T, class U>
bool operator!= (const MemPool<T>&, const MemPool<U>&) throw()
{return false;}

And I'm instantiating my list and map like this:

list<TKey, MemPool<TKey>> Keys;
map<TKey, MapType, less<TKey>, MemPool<MapType>> Map;

The MemPoolStack itself isn't really the issue here, it probably has bugs but it's just for example purposes. The point is that the MemPoolStack class stores a unique_ptr to the preallocated memory, and some other member variables.

The problem there is that I need to have some reference to my MemPoolStack in the MemPool class, so that all the different ways that a Visual C++11 map or list can construct my allocator all end up with one MemPoolStack instance per list or map. Then I could use MemPoolStack::Pop() in MemPool::allocate(), and MemPoolStack::Push() in MemPool::deallocate().

I also need a way to initially construct my allocator, specifying the size. I tried putting a shared_ptr<MemPoolStack> in MemPool but it ended up getting lost when the list decided to call the allocator's default constructor...

I'm also open to throwing away all of this code for a good alternative solution to the original problem.

dexy
    possible duplicate of [Looking for C++ STL-like vector class but using stack storage](http://stackoverflow.com/questions/354442/looking-for-c-stl-like-vector-class-but-using-stack-storage) – Mgetz Dec 18 '13 at 13:23
  • You don't need the stack. If you are good with a single block of stack memory, then you are good with a single block of heap. – David Heffernan Dec 18 '13 at 13:24
    STL allocators are stateless. All instances of any given allocator type must be equivalent to all other instances of that type. Sorry, that's the law of the land. – n. m. could be an AI Dec 18 '13 at 13:31
  • If you want stateful allocators, try Boost collections, I hear they explicitly support them. – n. m. could be an AI Dec 18 '13 at 13:33
    "Custom allocators may contain state. Each container or another allocator-aware object stores an instance of the supplied allocator and controls allocator replacement through std::allocator_traits. (since C++11)" (from [the reference](http://en.cppreference.com/w/cpp/memory/allocator)) – dexy Dec 18 '13 at 13:39
  • I'm just not sure how to implement said state. – dexy Dec 18 '13 at 13:39
  • @n.m. - that was the law of the land **before** C++11. – Pete Becker Dec 18 '13 at 14:08
  • you are probably using the STL wrong if it takes 10s to deallocate thousands of objects, not to mention that you probably should not be using list.... paste a small example where you experience your perf problems... – NoSenseEtAl Dec 18 '13 at 17:38
  • @NoSenseEtAl I'm implementing an LRU cache which requires storing hundreds of thousands of POD's (which are actually stored in a vector), indexed by a small key (32 bytes). The map is used as the primary lookup while the list serves as the LRU list. I don't really want to implement my own map and list, so I need to adapt the STL ones via an allocator to be more efficient for my use case. – dexy Dec 18 '13 at 17:54
  • can you sort the vector in a sense is that it is not being updated while lookups are running, or no? – NoSenseEtAl Dec 18 '13 at 18:02
  • also check out boost circular buffer for LRU if that suits your needs – NoSenseEtAl Dec 18 '13 at 18:06
  • @PeteBecker I stand corrected. – n. m. could be an AI Dec 18 '13 at 18:14
  • @NoSenseEtAl Well, essentially this is a circular buffer (LRU cache with fixed size). I'd prefer not to be moving around my data all the time (i.e a pointer to an object in the cache should remain the same), so the data goes in a fixed-size vector. But I still need the map and linked list functionality to make the cache work. I'm also a bit of a purist at the moment and this project will not be using Boost. – dexy Dec 19 '13 at 00:18

2 Answers

3

Since you want a single underlying pool, and allocators can be copied and re-bound, you can't store your state directly in the allocator.

What you can do is store a pointer (or a shared_ptr) to your state, such that copies of your allocator shallow-copy the pointer, referring to the same underlying pool.

Note that you either need to write a default constructor for your allocator, and have it create a new backing pool, or you need to create an allocator instance with a specific backing pool and pass it to the container constructor.

So this:

list<TKey, MemPool<TKey>> Keys;

will default construct an allocator (which will be something like MemPool<list<TKey>::node>), and that allocator instance will have to create its own backing state; while this:

list<TKey, MemPool<TKey>> MoreKeys(Keys);

will copy that original allocator instance via a select_on_container_copy_construction() const method you must provide (so you can make both containers, with their separate allocator instances, share the same pool); and finally this:

map<TKey, MapType, less<TKey>, MemPool<MapType>> Map(MemPool<MapType>(my_pool));

will use the specified backing pool.

Useless
  • This is essentially my line of thinking, that my `MemPool` needs a `shared_ptr` that can be reused when the allocator is copied (but I'm not intending on sharing allocators between containers). I also need a constructor `MemPool(size_t count)` or similar which can do the initial allocation of the shared_ptr. But how do I handle the default constructor case of `MemPool()`? i.e. where do I get the shared_ptr from? – dexy Dec 18 '13 at 14:10
  • I did try exactly this already, passing the allocator instance into the list constructor, as in `list<TKey, MemPool<TKey>> Keys(MemPool<TKey>(100000))`, but for some reason during that construction, `MemPool()` is being called. – dexy Dec 18 '13 at 14:12
  • I also tried `list<TKey, MemPool<_List_node<TKey,void*>>> Keys(MemPool<_List_node<TKey,void*>>(100000))` to avoid the rebind, but it didn't help. – dexy Dec 18 '13 at 14:20
    If you delete `MemPool::MemPool()`, where does it fail to compile? Or alternatively, have it call `abort` and examine your call stack in a debug build. You should be able to figure out why it's getting called. – Useless Dec 18 '13 at 14:27
    Good call. I have managed to get rid of this default constructor now, I found another piece of code of mine lurking that was somehow doing it. With the culprit removed, I've now got both the shared_ptr and my custom constructor working! I'll post the full code in another answer and mark yours as answer since using shared_ptr definitely is the way to go. Thanks! – dexy Dec 18 '13 at 15:06
1

OK, so I've gotten this working, once my brain cells were fired into action thanks to Useless.

Here's the code of the allocator (I've omitted the MemPoolStack here since it hasn't changed and probably is broken anyway, that's my next task - but the issue here was to get a working stateful allocator):

template <class T> struct MemPool
{
    typedef T value_type;
    shared_ptr<MemPoolStack> Stack; //My allocator's state goes here!
    template <class U> MemPool (const MemPool<U>& p) throw()
    {
        if(p.Stack->Mult!=sizeof(T)) //check the *target* type's size against the pool's slot size
        {
            throw runtime_error("Can't rebind MemPool to another size object. Sorry.");
        }
        Stack = p.Stack; //interestingly, this constructor is used by std::map but not std::list
    }
    template <class U> struct rebind { typedef MemPool<U> other; }; //off-topic: why doesn't allocator_traits define this?
    MemPool(size_t count) :
        Stack(new MemPoolStack(count, sizeof(T))) //I can allocate the memory here!
    {
    }
    T* allocate (size_t n) 
    {
        //Now I can return Stack->Pop() here instead of using malloc!
        if(n!=1) throw runtime_error("MemPool can only allocate one item at a time. Sorry.");
        void* p = Stack->Pop();
        if(p==nullptr) throw bad_alloc(); //allocate() must never return null: the pool is exhausted
        return static_cast<T*>(p);
        //return static_cast<T*>(malloc(n*sizeof(T)));
    }
    void deallocate (T* p, size_t n) 
    { 
        //Can use Stack->Push() here instead of free!
        if(n!=1) throw runtime_error("MemPool can only deallocate one item at a time. Sorry.");
        Stack->Push(static_cast<void*>(p));
        //free(p);
    }
};

template <class T, class U>
bool operator== (const MemPool<T>&, const MemPool<U>&) throw()
{return true;}

template <class T, class U>
bool operator!= (const MemPool<T>&, const MemPool<U>&) throw()
{return false;}

However, my instantiation of all this is now a little more long-winded:

typedef pair<size_t, typename list<TKey>::iterator> MapType;
typedef MemPool<_Tree_node<pair<TKey,MapType>,void*>> MapPoolType;
typedef MemPool<_List_node<TKey,void*>> ListPoolType;

list<TKey, ListPoolType> Keys(ListPoolType(capacity+10));
map<TKey, MapType, less<TKey>, MapPoolType> Map(MapPoolType(capacity+10));
//I use list/map capacity+10 because the containers like a few free nodes to themselves.
//Probably should investigate further as to what these numbers actually need to be.

Setting a breakpoint in MemPool::allocate() shows that the Stack member is now always populated.

Great, Hooray for C++11!

dexy