
I want to use a vector with the custom allocator below, in which construct() and destroy() have an empty body:

struct MyAllocator : public std::allocator<char> {
    typedef std::allocator<char> Alloc;
    //void destroy(Alloc::pointer p) {} // pre-C++11
    //void construct(Alloc::pointer p, Alloc::const_reference val) {} // pre-C++11
    template< class U > void destroy(U* p) {}
    template< class U, class... Args > void construct(U* p, Args&&... args) {}
    template<typename U> struct rebind {typedef MyAllocator other;};
};

Now for the reasons I have specified in another question, the vector has to be resized several times in a loop. To simplify my tests on performance, I made a very simple loop like the following:

std::vector<char, MyAllocator> v;
v.reserve(1000000); // or more. Make sure there is always enough allocated memory
while (true) {
   v.resize(1000000);
   // sleep for 10 ms
   v.clear(); // or v.resize(0);
}

I noticed that resizing this way increases CPU consumption from 30% to 80%, even though the allocator's construct() and destroy() member functions are empty. I would have expected minimal or no impact on performance (with optimization enabled). How is that increase possible? A second question: when reading the memory after any resize, why is the value of each char in the resized memory 0? (I would expect some non-zero values, since construct() does nothing.)

My environment is g++ 4.7.0 with -O3 optimization enabled, on a dual-core Intel PC with 4 GB of free memory. Apparently the calls to construct could not be optimized out at all?

Martin
  • Have you verified your `construct` function is being called? Also (or perhaps the problem...) you shouldn't inherit from `std::allocator` publicly, it's not meant to be a base class. – GManNickG Mar 06 '13 at 01:51
  • Please provide an http://sscce.org that we can just copy, paste and run. – Xeo Mar 06 '13 at 02:02
  • @GManNickG, inheriting publicly from `std::allocator` isn't the problem. `std::allocator` is stateless and empty, and allocators are used by value, not by pointer-to-base. In C++11 the minimal Allocator requirements are so simple that there's less benefit from inheriting from `std::allocator`, but it's still useful for C++03 compatibility (and most of GCC's containers still use the C++03 requirements) – Jonathan Wakely Mar 06 '13 at 10:48

2 Answers


Updated

This is a complete rewrite. There was an error in the original post/my answer which made me benchmark the same allocator twice. Oops.

Well, I can see huge differences in performance. I made the following test bed, which takes several precautions to ensure the crucial work isn't completely optimized out. I then verified (with -O0 -fno-inline) that the allocator's construct and destroy functions are called the expected number of times:

#include <cstdlib>
#include <memory>   // std::allocator
#include <vector>

template<typename T>
struct MyAllocator : public std::allocator<T> {
    typedef std::allocator<T> Alloc;
    //void destroy(typename Alloc::pointer p) {} // pre-C++11
    //void construct(typename Alloc::pointer p, typename Alloc::const_reference val) {} // pre-C++11
    template< class U > void destroy(U* p) {}                                  // deliberately a no-op
    template< class U, class... Args > void construct(U* p, Args&&... args) {} // deliberately a no-op
    // Rebind to MyAllocator<U>; otherwise the rebind inherited from
    // std::allocator would select std::allocator<U>.
    template<typename U> struct rebind {typedef MyAllocator<U> other;};
};

int main()
{
    typedef char T;
#ifdef OWN_ALLOCATOR
    std::vector<T, MyAllocator<T> > v;
#else
    std::vector<T> v;
#endif
    volatile unsigned long long x = 0; // volatile accumulator keeps the reads from being optimized away
    v.reserve(1000000); // or more. Make sure there is always enough allocated memory
    for(auto i=0ul; i< 1<<18; i++) {
        v.resize(1000000);
        x += v[rand()%v.size()];//._x;
        v.clear(); // or v.resize(0);
    }
}

The timing difference is marked:

g++ -g -O3 -std=c++0x -I ~/custom/boost/ test.cpp -o test 

real    0m9.300s
user    0m9.289s
sys     0m0.000s

g++ -g -O3 -std=c++0x -DOWN_ALLOCATOR -I ~/custom/boost/ test.cpp -o test 

real    0m0.004s
user    0m0.000s
sys     0m0.000s

I can only assume that what you are seeing is related to the standard library optimizing allocator operations for char (it being a POD type).

The timings get even farther apart when you use

struct NonTrivial
{
    NonTrivial() { _x = 42; }
    virtual ~NonTrivial() {}
    char _x;
};

typedef NonTrivial T;

In this case, the default allocator takes in excess of 2 minutes (still running), whereas the 'dummy' MyAllocator takes ~0.006s. (Note that this invokes undefined behaviour, because it references elements that haven't been properly initialized.)
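
The check that construct is called the expected number of times can also be sketched in code rather than in a debugger. The following is only an illustration under assumptions: `CountingAllocator` and `constructs_for_resize` are hypothetical names that do not appear in the original test bed, and the counts rely on the standard library routing default insertion through the allocator, which C++11-and-later libraries are expected to do for custom allocators.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical instrumented allocator (not from the original post):
// construct() and destroy() only bump counters and never touch the memory,
// so it should only be used with trivial element types such as char.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    static std::size_t constructs;
    static std::size_t destroys;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    // Memory management is delegated to std::allocator.
    T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }

    template <typename U, typename... Args>
    void construct(U*, Args&&...) { ++constructs; }
    template <typename U>
    void destroy(U*) { ++destroys; }
};

template <typename T> std::size_t CountingAllocator<T>::constructs = 0;
template <typename T> std::size_t CountingAllocator<T>::destroys = 0;

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// Returns how many construct() calls a single resize(n) triggers.
std::size_t constructs_for_resize(std::size_t n) {
    std::vector<char, CountingAllocator<char>> v;
    v.reserve(n);
    const std::size_t before = CountingAllocator<char>::constructs;
    v.resize(n);
    return CountingAllocator<char>::constructs - before;
}
```

Calling `constructs_for_resize(1000000)` should report one construct call per added element if the library honours the allocator; a result of zero would suggest the container bypasses the hook.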

sehe
  • "Takes almost exactly as long independent of whether OWN_ALLOCATOR is defined." - that repeats the observation in the question - namely that the custom allocator doesn't improve performance - without beginning to address why. – Tony Delroy Mar 06 '13 at 04:12
  • The `rebind` means that `vector` will always use `allocator` and not your own type, so you're benchmarking `std::allocator` versus `std::allocator`, no wonder it takes the same time – Jonathan Wakely Mar 06 '13 at 10:17
  • @JonathanWakely That was a silly error. I fixed my answer, also adding some more precautions to ensure the validity of the comparisons against dead-code elimination under the as-if rule (the `volatile` accumulator and `NonTrivial`). – sehe Mar 06 '13 at 10:40
  • (did you mean to put the **rewrite pending** on the question too?) – Jonathan Wakely Mar 06 '13 at 10:49
  • @JonathanWakely ahahahaha - that took me a while to notice! So, **that's** where it went. Gee. I better stay low profile today – sehe Mar 06 '13 at 11:01

(With corrections thanks to GManNickG and Jonathan Wakely below)

In C++11, with the post-Standard correction proposed at http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3346.pdf, resize() will construct added elements using the custom allocator.

In earlier versions, resize() value initialises the elements added, which takes time.

These initialisation steps have nothing to do with memory allocation; they are what's done to the memory after it's allocated. In those earlier versions, value initialisation is an unavoidable expense.

Given the state of C++11 Standards compliance in current compilers, it would be worth looking at your headers to see which approach is in use.
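
Where the library does follow the N3346 wording, one way to take advantage of it is an allocator whose argument-less construct default-initialises instead of value-initialising. The sketch below is an assumption-laden illustration, not code from the question: `DefaultInitAllocator` and `make_buffer` are hypothetical names, and the benefit only appears if resize() really routes through the allocator's construct.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical allocator (not from the original post).
template <typename T>
struct DefaultInitAllocator {
    using value_type = T;

    DefaultInitAllocator() = default;
    template <typename U>
    DefaultInitAllocator(const DefaultInitAllocator<U>&) noexcept {}

    // Memory management is delegated to std::allocator.
    T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }

    // No constructor arguments: default-initialise, which leaves trivial
    // types such as char untouched instead of zero-filling them.
    template <typename U>
    void construct(U* p) { ::new (static_cast<void*>(p)) U; }

    // With arguments: construct normally, so copies and fills still work.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }
};

template <typename T, typename U>
bool operator==(const DefaultInitAllocator<T>&, const DefaultInitAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const DefaultInitAllocator<T>&, const DefaultInitAllocator<U>&) { return false; }

// Usage sketch: grow a buffer without asking for the new bytes to be zeroed.
std::vector<char, DefaultInitAllocator<char>> make_buffer(std::size_t n) {
    std::vector<char, DefaultInitAllocator<char>> v;
    v.resize(n);  // new chars are default-initialised, so their values are indeterminate
    return v;
}
```

Unlike an empty construct(), this keeps non-trivial element types safe, because their default constructors still run; the elements of a char buffer must still be written before they are read.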

The value initialisation was sometimes unnecessary and inconvenient, but also protected a lot of programs from unintended mistakes. For example, someone might think they can resize a std::vector<std::string> to have 100 "uninitialised" strings, then start assigning into them before reading from them, but a prerequisite for the assignment operator is that the object being changed has been properly constructed... otherwise it'll likely find a garbage pointer and try to delete[] it. Only careful placement new-ing of each element can safely construct them. The API design errs on the side of robustness.
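
As a minimal sketch of what "careful placement new-ing" involves, the function below builds three strings in raw storage, assigns to them only after they have been constructed, and destroys them explicitly. The function name and the fixed count of three are illustrative assumptions, not part of the original discussion.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Illustrative helper (hypothetical name): builds three strings in raw storage.
std::string join_from_raw_storage() {
    typedef std::string String;
    const std::size_t n = 3;

    // Raw, suitably aligned bytes: no String object exists here yet,
    // so assigning through a String* at this point would be undefined.
    alignas(String) unsigned char raw[n * sizeof(String)];
    String* elems[n];

    // Placement new starts each object's lifetime; only then is assignment valid.
    for (std::size_t i = 0; i < n; ++i)
        elems[i] = ::new (static_cast<void*>(raw + i * sizeof(String))) String();

    *elems[0] = "a";
    *elems[1] = "b";
    *elems[2] = "c";
    String result = *elems[0] + *elems[1] + *elems[2];

    // Objects created with placement new must be destroyed explicitly.
    for (std::size_t i = 0; i < n; ++i)
        elems[i]->~String();

    return result;
}
```

Skipping the placement-new loop and assigning straight into the raw bytes is the mistake described above: the assignment operator would read whatever garbage happens to be in the storage.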

Tony Delroy
  • It does this by calling the allocator's `construct`, no? – GManNickG Mar 06 '13 at 01:52
  • @GManNickG: in C++98, the full function prototype is `void resize (size_type n, value_type val = value_type());`... inside the function it doesn't know whether "val" was specified by the caller or picked up the default value... it just iterates over the elements copy-constructing each in turn. For example, in GCC 3.4.4's implementation, `resize()` calls `insert` which calls `_M_fill_insert` which calls `copy_backward` then `fill`.... – Tony Delroy Mar 06 '13 at 02:01
  • @GManNickG: for C++11, from http://www.cplusplus.com/reference/vector/vector/resize/ - "If n is greater than the current container size, the content is expanded by inserting at the end as many elements as needed to reach a size of n. If val is specified, the new elements are initialized as copies of val, otherwise, they are value-initialized." – Tony Delroy Mar 06 '13 at 02:03
  • That site is incorrect. According to the standard, when no value is specified (the single-argument overload of `resize` is used) then to make up for missing elements, they are *default inserted* into the container. Had the container used the default allocator, this would indeed mean value-initialized but that site fails to take into account custom allocators. The correct general form for default insertion is: `allocator_traits<A>::construct(m, p);` where `m` is the allocator (and `A` is its type) and `p` is the location where a `T` will be constructed (`T` being the value type). – GManNickG Mar 06 '13 at 02:14
  • That is, it calls `construct` with just a pointer, no other constructor arguments. For his allocator, this means no-op. – GManNickG Mar 06 '13 at 02:14
  • @GManNickG: the site's correct for pre-C++11, and consistent with the C++11 final draft - n3242 - "Effects: If sz < size(), equivalent to erase(begin() + sz, end());. If size() < sz, appends sz - size() value-initialized elements to the sequence.". Are you using the final C++11 Standard? – Tony Delroy Mar 06 '13 at 02:19
  • Interesting. The post-C++11 draft I read is indeed different, though now it's not near me so I cannot tell you the date nor version. I suspect, though, that the new reading I've quoted will be the newest standard form. But you're right that current implementations cannot be blamed, as that's not what they're supposed to do! – GManNickG Mar 06 '13 at 03:11
  • I checked draft n3337 - which many websites say has only "minor editorial changes" from the final Standard - it also says "Effects: If sz <= size(), equivalent to erase(begin() + sz, end());. If size() < sz, appends sz - size() value-initialized elements to the sequence.". So perhaps the behaviour you document above is a post-C++11 Standard proposal...? – Tony Delroy Mar 06 '13 at 04:19
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/25644/discussion-between-gmannickg-and-tony-d) – GManNickG Mar 06 '13 at 05:16
  • See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3346.pdf which corrected the spec of `resize` to use the new _DefaultInsertable_ concept – Jonathan Wakely Mar 06 '13 at 10:29
  • @JonathanWakely: that's terrific... good to see the allocators becoming useful. Thanks GManNickG too (sorry - couldn't join you in chat - firewalled). – Tony Delroy Mar 07 '13 at 02:12