Using vector as a buffer without initializing it on resize()

Question

I want to use vector<char> as a buffer. The interface is perfect for my needs, but there's a performance penalty when resizing it beyond its current size, since the memory is initialized. I don't need the initialization, since the data will be overwritten in any case by some third-party C functions. Is there a way or a specific allocator to avoid the initialization step? Note that I do want to use resize(), not other tricks like reserve() and capacity(), because I need size() to always represent the significative size of my "buffer" at any moment, while capacity() might be greater than its size after a resize(), so, again, I cannot rely on capacity() as a significative information for my application. Furthemore, the (new) size of the vector is never known in advance, so I cannot use std::array. If vector cannot be configured that way, I'd like to know what kind of container or allocator I could use instead of vector<char, std::alloc>. The only requirement is that the alternative to vector must at most be based on STL or Boost. I have access to C++11.

`reserve`/`capacity`? Or `std::array` , or an appropriate constructor..Not very clear what you want to do/avoid.. — Karthik T, Mar 05 '13 at 09:22
Do you know size at compile time? If yes, you can use std::array — Marcin Deptuła, Mar 05 '13 at 09:22
@UmNyobe no need to measure. It's clear that resizing from 0 to 10^9 as a worst case costs in terms of performance, even if the allocator is smart enough to use memset (which cannot be guaranteed). — Martin, Mar 05 '13 at 09:28
"I need size() to give back the real size of the buffer at any moment." That's a very artificial requirement and I would put some thought into whether this is really necessary. `capacity()` does what you want and it's **not** a 'trick'. To me it seems that the only reason you dislike it is because it isn't called `size`. — us2012, Mar 05 '13 at 09:28
Pretty sure he means the **`T()`** element initializer is causing grief considering he's about to roll over all that data anyway. That about sum it up, @Martin? — WhozCraig, Mar 05 '13 at 09:32
@us2012 there is not guarantee that capacity() will be *the same as* size. — Martin, Mar 05 '13 at 09:35
@us2012: Using a vector's reserved storage is not a good idea. If the size of the vector's internal storage changes, the value of the 'fake' elements won't get copied, vector implementations with checked iterators/`operator[]` will assert if you use that funcationality to access the 'fake' elements etc. — JoeG, Mar 05 '13 at 09:39
Hm, okay, in that case I think I didn't understand the problem correctly. My apologies. — us2012, Mar 05 '13 at 09:40
@Martin: There's nothing in the standard library that you can use, and nothing I know of in boost either. You're pretty much stuck with a wrapper around a dynamically allocated array. — JoeG, Mar 05 '13 at 09:41
I agree with Joe. Plus If you want third party C functions to access the data buffer, I don't think vector is a right fit. You need a good old `char*` and manage it yourself. — UmNyobe, Mar 05 '13 at 09:44
@UmNyobe or a `std::unique_ptr` or some other smart-pointer class. However, he does lose his `size()` and other vector methods upon doing this, so he would need a containment wrapper of some sort. Depending on how it is used, he'd likely also need an iterator class as well. — WhozCraig, Mar 05 '13 at 09:49
@Martin back to your original question before the hiatus into member discussion. I think I see what you ultimately want. The vector<> constructor on my platform, for example, does this: `allocate(__n);__construct_at_end(__n);`. I think you want just the alloc, and no construct, which I brought up earlier. Is that correct? Obviously this is implementation specific in this sample case. If this is the case, I'm afraid your SOL. Unless I'm mistaken the standard *mandates* intialization of members in the vector<> class sequence, and in your case it would be value-initialization. — WhozCraig, Mar 05 '13 at 09:51
@WhozCraig yes, correct, only the allocation would be necessary. — Martin, Mar 05 '13 at 09:57
Not a duplicate, but related: http://stackoverflow.com/questions/11626299/how-to-exactly-simulate-new-tn-with-an-allocator — Luc Touraille, Mar 05 '13 at 09:57
@Martin That being the case, Joe brings up a good point about the possibility that a no-op construct/destroy method will likely get thrown out by a decent optimizer, and may be worth pursuing on your platform. The list of alternatives he presents in his answer is pretty sound. Consider them. — WhozCraig, Mar 05 '13 at 09:59
+1 But beware of the narrow-minded *"premature optimization"*-paranoids that won't like this reasonable and legitimate question so much. ;) — Christian Rau, Mar 05 '13 at 15:12
Possible duplicate of [Resizing a C++ std::vector without initializing data](http://stackoverflow.com/questions/7689406/resizing-a-c-stdvectorchar-without-initializing-data) — Ruslan, Jul 01 '16 at 18:15

Maxim Egorushkin · Answer 1 · 2019-03-11T17:22:50.080

36

It is a known issue that initialization can not be turned off even explicitly for std::vector.

People normally implement their own pod_vector<> that does not do any initialization of the elements.

Another way is to create a type which is layout-compatible with char, whose constructor does nothing:

struct NoInitChar
{
    char value;
    NoInitChar() noexcept {
        // do nothing
        static_assert(sizeof *this == sizeof value, "invalid size");
        static_assert(__alignof *this == __alignof value, "invalid alignment");
    }
};

int main() {
    std::vector<NoInitChar> v;
    v.resize(10); // calls NoInitChar() which does not initialize

    // Look ma, no reinterpret_cast<>!
    char* beg = &v.front().value;
    char* end = beg + v.size();
}

edited Mar 11 '19 at 17:22

answered Mar 05 '13 at 10:00

Maxim Egorushkin

131,725
17
180
271

Oh nice! I did not think about the char-compatible type with an empty constructor. – Martin Mar 05 '13 at 10:09
Can I consider those static_assertion's always true by standard? If I am not wrong, the standard does not mandate "to pack" that POD (with one char data member) to a char.. – Martin Mar 05 '13 at 10:27
3

@Martin the standard requires that a) the alignment of an aggregate is a multiple of the alignment of one of its members with the largest alignment requirement, b) the size of a structure is a multiple of its alignment. In other words, the standard guarantees that `NoInitChar` has the same size and alignment as `char`. Those static asserts are primarily compilable documentation. – Maxim Egorushkin Mar 05 '13 at 10:30
4

@MaximYegorushkin Really? Sounds rather that it just guarantees that its size and alignment are *mutliples* of the `char` size and alignment (whatever a compiler could gain from adding unneccessary padding, but well, the standard at least allows it). So those asserts are not true by standard, even if probably true on any reasonable platform. But interesting answer +1. – Christian Rau Mar 05 '13 at 15:27
@ChristianRau You are right that it does not have to use x1 multiplier even it would be enough. However, apart from members there can only be padding in the struct, but compilers only add padding where it is necessary, IIRC, so the compiler would not make your structures larger than they should be (as long as a) and b) satisfied). Also, making structures larger than necessary would violate the principle of least surprise and probably won't help sell the compiler or keep the users happy. – Maxim Egorushkin Mar 05 '13 at 15:30
@ChristianRau Thinking more about it, there are three types of struct padding: at the beginning, between members and at the end. The standard prohibits any padding at the beginning (the pointer to a structure equals to the pointer to its first member). There is no padding between members in this case since there is only one member. There could only be trailing padding to ensure that elements of `NoInitChar` array are sufficiently aligned. In this case no trailing padding is necessary. – Maxim Egorushkin Mar 05 '13 at 16:32
1

@MaximYegorushkin Yes, *"neccessary"* or *"allowed"*? That's the question (however theoretical that may be). – Christian Rau Mar 05 '13 at 16:39
@ChristianRau The standard does not require that there is no trailing padding if it is not required. Don't know why, it may just be under-specification we've seen before (like the requirement that `std::string` buffer is contiguous). – Maxim Egorushkin Mar 05 '13 at 16:45
1

Interestingly, this technique is only well-defined since C++11. In C++03 `std::vector::vector` and `std::vector::resize` take default parameter `const T& value=T()`, which means that the uninitialized value would be copied over the whole vector, thus leading to UB due to read of uninitialized variable. – Ruslan Aug 23 '16 at 12:54
@Ruslan `char` types are required to have no padding bits and no signaling representation. Therefore reading an uninitialized `char` and writing it is not UB, IIRC. – Maxim Egorushkin Aug 23 '16 at 12:57
1

_4.1/1 Lvalue-to-rvalue conversion_ says: _If the object to which the lvalue refers <...> is uninitialized, a program that necessitates this conversion has undefined behavior._ I believe this conversion is necessary for any copy operation regardless of type and its traits. – Ruslan Aug 23 '16 at 13:44
The `NoInitChar` trick did not work for me. Indeed, in Visual Studio 2015, it was even slower than using a `std::vector`. I assume this trick is highly compiler/library dependent, and you can not rely on it. – user2328447 Dec 28 '17 at 12:04
1

@user2328447 `NoInitChar` constructor must be inline, otherwise you get poor performance, as you observed. – Maxim Egorushkin Dec 28 '17 at 12:18
@MaximEgorushkin Maybe edit `inline` into the answer? – Keith M Mar 11 '19 at 17:12
1

@KeithM https://en.cppreference.com/w/cpp/language/inline: _A function defined entirely inside a class/struct/union definition, whether it's a member function or a non-member friend function, is implicitly an inline function._. People should benchmark optimised versions – Maxim Egorushkin Mar 11 '19 at 17:22

JoeG · Accepted Answer · 2013-03-05T13:34:05.320

21

There's nothing in the standard library that meets your requirements, and nothing I know of in boost either.

There are three reasonable options I can think of:

Stick with std::vector for now, leave a comment in the code and come back to it if this ever causes a bottleneck in your application.
Use a custom allocator with empty construct/destroy methods - and hope your optimiser will be smart enough to remove any calls to them.
Create a wrapper around a a dynamically allocated array, implementing only the minimal functionality that you require.

edited Mar 05 '13 at 13:34

answered Mar 05 '13 at 09:51

JoeG

12,994
1
38
63

2

+1 The second item is a *big* hope, but it will probably see the light if the optimizer is reasonably intelligent. The third obviously grants the most control, but at the price of more code to maintain (but would it really be worse than a custom allocator derivation?, probably not.) – WhozCraig Mar 05 '13 at 09:57
6

@WhozCraig: I would throw away any compiler that would failed to remove calls to empty functions, this is a really straight-forward optimization. I think the custom allocator is the cleanest and simplest way to go. – Luc Touraille Mar 05 '13 at 10:01
I think I can derive the standard allocator privately and export all the member functions from that allocator (via using) except construct() and destroy(), which I will redefine as empty. The g++/clang++ compilers should be smart enough to optimize the calls out. – Martin Mar 05 '13 at 10:06
@Martin it will be interesting to see what comes of it. Don't forget to provide a `rebind`. If you've never written a `std::allocator` compliant allocator before, you'll eventually get to that. – WhozCraig Mar 05 '13 at 10:09
@LucTouraille likewise, but I've seen stranger things. You couldn't get me to the bug-filer fast enough if it didn't throw it out. – WhozCraig Mar 05 '13 at 10:10
Method 2 should work, but it won't with gcc (4.7.2), I am afraid. GNU `std::vector` does not use the allocator to initialize the elements. – Maxim Egorushkin Mar 05 '13 at 10:23
@MaximYegorushkin how is that possible? Does not it have to use the allocator to constuct the elements by standard? – Martin Mar 05 '13 at 10:30
@Martin it probably has to, but I just stepped through `std::vector::resize` and it uses `std::_Construct` to initialize new elements, not the allocator. – Maxim Egorushkin Mar 05 '13 at 10:33
3

@MaximYegorushkin, for trivial types it uses `_Construct` if `std::allocator` is used, because the result is "as if" each element had been initialized by `std::allocator::construct`, but for a custom allocator that optimization is not used. See `` and the `__uninitialized_fill_n_a` function which is overloaded on the allocator type and either dispatches to `__uninitialized_fill_n` or `allocator_traits::construct` – Jonathan Wakely Mar 05 '13 at 14:15
@JonathanWakely thanks Jon, I was hoping someone of gcc fame would correct me. – Maxim Egorushkin Mar 05 '13 at 14:24
1

The `default_init_constructor` of [this answer](http://stackoverflow.com/a/21028912/651937) implements an allocator whose `construct` method default initializes if no arguments are given, i.e., it implements solution 2 proposed by Joe Gauderin. – ingomueller.net Sep 22 '15 at 15:17

Pavel P · Answer 3 · 2019-07-16T11:10:30.770

As an alternative solution that works with vectors of different pod types:

template<typename V>
void resize(V& v, size_t newSize)
{
    struct vt { typename V::value_type v; vt() {}};
    static_assert(sizeof(vt[10]) == sizeof(typename V::value_type[10]), "alignment error");
    typedef std::vector<vt, typename std::allocator_traits<typename V::allocator_type>::template rebind_alloc<vt>> V2;
    reinterpret_cast<V2&>(v).resize(newSize);
}

And then you can:

std::vector<char> v;
resize(v, 1000); // instead of v.resize(1000);

This is most likely UB, even though it works properly for me for cases where I care more about performance. Difference in generated assembly as produced by clang:

test():
        push    rbx
        mov     edi, 1000
        call    operator new(unsigned long)
        mov     rbx, rax
        mov     edx, 1000
        mov     rdi, rax
        xor     esi, esi
        call    memset
        mov     rdi, rbx
        pop     rbx
        jmp     operator delete(void*)

test_noinit():
        push    rax
        mov     edi, 1000
        call    operator new(unsigned long)
        mov     rdi, rax
        pop     rax
        jmp     operator delete(void*)

Three cheers for having to use dubious undefined behavior because of a glaring language oversight. I'd humbly recommend renaming the function to `resize_hack` or similar in recognition of the fact. — Roflcopter4, Oct 11 '21 at 09:49

score 3 · Answer 4 · answered Apr 21 '20 at 14:59

So to summarize the various solutions found on stackoverflow:

use a special default-init allocator. (https://stackoverflow.com/a/21028912/1984766)
drawback: changes the vector-type to std::vector<char, default_init_allocator<char>> vec;
use a wrapper-struct struct NoInitChar around a char that has an empty constructor and therefore skips the value-initialization (https://stackoverflow.com/a/15220853/1984766)
drawback: changes the vector-type to std::vector<NoInitChar> vec;
temporarily cast the vector<char> to vector<NoInitChar> and resize it (https://stackoverflow.com/a/57053750/1984766)
drawback: does not change the type of the vector but you need to call your_resize_function (vec, x) instead of vec.resize (x).

With this post I wanted to point out that all of these methods need to be optimized by the compiler in order to speed up your program. I can confirm that the initialization of the new chars when resizing is indeed optimized away with every compiler I tested. So everything looks good ...
But --> since method 1 & 2 change the type of the vector, what happens when you use these vectors under more "complex" circumstances.
Consider this example:

#include <time.h>
#include <vector>
#include <string_view>
#include <iostream>

//high precision-timer
double get_time () {
    struct timespec timespec;
    ::clock_gettime (CLOCK_MONOTONIC_RAW, &timespec);
    return timespec.tv_sec + timespec.tv_nsec / (1000.0 * 1000.0 * 1000.0);
}

//method 1 --> special allocator
//reformated to make it readable
template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
private:
    typedef std::allocator_traits<A> a_t;
public:
    template<typename U>
    struct rebind {
        using other = default_init_allocator<U, typename a_t::template rebind_alloc<U>>;
    };
    using A::A;

    template <typename U>
    void construct (U* ptr) noexcept (std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(ptr)) U;
    }
    template <typename U, typename...Args>
    void construct (U* ptr, Args&&... args) {
        a_t::construct (static_cast<A&>(*this), ptr, std::forward<Args>(args)...);
    }
};

//method 2 --> wrapper struct
struct NoInitChar {
public:
    NoInitChar () noexcept { }
    NoInitChar (char c) noexcept : value (c) { }
public:
    char value;
};

//some work to waste time
template<typename T>
void do_something (T& vec, std::string_view str) {
    vec.push_back ('"');
    vec.insert (vec.end (), str.begin (), str.end ());
    vec.push_back ('"');
    vec.push_back (',');
}

int main (int argc, char** argv) {
    double timeBegin = get_time ();

    std::vector<char> vec;                                 //normal case
    //std::vector<char, default_init_allocator<char>> vec; //method 1
    //std::vector<NoInitChar> vec;                         //method 2
    vec.reserve (256 * 1024 * 1024);
    for (int i = 0; i < 1024 * 1024; ++i) {
        do_something (vec, "foobar1");
        do_something (vec, "foobar2");
        do_something (vec, "foobar3");
        do_something (vec, "foobar4");
        do_something (vec, "foobar5");
        do_something (vec, "foobar6");
        do_something (vec, "foobar7");
        do_something (vec, "foobar8");
        vec.resize (vec.size () + 64);
    }

    double timeEnd = get_time ();
    std::cout << (timeEnd - timeBegin) * 1000 << "ms" << std::endl;
    return 0;
}

You would expect that method 1 & 2 outperform the normal vector with every "recent" compiler since the resize is free and the other operations are the same. Well think again:

                g++ 7.5.0   g++ 8.4.0   g++ 9.3.0   clang++ 9.0.0
vector<char>         95ms       134ms       133ms            97ms
method 1            130ms       159ms       166ms            91ms
method 2            166ms       160ms       159ms            89ms

All test-applications are compiled like this and executed 50times taking the lowest measurement:

$(cc) -O3 -flto -std=c++17 sample.cpp

score 2 · Answer 5 · answered Mar 05 '13 at 09:51

2

Encapsulate it.

Initialise it to the maximum size (not reserve).

Keep a reference to the iterator representing the end of the real size, as you put it.

Use begin and real end, instead of end, for your algorithms.

answered Mar 05 '13 at 09:51

Peter Wood

23,859
5
60
99

score 2 · Answer 6 · answered Nov 29 '20 at 08:24

It's very rare that you would need to do this; I strongly encourage you to benchmark your situation to be absolutely sure this hack is needed.

Even then, I prefer the NoInitChar solution. (See Maxim's answer)

But if you're sure you would benefit from this, and NoInitChar doesn't work for you, and you're using clang, or gcc, or MSVC as your compiler, consider using folly's routine for this purpose.

See https://github.com/facebook/folly/blob/master/folly/memory/UninitializedMemoryHacks.h

The basic idea is that each of these library implementations has a routine for doing uninitialized resizing; you just need to call it.

While hacky, you can at least console yourself with knowing that facebook's C++ code relies on this hack working properly, so they'll be updating it if new versions of these library implementations require it.

Using vector as a buffer without initializing it on resize()

6 Answers6

Linked

Related