Is it more efficient to preallocate a vector?

Question

In C++ Primer fourth edition, by Stanley B.Lippman, Josee Lajoie and Barbara E. Moo it states:

Because vectors grow efficiently, it is usually best to let the vector grow by adding elements to it dynamically as the element values are known.

and

Readers accustomed to using c or java might expect that because vector elements are stored contiguously, it would be best to preallocate the vector at its expected size. In fact the contrary is the case...

and

Allthough we can preallocate a given number of elements in a vector, it is usually more efficient to define an empty vector and add elements to it.

Assuming this is correct (the authors are as reputable as they come, one is a co-author of C++ itself) then can anyone give me a case that proves this statement, and explain why?

In many cases it is more efficient. But the real questions should be: "by how much?" and "what is the price we pay for that?". — Victor Yarema, Mar 12 '23 at 19:47

GManNickG · Answer 1 · 2012-08-09T17:07:28.523

42

It depends.

If you don't know what the final size will be, then let the vector allocate using its allocation scheme (usually doubles each time, or somewhere around there). This way you avoid reallocating for every single element:

std::vector<int> v;

// good:
for (/* populate v */) // unknown number of iterations
{
    v.push_back(i); // possible reallocation, but not often
}

// bad:
for (/* populate v */) // unknown number of iterations
{
    v.reserve(v.size() + 1); // definite reallocation, every time
    v.push_back(i); // (no reallocation)
}

But if you know ahead of time you won't be reallocating, then preallocate:

std::vector<int> v;

// good:
v.reserve(10); 
for (/* populate v */) // only 10 iterations (for example)
{
    v.push_back(i); // no reallocations
}

// not bad, but not the best:
for (/* populate v */) // only 10 iterations (for example)
{
    v.push_back(i); // possible reallocation, but not often (but more than needed!)
}

edited Aug 09 '12 at 17:07

answered Aug 09 '12 at 17:02

GManNickG

494,350
52
494
543

1

I'll add some more context to my original question, but I think it implies that even if the size is known up front, it is usually more efficient to let the vector grow. – dangerousdave Aug 09 '12 at 17:08
1

@dangerousdave: Okay, but if you know the size up front, just allocate once and you're done. It truly isn't better to let it grow, which will allocate unnecessarily often. – GManNickG Aug 09 '12 at 17:11
2

Most of the time the difference will be small enough that nobody will notice. Then it is more efficient to write less code. – Bo Persson Aug 09 '12 at 17:18
6

@BoPersson: That's just not true, I think. Memory allocation generally isn't lightweight, and the work done by reallocating *does* add up, I've seen it in my own applications. I am of course assuming we're talking "real" amounts of data, not tiny sandbox vectors. – GManNickG Aug 09 '12 at 17:20
If and when you notice a difference, do something about it. Nobody will notice whether you do a `v.reserve(10)`, or not. So don't. – Bo Persson Aug 09 '12 at 17:22
11

@BoPersson: I'm part of the "don't pre-maturely optimize" crowd too, but the *reason* I'm in that crowd is because people often waste time trying to make optimizations that aren't even in the right spots. If making those optimizations took no time, then I wouldn't stop them. This optimization takes no time, and I would expect any competent programmer to preallocate a buffer to it's required size when possible to do so, it's literally a single line of code. This is like choosing the right sort implementation, a regular design decision, not like trying to optimize `i++`. – GManNickG Aug 09 '12 at 17:30

Akavall · Accepted Answer · 2017-05-22T16:43:38.583

I timed this simple example:

#include<iostream>
#include<vector>

int main() {

    int limit = 100 * 1000 * 1000;
    std::vector<long> my_vec;
    my_vec.reserve(limit); // comment out this line to not preallocate

    for (int i=0; i < limit; i++) {
        my_vec.push_back(i);
    }

    long my_sum = 0;
    for (int i=0; i < limit; i++) {
        my_sum += my_vec[i];
    }

    std::cout << my_sum << std::endl;
    return 0;
}

Complied with:

g++ -std=c++11 -O2 my_file.cpp -o my_exec

And found the difference to be substantial:

Without preallocation:

real    0m3.366s
user    0m1.656s
sys     0m1.660s

With preallocation:

real    0m1.688s
user    0m0.732s
sys     0m0.936s

My conclusion here is: If building a vector is a big part of the program, then preallocating for efficiency makes sense. However, building a larger vector over and over is unlikely, and thus it is rarely a bottle neck. However, using reserve() has other advantages besides preallocating.

Bjarne Stroustrup in The C++ programming language (4th addition) has this to say:

I used to be careful about using reserve() when I was reading into a vector. I was surprised to find that for essentially all my uses, calling reserve() did not measurably affect performance. The default growth strategy worked just as well as my estimates, so I stopped trying to improve performance using reserve(). Instead I use it to increase predictability of reallocation delays and to prevent invalidation of pointers and iterators.

score 7 · Answer 3 · answered Aug 09 '12 at 16:57

7

It can be. It depends a lot on what the elements are, how much work it is to copy or construct them, and how many there are.

If you preallocate a vector you will end up calling the default constructor for each element to make empty elements, and then copying the item over the space later. If you add elements it can just copy or construct the element in place which may be more efficient.

answered Aug 09 '12 at 16:57

jcoder

29,554
19
87
130

17

Preallocating doesn't entail default constructing. `v.reserve(n); // constructs nothing` – GManNickG Aug 09 '12 at 17:02
1

However, if you use `reserve`, it doesn't do any of the construction up front, and only the allocation. Which _might_ be even better. – Dave S Aug 09 '12 at 17:03
Perhaps preallocating *does* mean default construction in the quote? That would make it more obviously true. – Bo Persson Aug 09 '12 at 17:15
1

@GManNickG: I think that the quote, since it says "preallocating elements" and not "preallocating space for elements", it is referring to `resize` (or the constructor that takes a size), rather than `reserve`. – Benjamin Lindley Aug 09 '12 at 17:16
1

A fair bit of semantic ambiguity there, so I could see that. I think more important is what *actually* should be done, though, and not caring too much for what the author's words literally mean. – GManNickG Aug 09 '12 at 17:18
Thus is why I said it can be. It depends on exactly what you do and your classes a lot. Reserve and push_back is often the best choice – jcoder Aug 09 '12 at 17:37
1

to clarify, by "preallocate", given the context, I believe it is referring to vector svec(10,"hi") – dangerousdave Aug 10 '12 at 09:11

Is it more efficient to preallocate a vector?

3 Answers3

Linked