std::vector capacity smart implementation

Question

I know that std::vector capacity behavior is implementation specific, is there any smart implementation that does this :

vector<int> v;
for(int i = 0; i < 10000 ; ++i){
    v.push_back(i);
}

At initialisation, it can predict the capacity of the 'vector', in this example it will initiate the capacity to 10000

I am asking for this because I always thought gcc does this kind of predictions, but I couldn't find anything about this ... I think I have seen this somewhere, so is there any implementation that does this ?

There's nothing like that in any mainstream implementation, and it wouldn't *necessarily* be desirable in the general case, as normally if it throws `bad_alloc` during resize you end up with a partially populated `v`, while with your proposal you've have nothing at all. Similar issues with more complex types that can throw during copying etc.. — Tony Delroy, Aug 25 '15 at 07:04
You should predict it manually... you can call a "reserve" to increase the capacity, so the vector won't copy the elems X times. — Melkon, Aug 25 '15 at 07:08
@Melkon I know I can predict it manually, but the compiler also can predict it staticly — Othman Benchekroun, Aug 25 '15 at 07:18
In this trivial case yes, it could predict, but in a real code you will probably never write something like this, so i guess it's not a priority for the compiler developers to optimize things like this. — Melkon, Aug 25 '15 at 07:19
@Melkon I know it wouldn't be possible in most cases, but in a case like this one, it would avoid many copies ... At least it could try, if it didn't predict it well, then it still can resize — Othman Benchekroun, Aug 25 '15 at 07:23
I don't know, probably any kind of optimisation slows down compilation, maybe they just don't want to slow down compilation by trying to optimise things which is almost never happen. I think it's the programmers job to see how big the vector will be in a case like this. I don't know the exact reasons, but it's really a never happen case, you mostly iterate over containers and stuffs, in real world you don't know how much iteration you need 99,9% of times. — Melkon, Aug 25 '15 at 07:27
@Melkon you are probably right. It's just that I was pretty sure the compiler does this, and I still don't know why — Othman Benchekroun, Aug 25 '15 at 07:31
@OthmanBenchekroun *"what resize?"* - if you `push_back` large numbers of elements without any prior `reserve()`, `vector` will `resize()` each time `size()` grows beyond `capacity()`. That behaviour has observable consequences, including in whether the allocations themselves might `throw`. Any of these many behavioural differences precludes optimisation under the "as if" rule, meaning the Standard would need to explicitly grant permission for such an optimisation. It doesn't. — Tony Delroy, Aug 25 '15 at 09:52
If the compiler does what I asked for, it wouldn't need to resize. This is exactly the point of my question — Othman Benchekroun, Aug 25 '15 at 13:27

score 3 · Answer 1 · answered Aug 25 '15 at 07:35

3

Nothing get predicted. However:

one can use reserve to preallocate the maximum required amount of elements. push_back will then never need to reallocate.
push_back use the growth strategy of vector that allocate more than just one mor element. IIRC the growth factor is 2, which means that the number of reallocation in a serie of push_back tends to become logarithmic. Therefore, the cost of N calls to push_back converges toward log2(N).

answered Aug 25 '15 at 07:35

Joel Falcou

6,247
1
17
34

1

The original growth factor of 2 was later described as a bad mistake causing memory overuse. – curiousguy Aug 25 '15 at 07:48
@curiousguy: by whom? one person's memory overuse is another's poor performance... as a necessary compromise, 2 is at least in the right ballpark. – Tony Delroy Aug 25 '15 at 09:49
1

Asymptotically the cost of push_back is `log2` however that keeps into account unlimited memory. WIth limited memory (4 or 8 GB in example) you can grow by a minor factor and at same time save some memory without decreasing performance by too much. The perfect way is reserving the space in advance – CoffeDeveloper Aug 25 '15 at 13:56
@TonyD by those who count: committee members, implementers. 2 growth means that at worse you have 100% overhead in your vector. – curiousguy Aug 25 '15 at 18:51
@curiousguy: I'd be interested in the discussion - any links/references? (Seems to me that if there was any kind of consensus amongst the "people who count" that 2 was not "good", it wouldn't be common in implementations - it's not actually specified in the Standard so can be changed freely by implementations at any time, as long as they don't pick a resizing scheme that invalidates the complexity requirements....) – Tony Delroy Aug 26 '15 at 04:49
@TonyD Growth factor was specified in the std in a very particular case. Other than that, most std STL implementations were at the time copies of SGI STL with few modifications. – curiousguy Aug 26 '15 at 05:24
FWIW, I remember (but cant find it) a blog post showing that x2 growth factor inhibit some modern OS into recycling memory efficiently as after N allocation of 1 2 4 .. 2^N bytes, then deallocating then allocating 2^N+1 can't fit into the free space of the 1 ... 2^N now deallocated memory block. – Joel Falcou Aug 28 '15 at 09:44

blashser · Answer 2 · 2015-08-25T07:30:47.657

-4

It exists different constructor for std::vector. One of these possibilities is to say the default value and the number of values that you want to your vector.

From the documentation of std::vector:

// constructors used in the same order as described above:
std::vector<int> first;                                // empty vector of ints
std::vector<int> second (4,100);                       // four ints with value 100
std::vector<int> third (second.begin(),second.end());  // iterating through second
std::vector<int> fourth (third);                       // a copy of third

This is useful if you know in advance the maximum size of your vector.

edited Aug 25 '15 at 07:30

answered Aug 25 '15 at 07:19

blashser

921
6
13

How is this an answer to my question ? – Othman Benchekroun Aug 25 '15 at 07:20
@OthmanBenchekroun You want to reserve the capacity of your vector in the moment that you instance it. You can do it like this: `std::vector var(default_value,number_of_times);` . I understood that this was your question. I copied the different constructors of the stl documentation to give a wide idea of the differents possibilities. – blashser Aug 25 '15 at 07:25
@blashser that constructor does not reserve capacity only, it creates a vector with 4 ints ... Anyway, that is not even my question, I am looking for an implementation that can predict the vector's capacity – Othman Benchekroun Aug 25 '15 at 07:28
@OthmanBenchekroun Ok. Sorry, I understood wrongly. Until I know, it is not specified in any standard. There is at least a previous similar question `http://stackoverflow.com/questions/12271017/initial-capacity-of-vector-in-c` – blashser Aug 25 '15 at 07:36
@blashser There is one standard, and I know it is implementation specific, that is why I am asking if anyone knows an implementation that does this – Othman Benchekroun Aug 25 '15 at 07:53
@OthmanBenchekroun what I don't get is how you come up with the idea, that something like *the compiler predicting* how much memory a vector is going to use can be implemented in the vector itself...You mean something like "compiler instructions/hints" in the implementation of the vector? Anyway, I must be misunderstanding your question, since this doesn't make any sense to me... – dingalapadum Aug 25 '15 at 08:43
@dingalapadum you are understanding it well, I am not talking about the vector's implementation, but the way the compiler implements the vector's capacity, since it is implementation specified. – Othman Benchekroun Aug 25 '15 at 09:04

std::vector capacity smart implementation

2 Answers2