basic question on std::vector in C++

Question

C++ textbooks, and threads, like these say that vector elements are physically contiguous in memory.

But when we do operations like v.push_back(3.14) I would assume the STL is using the new operator to get more memory to store the new element 3.14 just introduced into the vector.

Now say the vector of size 4 is stored in computer memory cells labelled 0x7, 0x8, 0x9, 0xA. If cell 0xB contains some other unrelated data, how will 3.14 go into this cell? Does that mean cell 0xB will be copied somewhere else, erased to make room for 3.14?

http://en.wikipedia.org/wiki/Vector_(C%2B%2B)#Capacity_and_reallocation — Laserallan, Sep 08 '11 at 22:47

score 10 · Answer 1 · edited Sep 09 '11 at 18:41

The short answer is that the entire array holding the vector's data is moved around to a location where it has space to grow. The vector class reserves a larger array than is technically required to hold the number of elements in the vector. For example:

vector< int > vec;
for( int i = 0; i < 100; i++ )
    vec.push_back( i );

cout << vec.size(); // prints "100"
cout << vec.capacity(); // prints some value greater than or equal to 100

The capacity() method returns the size of the array that the vector has reserved, while the size() method returns the number of elements in the array which are actually in use. capacity() will always return a number larger than or equal to size(). You can change the size of the backing array by using the reserve() method:

vec.reserve( 400 );
cout << vec.capacity(); // returns "400"

Note that size(), capacity(), reserve(), and all related methods refer to individual instances of the type that the vector is holding. For example, if vec's type parameter T is a struct that takes 10 bytes of memory, then vec.capacity() returning 400 means that the vector actually has 4000 bytes of memory reserved (400 x 10 = 4000).

So what happens if more elements are added to the vector than it has capacity for? In that case, the vector allocates a new backing array (generally twice the size of the old array), copies the old array to the new array, and then frees the old array. In pseudo-code:

if(capacity() < size() + items_added)
{
    size_t sz = capacity();
    while(sz < size() + items_added) 
       sz*=2;
    T* new_data = new T[sz]; 
    for( int i = 0; i < size(); i++ )
        new_data[ i ] = old_data[ i ];
    delete[] old_data;
    old_data = new_data;
}

So the entire data store is moved to a new location in memory that has enough space to store the current data plus a number of new elements. Some vectors may also dynamically decrease the size of their backing array if they have far more space allocated than is actually required.

In pseudo-code did you mean if( capacity() < size() + items_added) { new_data = new T[ capacity() * 2 ]; .. } ? — Ajeet Ganga, Sep 09 '11 at 02:50
@Ajeet - No, because the while loop works even for arbitrarily large numbers of items_added. For example, consider the case where capacity() = 10 and items_added = 12000. A simple if would lead to an array overrun. — Chris Vig, Sep 09 '11 at 03:29
@Chris Vig : Shouldn't it be then size_t sz = capacity(); while(sz — Ajeet Ganga, Sep 09 '11 at 03:37
Ajeet - yes, fair point. What I posted is far from an optimized algorithm, I was just trying to give an example of what could be going on behind the scenes. I agree your implementation is better. — Chris Vig, Sep 09 '11 at 03:42

score 8 · Accepted Answer · answered Sep 08 '11 at 22:46

8

std::vector first allocates a bigger buffer, then copies existing elements from the "old" buffer to the "new" buffer, then it deletes the "old buffer", and finally adds the new element into the "new" buffer.

Generally, std::vector implementation grow their internal buffer by doubling the capacity each time it's necessary to allocate a bigger buffer.

As Chris mentioned, every time the buffer grows, all existing iterators are invalidated.

answered Sep 08 '11 at 22:46

Gregory Pakosz

69,011
20
139
164

5

In addition, all existing iterators are invalidated. :-) – C. K. Young Sep 08 '11 at 22:47
3

But the allocation only happens when the capacity is exceeded. You can use `reserve` to define a specific capacity. If you never exceed that capacity, then no reallocation or iterator invalidation will occur. – Nicol Bolas Sep 09 '11 at 00:20

score 5 · Answer 3 · answered Sep 08 '11 at 22:47

When std::vector allocates memory for the values, it allocates more than it needs; you can find out how much by calling capacity. When that capacity is used up, it allocates a bigger chunk, again larger than it needs, and copies everything from the old memory to the new; then it releases the old memory.

yhager · Answer 4 · 2011-09-12T23:48:52.777

1

If there is not enough space to add the new element, more space will be allocated (as you correctly indicated), and the old data will be copied to the new location. So cell 0xB will still contain the old value (as it might have pointers to it in other places, it is impossible to move it without causing havoc), but the whole vector in question will be moved to the new location.

edited Sep 12 '11 at 23:48

answered Sep 08 '11 at 22:48

yhager

1,632
15
16

2

While this is true, you have completely failed to understand the OP's question. – Karl Knechtel Sep 08 '11 at 23:25
What I meant is that it will be realloc'ed to somewhere else, where these is enoungh memory for the vector. What did I miss? – yhager Sep 08 '11 at 23:30
1

You missed the part where it's reallocated somewhere else, and the part where the data is copied to the new allocation and pointers re-set accordingly. I.e., the part that explains why it doesn't matter what's in "slot 0xB" in the first place. I.e., the part that actually answers the question. – Karl Knechtel Sep 08 '11 at 23:55
Oh, I assumed it will be clear from "more space will be allocated" - It doesn't make any sense otherwise. Maybe I should have been more elaborate. – yhager Sep 09 '11 at 00:11
1

OP already mentioned an assumption that 'new' is being used, so if it were clear, the OP wouldn't have had a question in the first place. – Karl Knechtel Sep 09 '11 at 01:14
@yhager: You could edit that in now, to make your answer good. – Lightness Races in Orbit Sep 09 '11 at 19:00
@Karl, I edited the answer, hopefully it does a better job explaining what happens. Thanks for your constructive comments. – yhager Sep 12 '11 at 23:50

Mateen Ulhaq · Answer 5 · 2011-09-08T22:55:22.427

In C++, which comes from C, memory is not 'managed' the way you describe - Cell 0x0B's contents will not be moved around. If you did that, any existing pointers would be made invalid! (The only way this could be possible is if the language had no pointers and used only references for similar functionality.)

std::vector allocates a new, larger buffer and stores the value of 3.14 to the "end" of the buffer.

Usually, though, for optimized this->push_back()s, a std::vector allocates memory about twice its this->size(). This ensures that a reasonable amount of memory is exchanged for performance. So, it is not guaranteed 3.14 will cause a this->resize(), and may simply be put into this->buffer[this->size()++] if and only if this->size() < this->capacity().

score 0 · Answer 6 · answered Sep 08 '11 at 22:50

0

A vector is an array of memory. Typical implementation is that it grabs more memory than is required. It that footprint needs to expand over any other memory - the whole lot is copied. The old stuff is freed. The vector memory is on the stack - and that should be noted. It is also a good idea to say the maximum size is required.

answered Sep 08 '11 at 22:50

Ed Heal

59,252
17
87
127

The vector memory is ... what? – Lightness Races in Orbit Sep 09 '11 at 18:59

basic question on std::vector in C++

6 Answers6