
How do we do the analysis of insertion at the back (push_back) in a std::vector? Its amortized time is O(1) per insertion. In particular, in a Channel9 video by Stephan T. Lavavej, and in this one (17:42 onwards), he says that for optimal performance Microsoft's implementation of this method increases the capacity of the vector by a factor of around 1.5.

How is this constant determined?

Bo Persson
jemmanuel
  • Are you sure you mean *insertion*? I think only insertion *at the end*, or `push_back`, is amortized O(1); arbitrary insertion is linear in the number of elements that need to be moved. – Kerrek SB Jul 01 '11 at 16:33
  • Oh, I had a doubt regarding that, thanks for mentioning it... will edit it – jemmanuel Jul 01 '11 at 16:40
  • Why on Earth are people voting to close as "off-topic" and "non-constructive"? Voting to close as "duplicate" could be understandable, but not the reasons given. Potential voter: when you DO NOT UNDERSTAND a question, please do refrain from voting. – Cheers and hth. - Alf Jul 01 '11 at 17:26

3 Answers


Assuming you mean push_back and not insertion, I believe the important part is multiplying the capacity by some constant factor (as opposed to grabbing N more elements each time); as long as you do this you'll get amortized constant time. Changing the factor changes the average-case and worst-case performance.

Concretely: if your constant factor is too large, you'll have good average-case performance but bad worst-case performance, especially as the arrays get big. For instance, imagine doubling (2x) a vector of size 10000 just because the 10001st element is pushed. EDIT: As Michael Burr indirectly pointed out, the real cost here is probably that you'll grow your memory much larger than you need it to be. I would add that there are cache issues that affect speed if your factor is too large. Suffice it to say that there are real costs (memory and computation) if you grow much larger than you need to.

However, if your constant factor is too small, say 1.1x, then you're going to have good worst-case performance but bad average performance, because you're going to incur the cost of reallocating too many times.

Also, see Jon Skeet's answer to a similar question previously. (Thanks @Bo Persson)

A little more about the analysis: say you have n items to push back and a multiplication factor of M. Then the number of reallocations will be roughly log base M of n (log_M(n)), and the i-th reallocation will cost proportional to M^i (M to the i-th power). The total copying cost over all the push_backs is then M^1 + M^2 + ... + M^(log_M(n)). This is a geometric series, and in the limit it sums to roughly (nM)/(M-1); dividing by the n push_backs gives roughly a constant, M/(M-1), per operation.

For large values of M you will overshoot a lot and allocate much more memory than you need, reasonably often (as mentioned above). For values of M close to 1, the constant M/(M-1) becomes large. This factor directly affects the average time.
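If you want to sanity-check that constant, here is a small standalone C++ sketch (my own illustration, not Microsoft's implementation) that simulates a buffer growing by a factor M and counts how many element copies the reallocations cost. Measured right after a reallocation (the worst-case moment), copies per push should come out near M/(M-1):

#include <cmath>
#include <cstddef>
#include <cstdio>

int main() {
    const double factors[] = {1.1, 1.5, 2.0, 3.0};
    const std::size_t target = 1000000;        // stop after growing past this many elements

    for (double M : factors) {
        std::size_t capacity = 1, size = 0, copies = 0;
        // Keep pushing until just after a reallocation beyond `target` elements,
        // which is the worst-case point for the copies-per-push ratio.
        while (true) {
            if (size == capacity) {            // buffer full: reallocate
                copies += size;                // every existing element gets moved
                capacity = static_cast<std::size_t>(std::ceil(capacity * M));
                if (size >= target) break;
            }
            ++size;                            // the push itself
        }
        std::printf("M = %.1f: copies per push ~ %.2f, M/(M-1) = %.2f\n",
                    M, double(copies) / size, M / (M - 1));
    }
    return 0;
}

For M = 2 this should report roughly 2 copies per push, and for M = 1.5 roughly 3, matching the formula above.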

Chris A.
  • Why is doubling an allocation with a 10000 element vector any worse than allocating a new block that'll hold some other number of elements (greater than 10000)? – Michael Burr Jul 01 '11 at 16:41
  • So you're saying that the real problem with having the factor too large is that you'll just hog too much memory? Or am I missing the point? You're right, the real cost is probably the copying that occurs after the reallocation. – Chris A. Jul 01 '11 at 16:44
  • Because it wastes 2x10000-1 elements of space as opposed to (some smaller factor)x10000-1 elements. – ltjax Jul 01 '11 at 16:45
  • It's a bit of a tradeoff between memory consumption and constant complexity factors. – ltjax Jul 01 '11 at 16:46
  • Thanks... I was looking into the running-time analysis... hadn't thought much about memory... @Chris I think your intuition about the worst and average case is probably the right way to go... – jemmanuel Jul 01 '11 at 16:56
  • I don't see how this answers the question at all. But the OP has indicated that it does, so I'm not downvoting. Just sayin'. :-) – Cheers and hth. - Alf Jul 01 '11 at 17:09
  • @Chris A: actually, an optimization (memory-wise) consists in having a different constant for several intervals of size. You grow fast to begin with, and reduce the factor when getting large. As long as this sequence of factors always remains `> 1` (even its limit), then the complexity is still guaranteed. I love **Adaptive Algorithms**. – Matthieu M. Jul 01 '11 at 17:41
  • Thanks for all the comments. This response was made to give some high level intuition as for how the factor is chosen, not as a detailed analysis. – Chris A. Jul 01 '11 at 17:45
  • The problem with doubling the allocated size is that you will never ever be able to reuse the previously freed memory. If you keep the growth factor <= the golden ratio, you will eventually get enough space to reuse what you have freed. – Bo Persson Jul 01 '11 at 17:47
  • See also [what-is-the-ideal-growth-rate-for-a-dynamically-allocated-array](http://stackoverflow.com/questions/1100311/what-is-the-ideal-growth-rate-for-a-dynamically-allocated-array) Naturally, Jon Skeet answered this already 2 years ago. :-) – Bo Persson Jul 01 '11 at 17:53

You can do the math to try to figure out how this kind of thing works.

A popular method for this kind of amortized analysis is the banker's method: you mark up all your operations with an extra cost, "saving" it up to pay for an expensive operation later on.


Let's make some dumb assumptions to simplify the math:

  • Writing into an array costs 1. (Same for inserting and moving between arrays)
  • Allocating a larger array is free.

And our algorithm looks like:

function insert(x):
    if n_elements >= current capacity:
        move all elements to a new array that
        is K times larger than the current one
    add x to the array
    n_elements += 1
Obviously, the "worst case" happens when we have to move the elements to the new array. Let's try to amortize this by adding a constant markup of d to the insertion cost, bringing it to a total of (1 + d) per operation.

Just after an array has been resized, we have (1/K) of it filled up and no money saved. By the time we fill the array up (call its size N), we can be sure to have at least d * (1 - 1/K) * N saved up. Since this money must be able to pay for all N elements being moved, we can figure out a relation between K and d:

d*(1 - 1/K)*N = N
d*(K-1)/K = 1
d = K/(K-1)

A helpful table:

K    d     1+d (total insertion cost)
1.0  inf   inf
1.1  11.0  12.0
1.5  3.0   4.0
2.0  2.0   3.0
3.0  1.5   2.5
4.0  1.3   2.3
inf  1.0   2.0

So from this you can get a rough mathematician's idea of how the time/memory tradeoff works for this problem. There are some caveats, of course: I didn't go over shrinking the array when it holds fewer elements; this only covers the worst case where no elements are ever removed; and the time cost of allocating extra memory wasn't accounted for.

They most likely ran a bunch of experimental tests to figure this out in the end, making most of what I wrote irrelevant, though.
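Still, the accounting itself is easy to replay in code. Here is a small C++ sketch (my own illustration, under the same dumb assumptions as above) that charges d = K/(K-1) credits on every insert, withdraws one credit per moved element at each reallocation, and checks that the balance never goes negative:

#include <cmath>
#include <cstddef>
#include <cstdio>

int main() {
    const double      K = 1.5;                  // growth factor under test
    const double      d = K / (K - 1.0);        // surcharge derived above
    const std::size_t pushes = 1000000;

    std::size_t capacity = 1, size = 0;
    double balance = 0.0;                       // credits saved up so far
    double min_balance = 0.0;

    for (std::size_t i = 0; i < pushes; ++i) {
        if (size == capacity) {                 // full: pay 1 credit per moved element
            balance -= static_cast<double>(size);
            if (balance < min_balance) min_balance = balance;
            capacity = static_cast<std::size_t>(std::ceil(capacity * K));
        }
        balance += d;                           // surcharge collected on this push
        ++size;
    }
    std::printf("K = %.2f, d = %.2f, lowest balance seen = %.2f\n",
                K, d, min_balance);
    return 0;
}

With K = 1.5 and d = 3 the lowest balance should stay at or above zero; dropping d below K/(K-1) should eventually drive it negative, which is exactly what the derivation predicts.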

hugomg

Uhm, the analysis is really simple when you're familiar with number systems, such as our usual decimal one.

For simplicity, then, assume that each time the current capacity is reached, a new 10x as large buffer is allocated.

If the original buffer has size 1, then the first reallocation copies 1 element, the second (where now the buffer has size 10) copies 10 elements, and so on. So with five reallocations, say, you have 1+10+100+1000+10000 = 11111 element copies performed. Multiply that by 9, and you get 99999; now add 1 and you have 100000 = 10^5. Or in other words, doing that backwards, the number of element copies performed to support those 5 reallocations has been (10^5-1)/9.

And the buffer size after 5 reallocations, 5 multiplications by 10, is 10^5. Which is roughly a factor of 9 larger than the number of element copy operations. Which means that the time spent on copying is roughly linear in the resulting buffer size.

With base 2 instead of 10 you get (2^5-1)/1 = 2^5-1.

And so on for other bases (or factors to increase buffer size by).
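If you prefer to see those numbers computed, here is a tiny C++ sketch (my illustration) that reproduces the arithmetic above for bases 2 and 10:

#include <cstdio>

int main() {
    const int bases[] = {2, 10};
    const int k = 5;                           // number of reallocations

    for (int B : bases) {
        long long copies = 0, buffer = 1;      // original buffer has size 1
        for (int i = 0; i < k; ++i) {
            copies += buffer;                  // a reallocation copies the whole buffer
            buffer *= B;                       // and grows it by a factor of B
        }
        std::printf("B = %2d: copies = %lld, (B^k - 1)/(B - 1) = %lld, final buffer size = %lld\n",
                    B, copies, (buffer - 1) / (B - 1), buffer);
    }
    return 0;
}

For B = 10 this prints 11111 copies against a final size of 100000, and for B = 2 it prints 31 against 32, matching the figures above.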

Cheers & hth.

Cheers and hth. - Alf