
As seen in the documentation for TimeComplexity, Python's list type is implemented using an array.

So if an array is being used and we do enough appends, eventually we will have to reallocate space and copy all the information into the new space. After all that, how can append be O(1) in the worst case?
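
For anyone who wants to see the reallocation happening, here is a quick sketch (CPython-specific; the exact growth pattern is an implementation detail and can change between versions) that watches sys.getsizeof jump as elements are appended:

    import sys

    lst = []
    last_size = sys.getsizeof(lst)
    for i in range(20):
        lst.append(i)
        size = sys.getsizeof(lst)
        if size != last_size:
            # The reported size only changes when CPython over-allocates a new,
            # larger block and copies the existing elements into it.
            print(f"append #{i + 1}: list object grew from {last_size} to {size} bytes")
            last_size = size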

ohad edelstain
  • http://cs.stackexchange.com/questions/9380/why-is-push-back-in-c-vectors-constant-amortized – rlbond Oct 09 '15 at 18:26

3 Answers


It's amortized O(1), not O(1).

Let's say the list reserved size is 8 elements and it doubles in size when space runs out. You want to push 50 elements.

The first 8 elements push in O(1). The ninth triggers reallocation and 8 copies, followed by an O(1) push. The next 7 push in O(1). The seventeenth triggers reallocation and 16 copies, followed by an O(1) push. The next 15 push in O(1). The thirty-third triggers reallocation and 32 copies, followed by an O(1) push. The next 31 push in O(1). This continues, with the size of the list doubling again at the 65th, 129th, 257th push, and so on.

So all of the pushes themselves have O(1) complexity; on top of that, we did 56 copies spread over 3 reallocations, at O(n) each with n = 8, 16, and 32. Note that the copy costs form a geometric series whose sum is O(n), with n = the final size of the list. That means the whole operation of pushing n objects onto the list is O(n). If we amortize that over the n pushes, it's O(n)/n = O(1) per push.
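
If you want to check the arithmetic, here is a small simulation sketch. It assumes the plain doubling policy with a starting capacity of 8 used above (not CPython's actual, more conservative over-allocation) and just counts the copy work per push:

    def simulate_pushes(n, initial_capacity=8):
        """Count element copies caused by reallocation while pushing n items."""
        capacity, size, copies = initial_capacity, 0, 0
        for _ in range(n):
            if size == capacity:
                copies += size   # reallocation copies every existing element
                capacity *= 2    # double the reserved space
            size += 1            # the push itself is O(1)
        return copies

    for n in (50, 1_000, 1_000_000):
        copies = simulate_pushes(n)
        print(f"{n} pushes -> {copies} copies, {copies / n:.2f} copies per push")

For 50 pushes this reports the 56 copies counted above, and the copies-per-push ratio stays around 1 no matter how large n gets.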

rlbond

If you look at the footnote in the document you linked, you can see that they include a caveat:

These operations rely on the "Amortized" part of "Amortized Worst Case". Individual actions may take surprisingly long, depending on the history of the container.

Using amortized analysis, even if we have to occasionally perform expensive operations, we can still get an upper bound on the 'average' cost per operation when we consider the operations as a sequence, instead of individually.

So, any individual operation could be very expensive - O(n) or O(n^2) or something even bigger - but since we know these operations are rare, we can still guarantee that a sequence of n operations is done in O(n) time overall, i.e. O(1) amortized per operation.
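
To see that contrast numerically, here is a sketch using a hypothetical per_push_costs helper (again a plain doubling model, not CPython's real growth policy) that charges each push the number of elements it touches; the worst single push is O(n) while the average stays a small constant:

    def per_push_costs(n, initial_capacity=8):
        """Cost (elements touched) of each individual push in a doubling array."""
        capacity, size, costs = initial_capacity, 0, []
        for _ in range(n):
            cost = 1                 # writing the new element
            if size == capacity:
                cost += size         # this push also pays to copy everything over
                capacity *= 2
            size += 1
            costs.append(cost)
        return costs

    costs = per_push_costs(1024)
    print("worst single push:", max(costs))               # O(n): the push that triggers the biggest copy
    print("average per push:", sum(costs) / len(costs))   # small constant: amortized O(1)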

Jacob Ritchie
  • I understand what you linked here, but that means that "Amortized Worst Case" is actually "Amortized Average Case", because if, let's say, every n appends you need to do a big allocation+copy, that means you are doing an average of O(1) per append, and your worst case is O(n) – ohad edelstain Oct 09 '15 at 18:33
  • No, there is a difference between amortized average and amortized worst case. Amortized average case would be the average runtime of a sequence divided by the number of operations in the sequence. Amortized worst case is the worst possible runtime of a sequence divided by the number of operations in the sequence. (You see that 'average' is being used in two different ways, which is a little confusing.) – Jacob Ritchie Oct 09 '15 at 18:37
  • The amortized cost is a different thing than the cost of each individual operation, and it can be lower - the whole point is that by 'averaging' the sequence, you get a more useful bound. – Jacob Ritchie Oct 09 '15 at 18:39

This is very easy.

We can calculate this by accumulating the total time of appending n elements into an arraylist and dividing it by n.

First, we need to reallocate log(n) times, and every reallocation doubles the capacity. So we have a geometric series whose ratio is 2 and whose length is log(n).

The sum of a geometric series is a(1-r^k)/(1-r). With a = 1, r = 2, and k = log(n), the total reallocation cost is (1-2^log(n))/(1-2) = n-1, which is O(n). The amortized time per append is therefore n/n = 1, i.e. O(1).
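
As a sanity check of that sum, here is a sketch assuming the capacity starts at 1 and exactly doubles on every reallocation; the total copy work comes out to roughly n, so it is about O(1) per append:

    import math

    def total_copy_work(n):
        """Sum the geometric series 1 + 2 + 4 + ... of reallocation copy costs."""
        k = math.ceil(math.log2(n))           # about log(n) reallocations
        return sum(2 ** i for i in range(k))  # geometric series with ratio 2

    for n in (64, 1024, 1 << 20):
        work = total_copy_work(n)
        print(f"n={n}: total copy work ~ {work}, per append ~ {work / n:.2f}")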

Isaac Chou