4

I am comparing different ways of appending items to a list in python. I tested on my computer and results will of course be different on other computers.

Option 1: 5.80 seconds only 50,000 items !

a = []
for i in range(50000):
    a = a + [i]

Option 2: 2.27 seconds 10,000,000 items

a = []
for i in range(10000000):
    a += [i]

Option 3: 1.53 seconds 10,000,000 items

a = []
for i in range(10000000):
    a.append(i)

It is no surprise that Option 1 is slow because it creates a new copy of the list every time.

Option 2 is much faster using the augmented assignment operator. It modifies the original list in place.

But Option 3 is still significantly faster. I expected using the append() method and the augmented assignment operator to have the same results.

Why is the append() method still so much faster then the += operator?

PS: My python version:

Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32

According to the answer of taskinoor the additional overhead is caused by creating a new list object [i] in every loop iteration. This seems reasonable to me. I created new test code trying to avoid the creation of a new object in every iteration. And yes, it is now faster, but still not so fast as using append() because now i have again the overhead of additional assignment. But i think it proves the point of taskinoor.

Option 4: 1.91 seconds 10,000,000 items (Improved version of Option 2)

a = []
b = [0]
for i in range(10000000):
    b[0] = i
    a += b
David
  • 1,204
  • 12
  • 16
  • I'd guess they differ in how much extra space they allocate when growing the underlying list. – Filip Haglund Oct 22 '16 at 11:39
  • 1
    You are creating a new list and extending vs a simple append so it is not too surprising that append is much faster. – Padraic Cunningham Oct 22 '16 at 12:06
  • For your edit, you are doing a setitem and then extending so again not exactly surprising. append does one job and does it well, there is never a realistic use case where you would use any of the other methods to append a single element to a list. – Padraic Cunningham Oct 22 '16 at 12:12

1 Answers1

3

The second method still needs to create the temporary list [i]. Look at the disassembled codes:

>>> import dis
>>> def func_2(a, i):
...     a += [i]
... 
>>> def func_3(a, i):
...     a.append(i)
... 
>>> dis.dis(func_2)
  2           0 LOAD_FAST                0 (a)
              3 LOAD_FAST                1 (i)
              6 BUILD_LIST               1
              9 INPLACE_ADD         
             10 STORE_FAST               0 (a)
             13 LOAD_CONST               0 (None)
             16 RETURN_VALUE        
>>> dis.dis(func_3)
  2           0 LOAD_FAST                0 (a)
              3 LOAD_ATTR                0 (append)
              6 LOAD_FAST                1 (i)
              9 CALL_FUNCTION            1
             12 POP_TOP             
             13 LOAD_CONST               0 (None)
             16 RETURN_VALUE        
>>> 

The second method needs 6 BUILD_LIST but the third one does not. There could be other reasons but this should be the main reason for method 3 being faster.

taskinoor
  • 45,586
  • 12
  • 116
  • 142
  • That makes sense to me. It is the [i] after the += operator which causes the overhead of creating a new list object. – David Oct 22 '16 at 12:02