
I have written some basic Python snippets that first insert values into a list and then reverse them. I found a huge difference in execution speed between the insert and append methods.

Snippet 1:

L = []
for i in range(10**5):
    L.append(i)
L.reverse()

Time taken to execute this:

real    0m0.070s
user    0m0.064s
sys     0m0.008s

Snippet 2:

l = []
for i in range(10**5):
    l.insert(0, i)

Time taken to execute:

real    0m5.645s
user    0m5.516s
sys     0m0.020s

I expected snippet 2 to perform much better than snippet 1, since I perform the reversal directly by inserting each number at the front. But the timings say otherwise. I fail to understand why the latter method takes more time to execute, even though it looks more elegant. Does anyone have an explanation for this?

– Rahul

7 Answers


Here is the complete answer from Duncan Booth:

A list is implemented by an array of pointers to the objects it contains.

Every time you call 'insert(0, indx)', all of the pointers already in the list have to be moved up one position before the new one can be inserted at the beginning.

When you call 'append(indx)' the pointers only have to be copied if there isn't enough space in the currently allocated block for the new element. If there is space then there is no need to copy the existing elements, just put the new element on the end and update the length field. Whenever a new block does have to be allocated that particular append will be no faster than an insert, but some extra space will be allocated just in case you do wish to extend the list further.

If you expected insert to be faster, perhaps you thought that Python used a linked-list implementation. It doesn't, because in practice (for most applications) an array-based implementation gives better performance.

I actually have nothing else to add.
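
If you want to see the asymmetry yourself, here is a minimal timing sketch using the timeit module (the numbers are illustrative and will vary by machine and Python version):

import timeit

# Time 10**5 appends at the end vs. 10**5 inserts at position 0
append_time = timeit.timeit("for i in range(10**5): L.append(i)", setup="L = []", number=1)
insert_time = timeit.timeit("for i in range(10**5): L.insert(0, i)", setup="L = []", number=1)
print("append: %.3fs, insert(0): %.3fs" % (append_time, insert_time))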

– Andrei

Note that your results will depend on the precise Python implementation. CPython (and PyPy) automatically resizes your list and over-allocates space for future appends, which speeds up append even further.

Internally, a list is just a chunk of memory of a fixed size (on the heap). Sometimes you're lucky and can simply grow the chunk, but in many cases another object will already be there. For example, assume you allocated a chunk of size 4 for a list [a,b,c,d], and some other piece of code allocated a chunk of size 6 for a dictionary:

Memory  0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
       |a b c d| | dictionary |

Assume your list has 4 elements, and another one is added. Now, you can simply resize the list to size 5:

Memory  0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
       |a b c d e| dictionary |

However, what do you do if you need another element now?

Well, the only thing you can do is acquire a new space and copy the contents of the list.

Memory 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
                | dictionary |a  b  c  d  e  f |

Note that if you acquire space in bulk (the aforementioned overprovisioning), you'll only need to resize (and potentially copy) the list every now and then.

In contrast, when you insert at position 0, you always need to copy your list. Let's insert x:

Memory  0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
orig   |a b c d| |dictionary|
after  |x a b c d|dictionary|

Although there was enough space to append x at the end, all the other values still had to be moved (shifted in place rather than copied to new memory, which may be cheaper, but it still touches every element of the list).
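
You can observe the over-allocation indirectly with sys.getsizeof; this is just a sketch, and the exact sizes are a CPython implementation detail that changes between versions:

import sys

lst = []
last_size = sys.getsizeof(lst)
for i in range(20):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last_size:
        # The block grew: CPython reserved room for several future appends,
        # so most appends do not trigger a resize at all.
        print("len=%2d  size grew %d -> %d bytes" % (len(lst), last_size, size))
        last_size = size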

– phihag

If you need a data structure that is as efficient at inserting at the start as it is at appending at the end, you should consider a deque.
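
A minimal sketch with collections.deque, assuming you want the same front-insertion pattern as in the question:

from collections import deque

d = deque()
for i in range(10**5):
    d.appendleft(i)  # O(1) insert at the front, unlike list.insert(0, i)

# Convert back to a plain list if the rest of the code expects one
result = list(d)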

– Daniel Roseman

I learned a trick for inserting x at the beginning of a list from the "Python Pocket Reference":

l[:0] = [x]

It should be very similar to l.insert(0, x), but when I compare the three options (append(x), insert(0, x) and l[:0] = [x]), the last one performs a bit faster than the second.

Here is the test code and the results:

import time
def test():
    n = 10**5


    t0 = time.time()
    l = []
    for i in xrange(n): l.append(i)
    t1 = time.time() - t0
    print 'appending: %.5f' % t1


    t0 = time.time()
    l = []
    for i in xrange(n): l.insert(0, i)
    t2 = time.time() - t0
    print 'insert to 0: %.5f' % t2

    t0 = time.time()
    l = []
    for i in xrange(n): l[:0] = [i]
    t3 = time.time() - t0
    print 'set slice: %.5f' % t3

    return t1, t2, t3


if __name__ == '__main__':
    t = [0] * 3
    ntimes = 10

    for _ in xrange(ntimes):
        ti = test()

        for i in xrange(3):
            t[i] += ti[i]

    t = [i/ntimes for i in t]
    print 'average time:', t

average time

[0.011755657196044923, 4.1943151950836182, 3.3254094839096071]

Why is it about 25% faster than insert(0, x)? I've tried swapping the order of the code blocks that measure t1, t2 and t3, but the result is the same, so it is not about caching the list.

Here, it states that slice assignment takes O(k + n).

– wall-e
  • If you look at the generated bytecode, `l[:0] = [x]` uses primitive operations directly, while `l.insert(0, x)` generates a function call, which is much slower in Python because functions can always be dynamically overridden (see the sketch after these comments). – ReneSac Jan 14 '13 at 19:26
  • On the other hand, the function call is much more readable and explicit about your intent, so you should use that unless you profile your program and find this line of code to be a bottleneck. As you saw, appropriate use of the data structure matters much more for performance than whether or not something is a function call. – ReneSac Jan 14 '13 at 19:35
  • Try this if you need to insert at the top: `arr = [1] + [2, 3]` (concatenation). Thanks for the time calc, btw. I used your code to calculate the average time taken, including the above method. Below are the results. `average time: [0.01092998186747233, 2.2966763178507485, 0.0010641415913899739, 1.2037852605183919]` 0.0010641415913899739 is the time taken for the array concatenation method. – Avid Programmer May 08 '20 at 10:56
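
A minimal sketch of that bytecode comparison on Python 3 (dis compiles the strings for you; the exact opcodes differ between versions):

import dis

# Slice assignment compiles to BUILD_SLICE / STORE_SUBSCR style opcodes directly
dis.dis("l[:0] = [x]")

# insert goes through an attribute lookup plus a method call
dis.dis("l.insert(0, x)")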

The insert method is appropriate when implementing a queue (FIFO): it inserts at the front of the list, e.g. items.insert(0, item).

The append method is appropriate when implementing a stack (LIFO): it adds at the end of the list, e.g. items.append(item).

When you add data with the insert method, keep in mind that all existing elements have to be shifted to new positions.
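
A minimal toy sketch of both patterns (for illustration only, not a performance recommendation):

# FIFO queue on a list: enqueue at the front with insert, dequeue from the end
queue = []
queue.insert(0, "first")
queue.insert(0, "second")
print(queue.pop())    # -> 'first' (first in, first out)

# LIFO stack on a list: push and pop at the end
stack = []
stack.append("first")
stack.append("second")
print(stack.pop())    # -> 'second' (last in, first out)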

– Sunil

TL;DR: In terms of performance, if list.append() is feasible, use it over list.insert().

Additionally, collections.deque can yield better results for appending and popping elements on either end.

If you are using CPython and are okay with making changes to your existing toolchain, consider using PyPy for better performance.


I had some performance issues when adding elements to a large list in Python, so I conducted performance tests with the methods below for appending elements to a list.

CPython implements list as a dynamic array, which comes with amortized O(1) time complexity for append, O(n) for insert, and O(1) for access. Given those costs, if adding an element to the end of the list is acceptable, append is the obvious choice. Keep in mind, however, that the cost of append is O(1) amortized, not plain O(1).

Along with the built-in list data type, collections.deque (double-ended queue) was also included in the performance test. It is a list-like container that supports fast appends and pops on either end.

If you want to understand the Python list implementation further, this post might be useful; @Lesya summarized it here.

Result

insert vs append

Unless insert is a must, use append for adding elements to a list. For a list size of 0.1M, insert took more than a second on my laptop.

The focus of the performance test was appending elements to the (end of the) list.

Adding an element to the end of the list

Both insert and append showed a near-linear trend in processing time across list sizes. However, regardless of list size, append was about 8% faster than inserting at the end of the list.

collections.deque showed over 20% faster processing time for list sizes over 1M. Smaller list sizes showed a negligible difference.

CPython vs PyPy

For a list size of 0.01M, append was 6.24x faster on PyPy (68.7 µs vs 429 µs). deque showed an even bigger speedup of 16.05x (26.1 µs vs 419 µs). However, it may be worth mentioning that PyPy uses a JIT and a different GC mechanism; depending on how the code is written, such speedups may not be achievable.

Conclusion

For repeatedly adding elements to a list, it is recommended to use append. And if collections.deque suits your needs, use it.

If you are using CPython and are okay with making changes to your existing toolchain, consider using PyPy for better performance.

You might also want to consider the following two packages: skiplist, which supports performant append-like functionality, and blist, an implementation of orderedstructs, which provides better performance for modifying large lists.

Output

Env: CPython 3.10.9

list length = 0.01M
lst_insert: 481 µs ± 2.32 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
lst_insert_batch: 522 µs ± 1.75 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
lst_append: 429 µs ± 1.31 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
lst_append_batch: 473 µs ± 801 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
deque_append: 419 µs ± 2.46 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

list length = 0.1M
lst_insert: 4.61 ms ± 21.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
lst_insert_batch: 5.03 ms ± 27.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
lst_append: 4.21 ms ± 29.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
lst_append_batch: 4.68 ms ± 65.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
deque_append: 4.14 ms ± 21.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

list length = 1M
lst_insert: 59.6 ms ± 415 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
lst_insert_batch: 67.4 ms ± 199 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
lst_append: 55.2 ms ± 273 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
lst_append_batch: 60.2 ms ± 305 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
deque_append: 44.5 ms ± 201 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

list length = 10M
lst_insert: 621 ms ± 3.24 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
lst_insert_batch: 813 ms ± 19.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
lst_append: 573 ms ± 3.18 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
lst_append_batch: 721 ms ± 3.34 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
deque_append: 445 ms ± 2.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

list length = 100M
lst_insert: 6.04 s ± 89.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
lst_insert_batch: 8.04 s ± 43.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
lst_append: 5.53 s ± 32.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
lst_append_batch: 7.34 s ± 41.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
deque_append: 4.48 s ± 7.25 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Env: PyPy 3.9-v7.3.11

list length = 0.01M
lst_insert: 78.8 µs ± 802 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
lst_insert_batch: 167 µs ± 85.6 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
lst_append: 68.7 µs ± 1.48 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
lst_append_batch: 164 µs ± 88.4 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
deque_append: 26.1 µs ± 284 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Code used

def lst_insert(n, v=0):
    # Insert each new element at the current end of the list via insert()
    lst = []
    for i in range(n):
        lst.insert(i, v)
    return lst

def lst_insert_batch(n, v=0, factor=10):
    # Build smaller chunks with insert() and extend the main list with them
    lst = []
    tmp_lst = []
    for _ in range(factor):
        for i in range(n // factor):
            tmp_lst.insert(i, v)
        lst += tmp_lst
        tmp_lst = []
    return lst

def lst_append(n, v=0):
    # Append each new element at the end of the list
    lst = []
    for i in range(n):
        lst.append(v)
    return lst

def lst_append_batch(n, v=0, factor=10):
    # Build smaller chunks with append() and extend the main list with them
    lst = []
    tmp_lst = []
    for _ in range(factor):
        for i in range(n // factor):
            tmp_lst.append(v)
        lst += tmp_lst
        tmp_lst = []
    return lst

def deque_append(n, v=0):
    # Append each new element to a collections.deque
    from collections import deque
    dq = deque(maxlen=n)
    for i in range(n):
        dq.append(v)
    return dq

# %timeit is an IPython/Jupyter magic; run this part in an IPython session
for n in [10**4, 10**5, 10**6, 10**7, 10**8]:
    %timeit lst_insert(n)
    %timeit lst_insert_batch(n)
    %timeit lst_append(n)
    %timeit lst_append_batch(n)
    %timeit deque_append(n)
– h2ku

Check https://wiki.python.org/moin/TimeComplexity to see all the list methods and their time complexities.