I was optimizing a piece of code and found that shallow-copying a list was the bottleneck.
Now I am curious: why is copying a list so much slower than copying an array
of 8-byte integers? I would expect no difference: in both cases it should be just memcpy(dst, src, sizeof(int64_t) * len(src)),
since pointers are 8 bytes long. But apparently Python does more work than I expected. Is it somehow related to GC? Or are lists perhaps implemented as linked lists?
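(To be clear, by "shallow" I mean the copy shares its element objects with the original, which is why I'd expect it to be a plain pointer memcpy. A quick sanity check confirms the sharing:)

```python
# A shallow copy duplicates only the top-level list;
# the element objects themselves are shared, not copied.
src = [[1, 2], [3, 4]]
dst = src.copy()
assert dst is not src    # a new list object was created
assert dst[0] is src[0]  # but the elements are the same objects
```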
import array
import numpy as np
import timeit
n = 100*1000
lst = [i for i in range(n)]
arr = array.array('q', lst)
nmp = np.array(arr, dtype=np.int64)
assert arr.itemsize == 8
n_iter = 100000
print('=== copy() ===')
print('List of int:', timeit.timeit(stmt='lst.copy()', setup='from __main__ import lst', number=n_iter))
print('Array of 8-bytes:', timeit.timeit(stmt='arr.__copy__()', setup='from __main__ import arr', number=n_iter))
print('Numpy array of int64:', timeit.timeit(stmt='nmp.copy()', setup='from __main__ import nmp', number=n_iter))
The results:
=== copy() ===
List of int: 27.434935861998383
Array of 8-bytes: 2.6839109230022586
Numpy array of int64: 2.69919407800262
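For what it's worth, the 8-bytes-per-element assumption does seem to hold for the list itself: on a 64-bit CPython build, the list's buffer grows by roughly one pointer (8 bytes) per element (a rough check, ignoring the fixed list header):

```python
import sys

# Building a list from a range pre-allocates an exact-sized buffer,
# so the size difference per element approximates the pointer width.
small = list(range(1000))
large = list(range(2000))
per_elem = (sys.getsizeof(large) - sys.getsizeof(small)) / 1000
print(per_elem)  # ~8.0 on a 64-bit CPython build
```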