The dis
module is useful for comparing the first two methods.
def list_mult():
return [0]*1000000
dis.dis(list_mult)
# 2 0 LOAD_CONST 1 (0)
# 3 BUILD_LIST 1
# 6 LOAD_CONST 2 (1000000)
# 9 BINARY_MULTIPLY
# 10 RETURN_VALUE
Here, the BINARY_MULTIPLY
instruction is used. On the other hand...
def list_range():
return [0 for _ in range(1000000)]
dis.dis(list_range)
# 2 0 BUILD_LIST 0
# 3 LOAD_GLOBAL 0 (range)
# 6 LOAD_CONST 1 (1000000)
# 9 CALL_FUNCTION 1
# 12 GET_ITER
# >> 13 FOR_ITER 12 (to 28)
# 16 STORE_FAST 0 (_)
# 19 LOAD_CONST 2 (0)
# 22 LIST_APPEND 2
# 25 JUMP_ABSOLUTE 13
# >> 28 RETURN_VALUE
This function explicitly constructs a loop, and then loads 0 and appends it to the working list in each iteration. This is going to be a lot slower.
It should be noted that these two construction methods are not equivalent, particularly when the value inside the list is mutable. For example, [object()] * 10
will give you a list of 10 of the same object, while [object() for _ in range(10)]
will give you a list of 10 distinct objects.
Regarding the numpy
example, this operation is kind of the worst-case for numpy
. There is a lot of overhead in constructing and converting numpy
arrays so that the vectorized operations can be fast (as noted in the comments).