We can get a bit more insight into what's going on by disassembling the each of the code examples with the dis
module.
Standard Append:
dis.dis("""
list_of_interest = []
for i in range(1000):
list_of_interest.append(i)
""")
gives us
2 0 BUILD_LIST 0
2 STORE_NAME 0 (list_of_interest)
3 4 SETUP_LOOP 26 (to 32)
6 LOAD_NAME 1 (range)
8 LOAD_CONST 0 (1000)
10 CALL_FUNCTION 1
12 GET_ITER
>> 14 FOR_ITER 14 (to 30)
16 STORE_NAME 2 (i)
4 18 LOAD_NAME 0 (list_of_interest)
20 LOAD_METHOD 3 (append)
22 LOAD_NAME 2 (i)
24 CALL_METHOD 1
26 POP_TOP
28 JUMP_ABSOLUTE 14
>> 30 POP_BLOCK
>> 32 LOAD_CONST 1 (None)
34 RETURN_VALUE
Looking closely we see were are calling the append method in a loop (just like we wrote). Digging into the implementation (here) we see that this eventually calls app1
which calls list_resize
which over allocates to try to reduce reallocations.
Other Two Examples:
As noted in other answers range
actually does implement a __len__
making
list_of_interest = list(range(1000))
and
b = tuple(range(1000))
list_of_interest = list(b)
work pretty similarly.
Disassembling the first gives:
1 0 LOAD_NAME 0 (list)
2 LOAD_NAME 1 (range)
4 LOAD_CONST 0 (1000)
6 CALL_FUNCTION 1
8 CALL_FUNCTION 1
10 STORE_NAME 2 (list_of_interest)
12 LOAD_CONST 1 (None)
14 RETURN_VALUE
and the second:
2 0 LOAD_NAME 0 (tuple)
2 LOAD_NAME 1 (range)
4 LOAD_CONST 0 (1000)
6 CALL_FUNCTION 1
8 CALL_FUNCTION 1
10 STORE_NAME 2 (b)
3 12 LOAD_NAME 3 (list)
14 LOAD_NAME 2 (b)
16 CALL_FUNCTION 1
18 STORE_NAME 4 (list_of_interest)
20 LOAD_CONST 1 (None)
22 RETURN_VALUE
In both cases we call the list
constructor defined here. We see that if the provided iterable to list
has a length, then we call list_preallocate_exact
which acts how you expect. Otherwise we call list_extend
on an empty list. In the case that our iterator is not a list or tuple (which in this case we know it isn't), we'll call PyObject_LengthHint
to guess the length of the iterator and preallocate space that way. Since the length hint might be too large or too small, we may need to allocate more memory as we loop through the iterator in the standard way and we may deallocate space after we're done looping in the case that the hint was too large.
Other situations:
We can play around with other examples, but we see that we'll hit the list
constructor above which has several special cases. Often we'll next go to list_extend
which also has several special cases. In the case where our iterator doesn't implement length or length hint, we default to a length hint of 8. Why 8? You'd have to ask the maintainers that one.
Conclusion:
If Python can know the needed length of the resulting list from the start, you'll only have one allocation. If it has to guess, you may have more than one or a deallocation. If it can't guess in the case of appending in a loop or being provided an iterator with no length or length hint, you'll have multiple allocations.