1

When considering options for creating lists there are list comprehensions, list() constructor, and the [] literal syntax.

I found an article where it says that list comprehensions are transformed into functions that call list.append() under the hood.

transformation of LC into function equivalent

So this has given me the impression that append() is called on the list upon each iteration through the list, just as say in a similar implementation you could write this in using a for loop if you wanted to instead.

However, I've now found a prior answer here on SO which says:

The list comprehension version takes advantage of the special bytecode LIST_APPEND which calls PyList_Append directly for us. Hence it avoids an attribute lookup to list.append and a function call at the Python level.

So, when you write a list comprehension such as the above squares example, is it calling append() on each iteration, or not?

What does an attribute lookup to list.append mean, versus the actual c level call to list.append()?

cheznead
  • 2,589
  • 7
  • 29
  • 50
  • It isn't clear how this question is anything other than a duplicate of the question that you link to. You seem to be asking if the accepted answer in that question is valid. – John Coleman Dec 31 '21 at 17:48
  • No, not at all questioning the accepted answer's validity! I've reworded my question since pinning down my confusion. – cheznead Dec 31 '21 at 17:50
  • 2
    When something like "squares.append" in code appears, it always requires to check if "squares" is a list and retrieve the function referenced by the attribute "append" and doing some other things. In a list comprehension the runtime knows it is working on a list and it knows that the "append" method of a list should be used without looking it up. – Michael Butscher Dec 31 '21 at 18:07
  • 2
    When you do something like `some_object.whatever` that is an *attribute lookup*, which require searching for that attribute at runtime. Note, you can accomplish the same thing by using `whatever = some_object.whatever` – juanpa.arrivillaga Dec 31 '21 at 19:37

2 Answers2

2

I think the append append() method won't be called inside python.

It's been called inside the compiler implicitly:

In [17]: class list2(list):
    ...:     def append(self, *args, **kwargs):
    ...:         print("I have been called")
    ...:         super().append(*args, **kwargs)
    ...: 

In [18]: a = (i for i in range(10))

In [19]: l = list2(a)

In [20]: l
Out[20]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [21]: l.append(10)
I have been called

In [22]: l
Out[22]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

I checked the disassembly of its code:

In [28]: def f():
    ...:     return [i for i in range(10)]
    ...: 

In [29]: dis.dis(f)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f06bf957d60, file "<ipython-input-28-461835cd83dc>", line 2>)
              2 LOAD_CONST               2 ('f.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               3 (10)
             10 CALL_FUNCTION            1
             12 GET_ITER
             14 CALL_FUNCTION            1
             16 RETURN_VALUE

Disassembly of <code object <listcomp> at 0x7f06bf957d60, file "<ipython-input-28-461835cd83dc>", line 2>:
  2           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                 4 (to 14)
              6 STORE_FAST               1 (i)
              8 LOAD_FAST                1 (i)
             10 LIST_APPEND              2
             12 JUMP_ABSOLUTE            2 (to 4)
        >>   14 RETURN_VALUE

If you look at the second part, there is disassembly of the object being returned. There is a FOR_ITER which is closed with a JUMP_ABSOLUTE. The numbers like (to 4) are refering where to jump. It's iterating all over, and appending at the same time.

Mohammad Jafari
  • 1,742
  • 13
  • 17
2

The dis module can be used to inspect the internal python bytecode directly. Here are the two functions (modified slightly to be more self consistent), and the resulting bytecodes that are sent to the python interpreter after they are parsed:

def comprehension():
    squares = [num for num in range(1,11) if num % 2 == 0]
    return squares
    
def loop():
    squares = []
    for num in range(1,11):
        if num % 2 == 0:
            squares.append(num)
    return squares

Then calling dis.dis(comprehension):

>>> dis(comprehension)
 10           0 LOAD_CONST               1 (<code object <listcomp> at 0x0000021EB480BBE0, file "C:\Users\aaron\Documents\scripts\python\untitled2.py", line 10>)
              2 LOAD_CONST               2 ('comprehension.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               3 (1)
             10 LOAD_CONST               4 (11)
             12 CALL_FUNCTION            2
             14 GET_ITER
             16 CALL_FUNCTION            1
             18 STORE_FAST               0 (squares)

 11          20 LOAD_FAST                0 (squares)
             22 RETURN_VALUE

Disassembly of <code object <listcomp> at 0x0000021EB480BBE0, file "C:\Users\aaron\Documents\scripts\python\untitled2.py", line 10>:
 10           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                20 (to 26)
              6 STORE_FAST               1 (num)
              8 LOAD_FAST                1 (num)
             10 LOAD_CONST               0 (2)
             12 BINARY_MODULO
             14 LOAD_CONST               1 (0)
             16 COMPARE_OP               2 (==)
             18 POP_JUMP_IF_FALSE        4
             20 LOAD_FAST                1 (num)
             22 LIST_APPEND              2
             24 JUMP_ABSOLUTE            4
        >>   26 RETURN_VALUE

Interpreting this output takes a little practice, but the general flow is that the "list comprehension function" is created first (MAKE_FUNCTION jumps down to 2nd "disassembly" block), which takes care of creating a new list BUILD_LIST, iterating over an iterator FOR_ITER, testing the conditional POP_JUMP_IF_FALSE, and (when the conditional is true) directly appending to the list LIST_APPEND. After creating this "comprehension function" we return to the main function to create the iterator which is then passed to the function we created. Finally the result gets stored to a variable and that variable is returned.


The loop function is actually fewer bytecode instructions, but don't necessarily interpret that as being faster or better... some bytecode instructions are more complicated / slower than others. Only actual profiling can actually tell you the difference in speed.

>>> dis(loop)
 14           0 BUILD_LIST               0
              2 STORE_FAST               0 (squares)

 15           4 LOAD_GLOBAL              0 (range)
              6 LOAD_CONST               1 (1)
              8 LOAD_CONST               2 (11)
             10 CALL_FUNCTION            2
             12 GET_ITER
        >>   14 FOR_ITER                26 (to 42)
             16 STORE_FAST               1 (num)

 16          18 LOAD_FAST                1 (num)
             20 LOAD_CONST               3 (2)
             22 BINARY_MODULO
             24 LOAD_CONST               4 (0)
             26 COMPARE_OP               2 (==)
             28 POP_JUMP_IF_FALSE       14

 17          30 LOAD_FAST                0 (squares)
             32 LOAD_METHOD              1 (append)
             34 LOAD_FAST                1 (num)
             36 CALL_METHOD              1
             38 POP_TOP
             40 JUMP_ABSOLUTE           14

 18     >>   42 LOAD_FAST                0 (squares)
             44 RETURN_VALUE

Following the path of this function is a little bit more straightforward, but the key difference is really that python has to look up the "append" method from "squares" before it can call it, because it doesn't inherently know that "squares" is a list, and therefore can't just call LIST_APPEND

Aaron
  • 10,133
  • 1
  • 24
  • 40