25

I just read the answer to this question: Accessing class variables from a list comprehension in the class definition

It helped me understand why the following code results in NameError: name 'x' is not defined:

class A:
    x = 1
    data = [0, 1, 2, 3]
    new_data = [i + x for i in data]
    print(new_data)

The NameError occurs because the list comprehension runs in its own scope, and the class-level name x is not visible there. However, I cannot understand why the following code works without any error.

class A:
    x = 1
    data = [0, 1, 2, 3]
    new_data = [i for i in data]
    print(new_data)

I get the output [0, 1, 2, 3], but I was expecting NameError: name 'data' is not defined. Since the name x is not defined in the list comprehension's scope in the previous example, I assumed the name data would not be defined there either.

Could you please help me to understand why x is not defined in the list comprehension's scope but data is?

Lone Learner

2 Answers

20

data is the source iterable of the list comprehension; it is the one argument that is passed into the nested scope the comprehension creates.

Everything in the list comprehension is run in a separate scope (as a function, basically), except for the iterable used for the left-most for loop. You can see this in the bytecode:

>>> import dis
>>> def foo():
...     return [i for i in data]
... 
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x105390390, file "<stdin>", line 2>)
              3 LOAD_CONST               2 ('foo.<locals>.<listcomp>')
              6 MAKE_FUNCTION            0
              9 LOAD_GLOBAL              0 (data)
             12 GET_ITER
             13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             16 RETURN_VALUE

The <listcomp> code object is called like a function, and iter(data) is passed in as the argument (CALL_FUNCTION is executed with 1 positional argument, the GET_ITER result).

The <listcomp> code object looks for that one argument:

>>> dis.dis(foo.__code__.co_consts[1])
  2           0 BUILD_LIST               0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                12 (to 21)
              9 STORE_FAST               1 (i)
             12 LOAD_FAST                1 (i)
             15 LIST_APPEND              2
             18 JUMP_ABSOLUTE            6
        >>   21 RETURN_VALUE

The LOAD_FAST instruction loads the first and only positional argument passed in; it only has the placeholder name .0 here because there never was a function definition to give it a real name.
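To make the bytecode more concrete, here is a rough hand-written equivalent of what happens for the class in the question. This is only a sketch: `_listcomp` is a hypothetical name standing in for the hidden `<listcomp>` function, and newer CPython versions may implement comprehensions differently internally while keeping the same visible scoping rules.

```python
class A:
    x = 1
    data = [0, 1, 2, 3]

    # Hypothetical stand-in for the hidden <listcomp> function.
    # Its only parameter plays the role of the '.0' argument.
    def _listcomp(iterator):
        result = []
        for i in iterator:
            result.append(i)
        return result

    # iter(data) is evaluated here, in the class body, where the
    # name data is visible; only the resulting iterator is passed in.
    new_data = _listcomp(iter(data))

    # Remove the helper so it does not end up as a class attribute.
    del _listcomp

print(A.new_data)
```

The name `data` is never looked up from inside the helper, which is why the class-scope restriction does not apply to it.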

Any additional names used in the list comprehension (or set or dict comprehension, or generator expression, for that matter) are either locals, closures or globals, not parameters.
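As an illustration of that rule, the following sketch uses the same kind of comprehension in a function, at module level, and in a class body. The names `in_function`, `y`, `function_result`, `module_result` and `class_error` are made up for this example.

```python
def in_function():
    x = 1
    data = [0, 1, 2, 3]
    # x belongs to the enclosing function, so the comprehension can see it.
    return [i + x for i in data]

function_result = in_function()

y = 10
# At module level, y is a global, so the comprehension can see it too.
module_result = [i + y for i in [0, 1, 2, 3]]

class_error = None
try:
    class A:
        x = 1
        data = [0, 1, 2, 3]
        # x lives only in the class namespace, which nested scopes
        # cannot see, so looking it up fails here.
        new_data = [i + x for i in data]
except NameError as exc:
    class_error = str(exc)

print(function_result, module_result, class_error)
```

Only the class-body version fails, because class-level names are not available to nested scopes as locals, closures or globals.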

If you go back to my answer to that question, look for the section titled The (small) exception; or, why one part may still work; I tried to cover this specific point there:

There's one part of a comprehension or generator expression that executes in the surrounding scope, regardless of Python version. That would be the expression for the outermost iterable.
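A small sketch of how narrow that exception is: only the iterable of the first for clause is evaluated in the class body. The class names `Works` and `Fails` and the variable `nested_error` are hypothetical, chosen just for this example.

```python
class Works:
    data = [0, 1, 2, 3]
    # data is the outermost iterable, so it is evaluated in the class body.
    new_data = [i for i in data]

nested_error = None
try:
    class Fails:
        data = [0, 1, 2, 3]
        # Here data is the iterable of the second for clause, so it is
        # looked up inside the comprehension's scope and cannot be found.
        pairs = [(i, j) for i in range(2) for j in data]
except NameError as exc:
    nested_error = str(exc)

print(Works.new_data)
print(nested_error)
```

Moving `data` from the first for clause to the second is enough to bring the NameError back.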

Martijn Pieters
  • This is a nice answer but it disassembles a function, not a class. It does not make sense for a class... Isn't `x` a local variable within the scope of `A`? – fmv1992 Mar 23 '21 at 12:04
  • @fmv1992: the OP links to a question that I answered covering how this works in classes. This answer is to be read **in addition to my answer there**, so I didn't see any point in repeating myself. The disassembly here illustrates why `data` **does** work; I included it to show *how* the list comprehension is implemented and why `data` is not considered a variable in a parent scope. The disassembly is the same regardless of scope, and the point was to show how this works under the hood. – Martijn Pieters Mar 23 '21 at 12:12
-1

The dis.dis answer is interesting but it does not actually explain why that happens. Here it is, from a similar error:

If a name binding operation occurs anywhere within a code block, all uses of the name within the block are treated as references to the current block. This can lead to errors when a name is used within a block before it is bound. This rule is subtle. Python lacks declarations and allows name-binding operations to occur anywhere within a code block. The local variables of a code block can be determined by scanning the entire text of the block for name binding operations.

So, in simple terms: the list comprehension cannot refer to x because the class block is not yet bound at that point. There is no way to refer to x from inside it, neither as x alone nor as A.x.

Source: Python docs.

fmv1992
  • Why this happens was already answered in the question the OP links to. Note that this is an answer I wrote: [Accessing class variables from a list comprehension in the class definition](https://stackoverflow.com/a/13913933); I cover the same issue there: *Everything in the list comprehension is run in a separate scope*, with the list comprehension source being the exception. – Martijn Pieters Mar 23 '21 at 12:13
  • More importantly, the OP is asking **why this doesn't apply to `data`**, as opposed to `x`. And the reason is that the result of evaluating the `data` expression, passed to `iter()`, is passed into the 'function' that is used to implement the list comprehension scope, and so is not subject to the rules about parent scopes. – Martijn Pieters Mar 23 '21 at 12:14
  • I didn't see the other question. I believe it would be better if the answers were self-contained, otherwise, we go from question A to B to C... But nice, thanks for the clarification. – fmv1992 Mar 23 '21 at 13:24
  • I firmly believe my answer *is* self contained, it covers the specific question asked here. I was giving you the context you may have missed that's in the *question*. – Martijn Pieters Mar 23 '21 at 13:25