
The best way to explain my question is with an example:

example.py:

class A(object):
    integers = [1, 2, 3]
    singles = [i for i in integers]

class B(object):
    integers = [1, 2, 3]
    pairs = [(i, j) for i in integers for j in integers]

When I run this under Python 2 it works fine, but under Python 3 I get a NameError for class B (but not for class A):

$ python example.py
Traceback (most recent call last):
  File "example.py", line 6, in <module>
    class B(object):
  File "example.py", line 8, in B
    pairs = [(i, j) for i in integers for j in integers]
  File "example.py", line 8, in <listcomp>
    pairs = [(i, j) for i in integers for j in integers]
NameError: global name 'integers' is not defined

Why does only class B raise a NameError and why only under Python 3?

martineau
sn6uv
  • Are you sure? http://ideone.com/n87bWm – karthikr Nov 22 '13 at 04:25
  • @karthikr Seems to be related to the fact it's a class variable: http://ideone.com/7XfDwQ – jpmc26 Nov 22 '13 at 04:27
  • Tested in python 3.3.2 works fine. – aIKid Nov 22 '13 at 04:28
  • In 3, listcomps scope like genexps. If you replace `pairs = [(i,j) .. etc` with `pairs = list((i,j) for i in integers for j in integers)`, you should see the same `NameError` in 2. – DSM Nov 22 '13 at 04:29
  • @BrenBarn, that question isn't quite the same thing. Perhaps the bug is that I **can** access `integers` in class A? – sn6uv Nov 22 '13 at 04:41
  • Works in 2.5, 2.6 and 2.7. Fails in 3.0, 3.1, 3.2, 3.3 and 3.4alpha4. – dstromberg Nov 22 '13 at 04:46
  • `itertools.product` works well for this desired usage pattern. For example, `pairs = [(i, j) for i, j in itertools.product(integers, integers)]` should work. (And better written as simply `pairs = itertools.product(…)` in this case.) – Reece Jul 12 '16 at 22:06

1 Answer


Class scopes are a bit strange in Python 3, but it's for a good reason.

In Python 2, the iteration variables (i and j in your examples) leaked out of list comprehensions and ended up in the enclosing scope. This is because list comprehensions were added early in Python 2's design and were modeled on explicit for loops. To see how surprising this is, check the values of B.i and B.j in Python 2, where you didn't get an error!

In Python 3, list comprehensions were changed to prevent this leaking. They are now implemented with a function (which has its own scope) that is called to produce the list value. This makes them work the same as generator expressions, which have always been functions under the covers.
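A quick way to check the Python 3 side of this yourself. This is only a minimal sketch; the class name `Leak` is made up for illustration:

```python
class Leak:
    integers = [1, 2, 3]
    singles = [i for i in integers]

# In Python 3 the loop variable stays inside the comprehension,
# so it never becomes an attribute of the class.
leaked = hasattr(Leak, "i")
print(leaked)
```

Under Python 2 the same check would find `i` on the class, because the loop variable was stored in the class body's namespace.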

A consequence of this is that in a class, a list comprehension usually can't see any class variables. This parallels a method not being able to see class variables directly (only through self or the explicit class name). For example, calling the method in the class below will give the same NameError exception you are seeing in your list comprehension:

class Foo:
    classvar = "bar"
    def blah(self):
        print(classvar) # raises "NameError: global name 'classvar' is not defined"

There is an exception, however: the sequence being iterated over by the first for clause of a list comprehension is evaluated outside of the inner function. This is why your A class works in Python 3. Python does this so that generator expressions can catch non-iterable objects immediately (rather than only when next is called on them and their code runs).

But it doesn't work for the inner for clause in the two-level comprehension in class B.

You can see the difference if you disassemble some functions that create list comprehensions using the dis module:

def f(lst):
    return [i for i in lst]

def g(lst):
    return [(i, j) for i in lst for j in lst]

Here's the disassembly of f:

>>> dis.dis(f)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x0000000003CCA1E0, file "<pyshell#374>", line 2>) 
              3 LOAD_CONST               2 ('f.<locals>.<listcomp>') 
              6 MAKE_FUNCTION            0 
              9 LOAD_FAST                0 (lst) 
             12 GET_ITER             
             13 CALL_FUNCTION            1 (1 positional, 0 keyword pair) 
             16 RETURN_VALUE       

The first three lines show f loading up a precompiled code block and creating a function out of it (it names it f.<locals>.<listcomp>). This is the function used to make the list.

The next two lines show the lst variable being loaded and an iterator being made from it. This is happening within f's scope, not the inner function's. Then the <listcomp> function is called with that iterator as its argument.

This is comparable to class A. It gets the iterator from the class variable integers, just like you can use other kinds of references to previous class members in the definition of a new member.
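Written out by hand, class A behaves roughly like the sketch below. This is an approximation, not what the compiler literally emits, and the helper name `_listcomp` is a hypothetical stand-in for the hidden `<listcomp>` function:

```python
class A:
    integers = [1, 2, 3]

    # Hypothetical stand-in for the hidden <listcomp> function.
    def _listcomp(iterator):
        result = []
        for i in iterator:
            result.append(i)
        return result

    # iter(integers) is evaluated here, in the class body, where
    # `integers` is an ordinary name in the class namespace.
    singles = _listcomp(iter(integers))
    del _listcomp  # keep the helper out of the finished class

print(A.singles)
```

The helper's body never mentions `integers`, so it never needs to look into the class namespace.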

Now, compare the disassembly of g, which makes pairs by iterating over the same list twice:

>>> dis.dis(g)
  2           0 LOAD_CLOSURE             0 (lst) 
              3 BUILD_TUPLE              1 
              6 LOAD_CONST               1 (<code object <listcomp> at 0x0000000003CCA810, file "<pyshell#377>", line 2>) 
              9 LOAD_CONST               2 ('g.<locals>.<listcomp>') 
             12 MAKE_CLOSURE             0 
             15 LOAD_DEREF               0 (lst) 
             18 GET_ITER             
             19 CALL_FUNCTION            1 (1 positional, 0 keyword pair) 
             22 RETURN_VALUE         

This time, it builds a closure with the code object, rather than a basic function. A closure is a function with some "free" variables that refer to things in the enclosing scope. For the <listcomp> function in g, this works just fine, since its scope is a normal one. However, when you try to use the same sort of comprehension in class B, the closure fails, since classes don't let functions they contain see into their scopes in that way (as demonstrated with the Foo class above).
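The same kind of hand-written approximation for class B shows where the lookup goes wrong. Again, `_listcomp` is a hypothetical name standing in for the hidden function, and the try/except is only there so the sketch runs to completion:

```python
try:
    class B:
        integers = [1, 2, 3]

        # Hypothetical stand-in for the hidden <listcomp> function.
        def _listcomp(iterator):
            result = []
            for i in iterator:
                # `integers` is looked up from inside the function, and
                # that lookup skips the class body entirely.
                for j in integers:
                    result.append((i, j))
            return result

        # The outermost iterable is still evaluated in the class body.
        pairs = _listcomp(iter(integers))
except NameError as exc:
    error = exc
else:
    error = None

print(type(error).__name__)
```

The call itself starts fine, because `iter(integers)` is evaluated in the class body; the failure only happens once the inner loop tries to resolve `integers` from inside the function.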

It's worth noting that not only inner sequence values cause this issue. As in the previous question linked to by BrenBarn in a comment, you'll have the same issue if a class variable is referred to elsewhere in the list comprehension:

class C:
    num = 5
    products = [i * num for i in range(10)] # raises a NameError about num

You don't, however, get an error from multi-level list comprehensions where the inner for (or if) clauses only refer to the results of the preceding loops. This is because those values aren't part of a closure, just local variables inside the <listcomp> function's scope.

class D:
    nested = [[1, 2, 3], [4, 5, 6]]
    flattened = [item for inner in nested for item in inner] # works!
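Taken together, these rules suggest a couple of possible ways to build the pairs from class B. Both are sketches rather than recommendations: the first relies on the outermost iterable being evaluated in the class body, and the second follows the `itertools.product` suggestion from the comments on the question. The attribute names `pairs_a` and `pairs_b` are made up for illustration:

```python
import itertools


class B:
    integers = [1, 2, 3]

    # Option 1: pass the list in through the outermost iterable, which is
    # evaluated in the class body; `ints` is then local to the comprehension.
    pairs_a = [(i, j) for ints in [integers] for i in ints for j in ints]

    # Option 2: both arguments are evaluated in the class body before
    # itertools.product ever runs.
    pairs_b = list(itertools.product(integers, integers))


print(B.pairs_a == B.pairs_b)
```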

Like I said, class scopes are a bit strange.

khelwood
Blckknght
  • I don't quite understand. If a "list comprehension can't see any class variables", then why can I see `integers` in class A? Is it because I can access integers when 'calling' the outer list comprehension function but not within its body? – sn6uv Nov 22 '13 at 04:45
  • @sn6uv: Because the object iterated over in the first `for` clause of a comprehension is evaluated in the outside scope, rather than the scope created by the comprehension. If you're thinking this is weird, you're right. This behavior is a consequence of a design decision made to help detect bugs earlier in generator expressions. – user2357112 Nov 22 '13 at 04:49
  • @sn6uv: Hmm, I seem to have answered the wrong part of the question. I thought you were asking the same thing as in the other question BrenBarn linked as a duplicate in a top level comment. It turns out to be something a bit stranger. I'll edit to update. – Blckknght Nov 22 '13 at 05:01
  • *In Python 2, the iteration variables (i and j in your examples) leaked out of list comprehensions...* --- indeed, I've just discovered it. This is ridiculous! – Sergey Orshanskiy Dec 13 '13 at 22:21
  • Nit: classes don't define a scope. The body of a `class` statement defines a temporary *namespace* outside of the scoping system, which is why it's not visible to the list comprehension. – chepner Jul 05 '22 at 12:34