Why do Python yield statements form a closure?

Question

I have two functions that return a list of functions. Each function takes a number x and adds i to it, where i is an integer increasing from 0 to 9.

def test_without_closure():
    return [lambda x: x+i for i in range(10)]



def test_with_yield():
    for i in range(10):
        yield lambda x: x+i

I would expect test_without_closure to return a list of 10 functions that each add 9 to x since i's value is 9.

print(sum(t(1) for t in test_without_closure()))  # prints 100

I expected that test_with_yield would also have the same behavior, but it correctly creates the 10 functions.

print(sum(t(1) for t in test_with_yield()))  # prints 55

My question is, does yielding form a closure in Python?

Try `sum(t(1) for t in list(test_with_yield()))`. You'll get `100`. When you are evaluating `t(1)` in your second sum, the generator has not yet advanced `i` to the next value. The execution of `test_with_yield` is paused and stored until the next value is requested. — Patrick Haugh, Nov 18 '16 at 20:01
Think of python's closures as always doing *reference* copy, not *value* copy, and you'll understand the behaviour... — Bakuriu, Nov 18 '16 at 22:39

sepp2k · Accepted Answer · 2017-04-04T15:05:55.530

29

Yielding does not create a closure in Python; lambdas do. The reason that you get all 9s in "test_without_closure" isn't that there's no closure. If there weren't, you wouldn't be able to access i at all. The problem is that all closures contain a reference¹ to the same i variable, which will be 9 at the end of the function.

This situation isn't much different in test_with_yield. Why, then, do you get different results? Because yield suspends the run of the function, so it's possible to use the yielded lambdas before the end of the function is reached, i.e. before i is 9. To see what this means, consider the following two examples of using test_with_yield:

[f(0) for f in test_with_yield()]
# Result: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

[f(0) for f in list(test_with_yield())]
# Result: [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]

What's happening here is that the first example yields a lambda (while i is 0), calls it (i is still 0), then advances the function until another lambda is yielded (i is now 1), calls the lambda, and so on. The important thing is that each lambda is called before the control flow returns to test_with_yield (i.e. before the value of i changes).

In the second example, we first create a list. So the first lambda is yielded (i is 0) and put into the list, the second lambda is created (i is now 1) and put into the list ... until the last lambda is yielded (i is now 9) and put into the list. And then we start calling the lambdas. So since i is now 9, all lambdas return 9.

¹ The important bit here is that closures hold references to variables, not copies of the value they held when the closure was created. This way, if you assign to the variable inside a lambda (or an inner function, which creates closures the same way that lambdas do), this will also change the variable outside of the lambda, and if you change the value outside, that change will be visible inside the lambda.
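The footnote's point about references can be illustrated with a small sketch. The names `demo`, `read`, `write`, and `value` are hypothetical and not from the question, and the example assumes Python 3, since it relies on `nonlocal` to rebind the enclosing variable from an inner function.

```python
def demo():
    value = "old"

    def read():
        # Looks up `value` in the enclosing scope every time it is called.
        return value

    def write(new):
        # Python 3 only: rebind the enclosing variable instead of
        # creating a new local one.
        nonlocal value
        value = new

    seen = [read()]             # closure sees the current binding
    value = "changed outside"   # rebinding outside the closure...
    seen.append(read())         # ...is visible inside it
    write("changed inside")     # rebinding inside the closure...
    seen.append(value)          # ...is visible outside it
    return seen


result = demo()
print(result)  # ['old', 'changed outside', 'changed inside']
```

Both inner functions share the same variable with `demo`, so neither ever holds a stale copy.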

edited Apr 04 '17 at 15:05

answered Nov 18 '16 at 20:02

sepp2k

What would you do to get a list of functions like `[lambda x: x+0, lambda x: x+1, ...]` – Patrick Haugh Nov 18 '16 at 20:03
Is there a better way than `[(lambda j: lambda x: x+j)(i) for i in range(10)]` – Patrick Haugh Nov 18 '16 at 20:06
See here: http://docs.python-guide.org/en/latest/writing/gotchas/ section "Late Binding Closures" – VPfB Nov 18 '16 at 20:07
@PatrickHaugh Preferably use a language where I can define local variables at block scope, but to give a more constructive answer: Use an inner function to create a new scope. Something like `def make_lambda(i): return lambda x: x+i` and then `return [make_lambda(i) for i in range(10)]`. – sepp2k Nov 18 '16 at 20:07
So yield actually evaluates what is being yielded when the yield keyword is used? – Alex Nov 18 '16 at 20:07
Doing just `[t for t in test_without_closure()]` shows me that all the lambda functions have different references, which means they are created at each iteration. Then how do all of them hold just the last value of `i`? – Moinuddin Quadri Nov 18 '16 at 20:08
@PatrickHaugh `[lambda x, i=i: x+i for i in range(10)]` would return the expected lambdas. – Ashwini Chaudhary Nov 18 '16 at 20:08
@MoinuddinQuadri The lambdas don't hold any value of i, they hold a reference to i. So if i changes, that change is visible through that reference. – sepp2k Nov 18 '16 at 20:09
@sepp2k can you edit into the answer that the lambdas hold a reference? it's a crucial piece to understand it. Nice explanation! – themistoklik Nov 18 '16 at 20:22
@themistoklik It's there already: "The problem is that all closures contain a reference to the same i variable", but I'll expand a bit on that. – sepp2k Nov 18 '16 at 20:30
It is true. Verified it using `id()` and added the demonstration [here](http://stackoverflow.com/a/40686011/2063361) – Moinuddin Quadri Nov 18 '16 at 21:13
It is not true that 'The reason that you get all 9s in "test_without_closure" isn't that there's no closure.' The lambda *is* a closure over i in *both* functions. – jacg Nov 19 '16 at 00:21
@jacg, I think you've misread the double negative. The questioner evidently believed that there's no closure, but that's wrong, so it isn't the reason for the all-9s result; the reason is that all the lambdas are closures, as you both stated. – deltab Nov 19 '16 at 01:03
@Alex `yield` yields a value, so its expression must be evaluated when execution hits the `yield` statement. – jamesdlin Nov 19 '16 at 02:34
Direct link to the section VPfB mentioned: "[Late Binding Closures](http://docs.python-guide.org/en/latest/writing/gotchas/#late-binding-closures)". – Kevin J. Chase Nov 19 '16 at 08:19
@deltab You're absolutely right, my misreading of the double negative made me write a pointless comment. I read the answer as claiming that there was no closure. Apologies to all. – jacg Nov 19 '16 at 11:31

jacg · Answer 2 · 2016-11-19T00:08:04.363

No, yielding has nothing to do with closures.

Here is how to recognize closures in Python: a closure is

a function
in which an unqualified name lookup is performed
no binding of the name exists in the function itself
but a binding of the name exists in the local scope of a function whose definition surrounds the definition of the function in which the name is looked up.
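As a rough illustration of these criteria, a function's code object lists the names it takes from an enclosing function scope in `co_freevars`, and the function's `__closure__` holds the matching cells. The functions `outer`, `inner`, and `plain` below are hypothetical examples, not taken from the question.

```python
def outer():
    i = 42                  # bound in the local scope of the enclosing function

    def inner(x):
        return x + i        # i is looked up here but never bound in inner

    return inner


def plain(x):
    return x + 1            # no names taken from an enclosing function


closure = outer()
free_names = closure.__code__.co_freevars
cell_value = closure.__closure__[0].cell_contents

print(free_names)           # ('i',)
print(cell_value)           # 42
print(plain.__closure__)    # None
```

By this check, the lambdas in both of the question's functions are closures over `i`, whether or not `yield` is involved.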

The reason for the difference in behaviour you observe is laziness, rather than anything to do with closures. Compare and contrast the following:

def lazy():
    return ( lambda x: x+i for i in range(10) )

def immediate():
    return [ lambda x: x+i for i in range(10) ]

def also_lazy():
    for i in range(10):
        yield lambda x:x+i

not_lazy_any_more = list(also_lazy())

print( [ f(10) for f in lazy()             ] ) # 10 -> 19
print( [ f(10) for f in immediate()        ] ) # all 19
print( [ f(10) for f in also_lazy()        ] ) # 10 -> 19
print( [ f(10) for f in not_lazy_any_more  ] ) # all 19

Notice that the first and third examples give identical results, as do the second and the fourth. The first and third are lazy, the second and fourth are not.

Note that all four examples provide a bunch of closures over the most recent binding of i. It's just that in the first and third cases you evaluate the closures before rebinding i (even before you've created the next closure in the sequence), while in the second and fourth cases you first wait until i has been rebound to 9 (after you've created and collected all the closures you are going to make), and only then evaluate the closures.
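One way to watch the rebinding happen is to drive the generator by hand with `next()`. This is only a sketch: it repeats `also_lazy` from above so that it runs on its own, and the names `gen`, `first`, `second`, and `r1` to `r3` are made up for illustration.

```python
def also_lazy():
    for i in range(10):
        yield lambda x: x + i


gen = also_lazy()

first = next(gen)     # generator pauses at the first yield, i is 0
r1 = first(10)        # evaluated while i is still 0

second = next(gen)    # resuming the generator rebinds i to 1
r2 = first(10)        # the *same* lambda now sees i == 1
r3 = second(10)       # and so does the second lambda

print(r1, r2, r3)     # 10 11 11
```

The first lambda returns a different result after the generator advances, which is what you would expect if every lambda refers to the one shared variable i rather than to a snapshot of it.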

score 3 · Answer 3 · edited May 23 '17 at 12:02

Adding to @sepp2k's answer: you're seeing these two different behaviours because the lambda functions being created don't store i's value. At the time such a function is created, all it knows is that it has to fetch i's value from the local scope, the enclosing scope, the global scope, or builtins.

In this particular case it is a closure variable (enclosing scope), and its value changes with each iteration.

Check out LEGB in Python.

Now, why does the second one work as expected but not the first?

It's because each time you yield a lambda function, execution of the generator function pauses at that point, so when you invoke the lambda it uses the value of i at that moment. In the first case, i has already advanced to 9 before any of the functions are invoked.

To prove it, you can fetch the current value of i from the cell contents in __closure__:

>>> for func in test_with_yield():
...     print("Current value of i is {}".format(func.__closure__[0].cell_contents))
...     print(func(9))
...
Current value of i is 0
9
Current value of i is 1
10
Current value of i is 2
11
Current value of i is 3
12
Current value of i is 4
13
Current value of i is 5
14
Current value of i is 6
15
...

If you instead store the functions somewhere and call them later, you will see the same behaviour as in the first case:

from itertools import islice

funcs = []
# Take only the first four lambdas; the generator stays paused with i == 3.
for func in islice(test_with_yield(), 4):
    print("Current value of i is {}".format(func.__closure__[0].cell_contents))
    funcs.append(func)

print('-' * 20)

# Every stored lambda shares the same cell, so all report the latest i.
for func in funcs:
    print("Now value of i is {}".format(func.__closure__[0].cell_contents))

Output:

Current value of i is 0
Current value of i is 1
Current value of i is 2
Current value of i is 3
--------------------
Now value of i is 3
Now value of i is 3
Now value of i is 3
Now value of i is 3

Example used by Patrick Haugh in comments also shows the same thing: sum(t(1) for t in list(test_with_yield()))

Correct way:

Assign i as a default value of a lambda parameter. Default values are evaluated once, when the function is created, and the default won't be rebound afterwards (although a mutable default object can still be modified in place). i is now a local variable of each lambda function.

>>> def test_without_closure():
...     return [lambda x, i=i: x+i for i in range(10)]
...
>>> sum(t(1) for t in test_without_closure())
55
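Two other ways to bind i early are sketched below, as alternatives rather than as the recommended fix. The first is the factory function suggested by sepp2k in the comments (`make_lambda` there, called `make_adder` here); the second uses the standard library's `functools.partial` with `operator.add`. The names `make_adder`, `adders_via_factory`, and `adders_via_partial` are hypothetical.

```python
from functools import partial
from operator import add


def make_adder(i):
    # Each call creates a new scope with its own i, so every lambda
    # closes over a different variable.
    return lambda x: x + i


def adders_via_factory():
    return [make_adder(i) for i in range(10)]


def adders_via_partial():
    # partial stores the current value of i as the first argument of add,
    # so calling the result with x computes add(i, x).
    return [partial(add, i) for i in range(10)]


factory_total = sum(f(1) for f in adders_via_factory())
partial_total = sum(f(1) for f in adders_via_partial())

print(factory_total)  # 55
print(partial_total)  # 55
```

Unlike the default-argument version, neither variant adds a second parameter that a caller could override by accident.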
