216

Why is the output of the following two list comprehensions different, even though f and the lambda function are the same?

f = lambda x: x*x
[f(x) for x in range(10)]

and

[lambda x: x*x for x in range(10)]

Mind you, both type(f) and type(lambda x: x*x) return the same type.

martineau
  • 119,623
  • 25
  • 170
  • 301
user763191
  • 2,187
  • 3
  • 13
  • 4
  • `[lambda x: x*x for x in range(10)]` is faster than the first one, since it does not call an outside loop function, f repeatedly. – riza May 20 '11 at 18:50
  • @Selinap: ...no, instead you're creating a brand spanking new function each and every time through the loop. ...and the overhead of creating this new function, then calling is a little slower (on my system anyway). – Gerrat May 20 '11 at 19:00
  • @Gerrat: Even with overhead, it is still faster. But, of course `[x*x for x in range(10)]` is better. – riza May 20 '11 at 19:13
  • 39
    I just entered here to get google foobar access :) – Gal Margalit Aug 27 '15 at 16:42

6 Answers6

354

The first one creates a single lambda function and calls it ten times.

The second one doesn't call the function. It creates 10 different lambda functions. It puts all of those in a list. To make it equivalent to the first you need:

[(lambda x: x*x)(x) for x in range(10)]

Or better yet:

[x*x for x in range(10)]
Guillaume Jacquenot
  • 11,217
  • 6
  • 43
  • 49
Winston Ewert
  • 44,070
  • 10
  • 68
  • 83
179

This question touches a very stinking part of the "famous" and "obvious" Python syntax - what takes precedence, the lambda, or the for of list comprehension.

I don't think the purpose of the OP was to generate a list of squares from 0 to 9. If that was the case, we could give even more solutions:

squares = []
for x in range(10): squares.append(x*x)
  • this is the good ol' way of imperative syntax.

But it's not the point. The point is W(hy)TF is this ambiguous expression so counter-intuitive? And I have an idiotic case for you at the end, so don't dismiss my answer too early (I had it on a job interview).

So, the OP's comprehension returned a list of lambdas:

[(lambda x: x*x) for x in range(10)]

This is of course just 10 different copies of the squaring function, see:

>>> [lambda x: x*x for _ in range(3)]
[<function <lambda> at 0x00000000023AD438>, <function <lambda> at 0x00000000023AD4A8>, <function <lambda> at 0x00000000023AD3C8>]

Note the memory addresses of the lambdas - they are all different!

You could of course have a more "optimal" (haha) version of this expression:

>>> [lambda x: x*x] * 3
[<function <lambda> at 0x00000000023AD2E8>, <function <lambda> at 0x00000000023AD2E8>, <function <lambda> at 0x00000000023AD2E8>]

See? 3 time the same lambda.

Please note, that I used _ as the for variable. It has nothing to do with the x in the lambda (it is overshadowed lexically!). Get it?

I'm leaving out the discussion, why the syntax precedence is not so, that it all meant:

[lambda x: (x*x for x in range(10))]

which could be: [[0, 1, 4, ..., 81]], or [(0, 1, 4, ..., 81)], or which I find most logical, this would be a list of 1 element - a generator returning the values. It is just not the case, the language doesn't work this way.

BUT What, If...

What if you DON'T overshadow the for variable, AND use it in your lambdas???

Well, then crap happens. Look at this:

[lambda x: x * i for i in range(4)]

this means of course:

[(lambda x: x * i) for i in range(4)]

BUT it DOESN'T mean:

[(lambda x: x * 0), (lambda x: x * 1), ... (lambda x: x * 3)]

This is just crazy!

The lambdas in the list comprehension are a closure over the scope of this comprehension. A lexical closure, so they refer to the i via reference, and not its value when they were evaluated!

So, this expression:

[(lambda x: x * i) for i in range(4)]

IS roughly EQUIVALENT to:

[(lambda x: x * 3), (lambda x: x * 3), ... (lambda x: x * 3)]

I'm sure we could see more here using a python decompiler (by which I mean e.g. the dis module), but for Python-VM-agnostic discussion this is enough. So much for the job interview question.

Now, how to make a list of multiplier lambdas, which really multiply by consecutive integers? Well, similarly to the accepted answer, we need to break the direct tie to i by wrapping it in another lambda, which is getting called inside the list comprehension expression:

Before:

>>> a = [(lambda x: x * i) for i in (1, 2)]
>>> a[1](1)
2
>>> a[0](1)
2

After:

>>> a = [(lambda y: (lambda x: y * x))(i) for i in (1, 2)]
>>> a[1](1)
2
>>> a[0](1)
1

(I had the outer lambda variable also = i, but I decided this is the clearer solution - I introduced y so that we can all see which witch is which).

Edit 2019-08-30:

Following a suggestion by @josoler, which is also present in an answer by @sheridp - the value of the list comprehension "loop variable" can be "embedded" inside an object - the key is for it to be accessed at the right time. The section "After" above does it by wrapping it in another lambda and calling it immediately with the current value of i. Another way (a little bit easier to read - it produces no 'WAT' effect) is to store the value of i inside a partial object, and have the "inner" (original) lambda take it as an argument (passed supplied by the partial object at the time of the call), i.e.:

After 2:

>>> from functools import partial
>>> a = [partial(lambda y, x: y * x, i) for i in (1, 2)]
>>> a[0](2), a[1](2)
(2, 4)

Great, but there is still a little twist for you! Let's say we wan't to make it easier on the code reader, and pass the factor by name (as a keyword argument to partial). Let's do some renaming:

After 2.5:

>>> a = [partial(lambda coef, x: coef * x, coef=i) for i in (1, 2)]
>>> a[0](1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: <lambda>() got multiple values for argument 'coef'

WAT?

>>> a[0]()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: <lambda>() missing 1 required positional argument: 'x'

Wait... We're changing the number of arguments by 1, and going from "too many" to "too few"?

Well, it's not a real WAT, when we pass coef to partial in this way, it becomes a keyword argument, so it must come after the positional x argument, like so:

After 3:

>>> a = [partial(lambda x, coef: coef * x, coef=i) for i in (1, 2)]
>>> a[0](2), a[1](2)
(2, 4)

I would prefer the last version over the nested lambda, but to each their own...

Edit 2020-08-18:

Thanks to commenter dasWesen, I found out that this stuff is covered in the Python documentation: https://docs.python.org/3.4/faq/programming.html#why-do-lambdas-defined-in-a-loop-with-different-values-all-return-the-same-result - it deals with loops instead of list comprehensions, but the idea is the same - global or nonlocal variable access in the lambda function. There's even a solution - using default argument values (like for any function):

>>> a = [lambda x, coef=i: coef * x for i in (1, 2)]
>>> a[0](2), a[1](2)
(2, 4)

This way the coef value is bound to the value of i at the time of function definition (see James Powell's talk "Top To Down, Left To Right", which also explains why mutable default values are shunned).

Tomasz Gandor
  • 8,235
  • 2
  • 60
  • 55
  • 47
    that is a cruel and unusual job interview question. – szeitlin Dec 11 '15 at 17:49
  • 1
    If my colleague didn't ask, I probably would never search for this answer – piggybox Nov 10 '16 at 01:33
  • 11
    Wow. I just got bitten badly by this absurd behavior. Thank you for your post! – Ant Oct 20 '17 at 11:31
  • 1
    Excellent answer. I just ran into this issue as well. On one hand it does point to a Python limitation, but on the other hand it might also be a code smell indicator. I'm using this solution for a toy project, but it might be a signal to restructure in a production environment. – ahota Jun 27 '18 at 20:33
  • 2
    For the sake of clarity and completeness you could write the last list comprehension like: `[partial(lambda i, x: i * x, i) for i in (1, 2)]` – josoler Aug 16 '19 at 11:15
  • Thanks @josoler, I added a section on using `partial` to my answer. – Tomasz Gandor Aug 30 '19 at 10:11
  • Just for completeness' sake: The function objects generated by the nested lambda solution are located at equally crazy addresses, e.g. ```.... at 0x1a1db776a8>```. – Marco Sep 30 '19 at 02:49
  • 4
    For many purposes, I think there is a python-internal way around it: Lambdas with default values. This should work: `[lambda x, i=i: x * i for i in range(4)]` https://docs.python.org/3.4/faq/programming.html#why-do-lambdas-defined-in-a-loop-with-different-values-all-return-the-same-result . That's good if you don't mind the additional, overwritable parameter. – dasWesen Aug 15 '20 at 07:27
  • 1
    For the second time in my coding career, this issue has cost me a few hours. I found my way to this answer and discovered I'd already upvoted it some time back. there's no good solution here but at least your response has the appropriate degree of indignation to it. – Chris J Harris Sep 27 '22 at 03:35
23

The big difference is that the first example actually invokes the lambda f(x), while the second example doesn't.

Your first example is equivalent to [(lambda x: x*x)(x) for x in range(10)] while your second example is equivalent to [f for x in range(10)].

Gabe
  • 84,912
  • 12
  • 139
  • 238
15

The first one

f = lambda x: x*x
[f(x) for x in range(10)]

runs f() for each value in the range so it does f(x) for each value

the second one

[lambda x: x*x for x in range(10)]

runs the lambda for each value in the list, so it generates all of those functions.

zellio
  • 31,308
  • 1
  • 42
  • 61
14

People gave good answers but forgot to mention the most important part in my opinion: In the second example the X of the list comprehension is NOT the same as the X of the lambda function, they are totally unrelated. So the second example is actually the same as:

[Lambda X: X*X for I in range(10)]

The internal iterations on range(10) are only responsible for creating 10 similar lambda functions in a list (10 separate functions but totally similar - returning the power 2 of each input).

On the other hand, the first example works totally different, because the X of the iterations DO interact with the results, for each iteration the value is X*X so the result would be [0,1,4,9,16,25, 36, 49, 64 ,81]

Tidhar seifer
  • 141
  • 1
  • 5
8

The other answers are correct, but if you are trying to make a list of functions, each with a different parameter, that can be executed later, the following code will do that:

import functools
a = [functools.partial(lambda x: x*x, x) for x in range(10)]

b = []
for i in a:
    b.append(i())

In [26]: b
Out[26]: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

While the example is contrived, I found it useful when I wanted a list of functions that each print something different, i.e.

import functools
a = [functools.partial(lambda x: print(x), x) for x in range(10)]

for i in a:
    i()
sheridp
  • 1,386
  • 1
  • 11
  • 24