While following this tutorial to learn about lambdas, I ran into this example that calculates primes (Python 2.x):

nums = range(2,50)
for i in range(2,8):
    nums = filter(lambda x: x == i or x % i, nums)

print (list(nums))

prints

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

However, running the same code in Python 3.4 produced an unexpected result:

nums = range(2,50)
for i in range(2,8):
    nums = filter(lambda x: x == i or x % i , nums)

print(list(nums))

prints

[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20, 22, 23, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 36, 37, 38, 39, 40, 41, 43, 44, 45, 46, 47, 48]

I don't understand why there is a difference. I know filter returns a filter object in Python 3 (instead of a list), but as far as I know that should not affect the outcome.

Unrolling the for loop by hand produces the right result:

>>> nums = range(2,50)
>>> nums = filter(lambda x: x == 2 or x % 2, nums)
>>> nums = filter(lambda x: x == 3 or x % 3, nums)
>>> nums = filter(lambda x: x == 4 or x % 4, nums)
>>> nums = filter(lambda x: x == 5 or x % 5, nums)
>>> nums = filter(lambda x: x == 6 or x % 6, nums)
>>> nums = filter(lambda x: x == 7 or x % 7, nums)
>>> print(list(nums))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

I hope someone could enlighten me about this as I'm curious about what is going on.

ScootCork

1 Answer

This behavior results from a combination of two things. One is that, in Python 3, filter (and range) return objects that yield one value at a time, rather than precomputing all values. The other is that, in both Python 2 and 3, functions that reference names in enclosing scopes create closures over names, not values.
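The two ingredients can be seen separately in a small sketch. The names demo_laziness, demo_late_binding, is_even and get_n are made up for illustration and do not come from the question:

```python
def demo_laziness():
    calls = []

    def is_even(x):
        calls.append(x)          # record every real call to the predicate
        return x % 2 == 0

    evens = filter(is_even, range(6))
    calls_before = list(calls)   # snapshot: nothing has been consumed yet
    result = list(evens)         # consuming the filter runs the predicate
    return calls_before, result, list(calls)


def demo_late_binding():
    n = 1
    get_n = lambda: n            # refers to the variable n, not to the value 1
    n = 2
    return get_n()               # n is looked up now, at call time


calls_before, result, calls_after = demo_laziness()
late_value = demo_late_binding()
print(calls_before, result, calls_after, late_value)
# [] [0, 2, 4] [0, 1, 2, 3, 4, 5] 2
```

Creating the filter calls the predicate zero times, and the lambda reports the value n has when it is called, not the value it had when the lambda was written.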

In your Python 3 version, on each loop iteration, you create a filter using a function (your lambda). Because the filter is "lazy", it stores the function and calls it later, whenever you ask for the filtered values. (In this case, that's when you call list(nums).) But that function has a reference to the variable i, which lives outside the function. So when filter calls the function, the function looks up i and sees the value it has at the time you get the filtered values (i.e., when you call list(nums)), not the value it had when you created the filter. This is why your result is missing all multiples of 7 (except 7) and nothing else: 7 is the last value in your i loop, so all six of your lambdas are checking for multiples of 7 by the time they are called.
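To check that explanation, here is a sketch that wraps the question's loop in a function (buggy_sieve and sevens_only are illustrative names, not from the question) and compares it with a single test against 7:

```python
def buggy_sieve(limit=50):
    nums = range(2, limit)
    for i in range(2, 8):
        # Each lambda refers to the variable i, not to its current value.
        nums = filter(lambda x: x == i or x % i, nums)
    # i is 7 here, so all six stacked filters test against 7.
    return list(nums)


def sevens_only(limit=50):
    # The same range with only multiples of 7 (other than 7) removed.
    return [x for x in range(2, limit) if x == 7 or x % 7]


lazy_result = buggy_sieve()
print(lazy_result == sevens_only())  # True
```

Stacking six filters that all see i == 7 is equivalent to applying the "multiple of 7" test once.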

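One way to reproduce the Python 2 behavior, along the lines of what Bhargav Rao said in a comment, is to wrap each filter call in list(), i.e. nums = list(filter(..., nums)). This forces every filter to "flush out" immediately, while i still has the value for that iteration, rather than waiting to apply them all at the end. (This is effectively what Python 2 does, which is why you don't see this behavior in Python 2.) Note that only passing list(nums) as the input to the next filter is not enough: that evaluates the previous filter after i has already moved on to the next value.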
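A minimal sketch of the eager approach, assuming list() wraps the whole filter call so that each predicate runs before i advances (eager_sieve is an illustrative name):

```python
def eager_sieve(limit=50):
    nums = range(2, limit)
    for i in range(2, 8):
        # list(...) consumes the filter right away, while i still holds
        # this iteration's value, mimicking Python 2's list-returning filter.
        nums = list(filter(lambda x: x == i or x % i, nums))
    return nums


eager_result = eager_sieve()
print(eager_result)
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```

The cost is that every pass builds a full intermediate list, which is the same trade-off Python 2 made.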

Another way is to use the default-argument trick described in the linked question about closures. Change your loop body to:

nums = filter(lambda x, i=i: x == i or x % i, nums)

Making i a default argument of the lambda "locks in" the value at each loop iteration, because default values are evaluated once, when the lambda is created. This means that the filters will still operate lazily, but each one will store the proper value to filter on, so it will still work even if you call it later (after changing i).
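Here is a sketch of the default-argument version that stays lazy all the way through (lazy_sieve is an illustrative name, and the chain is fed nums directly rather than a materialized list):

```python
def lazy_sieve(limit=50):
    nums = range(2, limit)
    for i in range(2, 8):
        # i=i evaluates i now and stores the result as the lambda's default,
        # so later changes to the loop variable do not affect this lambda.
        nums = filter(lambda x, i=i: x == i or x % i, nums)
    return nums  # still a lazy filter object, nothing computed yet


sieve = lazy_sieve()
first = next(sieve)   # pulls a single value through all six filters
rest = list(sieve)    # consumes the remainder
print(first, rest)
# 2 [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```

Each lambda carries its own copy of i, so the result is correct no matter how late the values are requested.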

BrenBarn