I'd like to create a dictionary of lambda functions for conveniently filtering pandas data frames. When I assign each dictionary item line by line, I get the behaviour I want. But when I use a for loop, every filter uses the last value of n. Does the lambda function reference the global variable n itself, rather than its value at the time the lambda was defined? Is my understanding of lambda functions off?
Note: this example is watered down. In my actual project, I use a DateTime index, and the dictionary will have integer keys that filter by year, e.g. df.index.year == 2020, and some string keys that filter by week/weekend, time of day, etc.
import pandas as pd
data = [[1,2],[3,4],[5,6]] # example df
df = pd.DataFrame(index=range(len(data)), data=data)
filts = {}
filts[1] = lambda df: df[df.index == 1] # making a filter dictionary
filts[2] = lambda df: df[df.index == 2] # of lambda funcs
print(filts[1](df)) # works as expected
print(filts[2](df))
filts = {}
for n in range(len(data)):
    filts[n] = lambda df: df[df.index == n] # also tried wrapping n in int
# n = 0 # changes behaviour
print(filts[0](df)) # prints the row for n = 2, not n = 0
print(filts[1](df)) # same problem as above
# further investigating lambdas
filts = {}
n = 0
filts[n] = lambda df: df[df.index == n] # making a filter dictionary
n = 1
filts[n] = lambda df: df[df.index == n] # of lambda funcs
print(filts[0](df)) # prints the row for n = 1, not n = 0
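
For comparison, here is a minimal sketch of one common workaround. It relies on the fact that a Python lambda looks up a free variable like n when it is called, not when it is defined, whereas default argument values are evaluated once, at definition time. The extra n=n parameter and the rows dictionary are additions for illustration only and are not part of the code above.

```python
import pandas as pd

data = [[1, 2], [3, 4], [5, 6]]  # same example df as above
df = pd.DataFrame(index=range(len(data)), data=data)

filts = {}
for n in range(len(data)):
    # n=n is evaluated now, so each lambda stores its own copy of n
    # as a default argument instead of looking up the loop variable later
    filts[n] = lambda df, n=n: df[df.index == n]

# collect the index labels each filter selects (illustrative check only)
rows = {key: filt(df).index.tolist() for key, filt in filts.items()}
print(rows)  # {0: [0], 1: [1], 2: [2]}
```

With the default argument in place, reassigning n after the loop should no longer affect the stored filters. functools.partial, or a small factory function that returns the lambda, would likely work just as well.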