
I have a bunch of functions I want to create with different parameters. One of the parameters, df, will be provided by the caller of these functions. I thought I had it figured out, but when I actually used it, every function created had the same parameters: the last combination in the comprehension sequence. Weird.

from itertools import product
feature_functions = {
    **{f'{col}{i}': lambda x: createFeature(df=x, i=i, col=col, name=f'{col}{i}')
        for col, i in product(['New', 'Lost', 'Change'], list(range(1, 31)))},

Like I said, I thought this was pretty slick, but when I used it like so:

feature_functions['New1'](df)

I got this result, meaning it was using the 'Change' and 30 for each lambda function:

# feature pd.Series:
0     NaN
      ...   
4593  1.002706
Name: Change30, Length: 4594, dtype: float64

I tried several things, but nothing changed. How am I using this dictionary comprehension wrong?

EDIT: By the way, one thing I did to verify it was right was to put the lambda x: ... in quotes; then I could print it all out, and it looked pretty good. So somehow the lambda is getting in the way? I did try wrapping it in (lambda x: ...), but that did nothing.

{'New1': "lambda x: createFeature(df=x, i=1, col=New, name='New1')",
 'New2': "lambda x: createFeature(df=x, i=2, col=New, name='New2')",
 'New3': "lambda x: createFeature(df=x, i=3, col=New, name='New3')",
 'New4': "lambda x: createFeature(df=x, i=4, col=New, name='New4')",
 ... 
}
martineau
MetaStack
    I wish I had more time to look deeper into this because it seems interesting. I would guess it's something to do with the closure not capturing the values as they change. Perhaps using functools.partial could be a way to get this to work? – Luke Nelson Dec 03 '21 at 21:03

2 Answers


You are really close. Instead of a lambda, use partial from the functools module:

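from functools import partial
from itertools import product

import pandas as pd

# dummy function
def createFeature(df, i, col, name):
    print(df)
    print(i, col, name)

# partial binds i, col and name when each entry is created
feature_functions = {
    **{f'{col}{i}': partial(createFeature, i=i, col=col, name=f'{col}{i}')
        for col, i in product(['New', 'Lost', 'Change'], list(range(1, 31)))}}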

Usage:

>>> feature_functions['New1'](pd.DataFrame())
Empty DataFrame
Columns: []
Index: []
1 New New1

>>> feature_functions['Lost23'](pd.DataFrame())
Empty DataFrame
Columns: []
Index: []
23 Lost Lost23

>>> feature_functions['Change12'](pd.DataFrame())
Empty DataFrame
Columns: []
Index: []
12 Change Change12
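
As a minimal sketch of why this works, the snippet below shows that a partial object stores its keyword arguments at the moment it is created, so later changes to the loop variables cannot affect it. The function `describe` is a hypothetical stand-in for createFeature, introduced only for illustration.

```python
from functools import partial

# Hypothetical stand-in for createFeature, used only for illustration.
def describe(df, i, col):
    return f"{col}{i}"

funcs = {}
for col in ["New", "Lost"]:
    for i in (1, 2):
        # The current values of i and col are copied into the partial object here.
        funcs[f"{col}{i}"] = partial(describe, i=i, col=col)

# The bound values can be inspected through the .keywords attribute.
print(funcs["New1"].keywords)  # {'i': 1, 'col': 'New'}
print(funcs["Lost2"](None))    # Lost2
```
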
Corralien

Ok, this is very interesting. If you create a function that returns your lambda, such as:

def createFeatureCreator(i, col):
    return lambda x: createFeature(df=x, i=i, col=col, name=f'{col}{i}')

and do your comprehension (I removed a lot of spurious things you had):

feature_functions = {f'{col}{i}': createFeatureCreator(i, col)
        for col, i in product(['New', 'Lost', 'Change'], range(1, 31))}

it works as you would expect.

The reason the lambda does not work directly is interesting: a lambda closes over variables, not over the values they hold at the time it is created. The dict comprehension is a single scope in which i and col are rebound on each iteration of the loop. All 90 lambdas (3 columns × 30 values) refer to those same two variables, so when any of them is eventually called, i and col hold the last values assigned in the loop, 30 and 'Change'.

The f-string is part of the lambda body too, so it is also evaluated only when you actually call the function. That is why name appears to be "wrong" as well.
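
A minimal sketch of that behaviour, assuming a hypothetical stand-in `make_label` in place of createFeature. It also shows the default-argument idiom, which is a different technique from the factory function above: default values are evaluated when the lambda is created, so each lambda keeps its own i and col.

```python
from itertools import product

# Hypothetical stand-in for createFeature, used only for illustration.
def make_label(df, i, col):
    return f"{col}{i}"

pairs = list(product(["New", "Lost", "Change"], range(1, 31)))

# Late binding: i and col are looked up when the lambda is called,
# by which time the comprehension has finished iterating.
late = {f"{col}{i}": (lambda x: make_label(x, i, col)) for col, i in pairs}

# Default arguments are evaluated when each lambda is created,
# so every lambda carries its own copy of i and col.
bound = {f"{col}{i}": (lambda x, i=i, col=col: make_label(x, i, col))
         for col, i in pairs}

print(len(late))            # 90
print(late["New1"](None))   # Change30
print(bound["New1"](None))  # New1
```

One caveat with the default-argument form: a caller could accidentally override i or col by passing extra arguments, which the factory function and partial approaches avoid.
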

nonDucor
    I prefer this answer because it's just native Python (no import of partial from functools): isolate the lambda. I think most people prefer the partial option, arguing it's more 'pythonic' since it was created for just this kind of thing. However, I like yours more because I used to use partial all the time; then, when working in other languages that had lambdas or inline functions but no such concept, I was not sure what to do at first, since I had used partial as a crutch. – MetaStack Dec 03 '21 at 21:16
    Oh, by the way, I did find another insane way to do this: `{f'{col}{i}': eval(f"lambda x: createFeature(df=x, i={i}, col='{col}', name='{col}{i}')") for ...}` – MetaStack Dec 03 '21 at 21:20
    @LegitStack That works as well, since the lambda uses nothing from the comprehension environment, but I would avoid eval as much as possible. It tends to be dangerous, besides performing badly (the source string has to be parsed on every iteration of the loop). – nonDucor Dec 03 '21 at 21:25