
I have read many articles, but I still do not clearly understand how to apply a lambda in Pandas.

For example, I have a df as follows, and I want to apply the min function to find the minimum value of each row.

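import pandas as pd

a = {'a': [1, 2, 3, -1], 'b': [3, 4, 0, -2], 'c': [0, 5, 100, 10]}
df = pd.DataFrame(a)
# axis=1: the lambda is called once per row, with the row passed in as x
b = df.apply(lambda x: min(x['a'], x['b'], x['c']), axis=1)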

The above works. If I instead use b=df.apply(min(df['a'],df['b'],df['c']),axis=1), it does not work. I would greatly appreciate an explanation. Thanks.

Ersoy
Dinh Quang Tuan
  • `apply` with `axis=1` takes each row as input. Inside the `apply` parentheses you only need to pass the function itself. What you are passing is a function call, which is not the function but its return value, in the same way that `min` is not equal to `min(a,b)`. To get the expected result, define a function, say `min_df`: `def min_df(df): return min(df['a'],df['b'],df['c'])`. You can then pass `min_df` to apply: `b=df.apply(min_df,axis=1)`. To summarize: you pass the function, you do not call it. – Yati Raj Jun 26 '20 at 10:22
  • I only want to apply the function to certain columns, not all columns. And I want to understand why b=df.apply(lambda x:min(x['a'],x['b'],x['c']),axis=1) works in this case. – Dinh Quang Tuan Jun 26 '20 at 10:28
  • Sure, that is straightforward. Each row is passed to the function, and the return statement refers only to certain columns. In your example, `def min_df(df): return min(df['a'],df['b'])` would take the minimum of only columns `a` and `b` and ignore `c`. You don't need to specify anything extra when invoking `apply`: `b=df.apply(min_df,axis=1)` will now return the `min` of columns `a` and `b` only. Basically, only the columns that you refer to in your function definition are considered when doing apply. – Yati Raj Jun 26 '20 at 10:32
  • The lambda works in this case because the expression `lambda x:min(x['a'],x['b'],x['c'])` evaluates to a function object. `min(df['a'],df['b'],df['c'])`, on the other hand, is a call: it is evaluated immediately, before `apply` runs, and yields whatever `min` returns for those arguments (with whole columns as arguments it raises a `ValueError`, because comparing two Series does not give a single True/False). `apply` expects its first argument to be a function, not a function followed by () and arguments, which gives you whatever the function returns. You have to distinguish between a function and what it returns, e.g. `sin` is a function, `sin(pi/2)` is a number. – Yati Raj Jun 26 '20 at 10:36
  • Now I understand, thanks so much for your help. – Dinh Quang Tuan Jun 27 '20 at 00:36
  • Does this answer your question? [Trying to understand .apply() in Pandas](https://stackoverflow.com/questions/57527661/trying-to-understand-apply-in-pandas) & [understanding lambda functions in pandas](https://stackoverflow.com/questions/49069624/understanding-lambda-functions-in-pandas) – Trenton McKinney Jun 29 '20 at 22:50
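
The distinction drawn in the comments above (passing a function versus passing the result of calling one) can be sketched roughly as follows. This is only an illustration using the question's own data; the names `min_abc`, `by_name`, `by_lambda`, `min_ab` and `eager_error` are made up for the example and do not come from the thread.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, -1], 'b': [3, 4, 0, -2], 'c': [0, 5, 100, 10]})

# A named function and a lambda are both function objects. With axis=1,
# apply calls the function once per row, passing that row as a Series.
def min_abc(row):  # hypothetical helper name, for illustration only
    return min(row['a'], row['b'], row['c'])

by_name = df.apply(min_abc, axis=1)
by_lambda = df.apply(lambda row: min(row['a'], row['b'], row['c']), axis=1)

# Only the columns named inside the function body are used; 'c' is ignored here.
min_ab = df.apply(lambda row: min(row['a'], row['b']), axis=1)

# Here min(...) is evaluated on whole columns before apply is ever called,
# so the failure comes from min itself, not from apply.
try:
    df.apply(min(df['a'], df['b'], df['c']), axis=1)
    eager_error = None
except ValueError as exc:
    eager_error = exc
```

Under these assumptions, `by_name` and `by_lambda` hold the same per-row minimums, while the last call never reaches `apply` at all because its argument fails to evaluate.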

0 Answers