
I have a DataFrame df loaded from Excel.

Is this possible in any way:

df["A"] = [foo(b, c) for (b, c) in (df["B"], df["C"])]

I need to pass values from different columns of the DataFrame into the function.

thx

  • Use two for clauses in your list comprehension? - [multiple variables in list comprehension?](https://stackoverflow.com/a/40351298/5821790) – Vepir May 16 '21 at 15:37
  • `df["A"] = [foo(b, c) for b, c in zip(df["B"], df["C"])]` – DarrylG May 16 '21 at 15:42

1 Answer


You can use df.apply() with axis=1, which calls the function once per row, to get the corresponding values of df["B"] and df["C"] for each row and pass them to foo, as follows:

df['A'] = df.apply(lambda x: foo(x['B'], x['C']), axis=1)

This is the more idiomatic, Pandas-style way of achieving the task. We commonly prefer Pandas functions over list comprehensions, since Pandas functions can handle NaN values better, while list comprehensions often raise errors when they encounter NaN values.
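As a minimal sketch, the row-wise apply above can be compared with the zip-based list comprehension suggested in the comments. The body of foo and the sample data are assumptions made for illustration only; the original foo and the Excel data are not shown in the question.

```python
import pandas as pd

# Hypothetical stand-in for the asker's foo; the real function is not shown.
def foo(b, c):
    return b + c

# Small made-up frame standing in for the data loaded from Excel.
df = pd.DataFrame({"B": [1, 2, 3], "C": [10, 20, 30]})

# Row-wise apply: each row arrives as a Series, so values are looked up by column label.
via_apply = df.apply(lambda row: foo(row["B"], row["C"]), axis=1)

# zip pairs the two columns element by element, one (b, c) tuple per row.
via_zip = [foo(b, c) for b, c in zip(df["B"], df["C"])]

# Either result can be assigned to the new column.
df["A"] = via_apply
```

Under these assumptions both approaches produce the same values, so the choice between them comes down to readability and speed for the data at hand.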

SeaBean
  • Thanks, this works great. P.S.: just fix the `lambda x:` part. – Константин Прудников May 16 '21 at 16:13
  • @КонстантинПрудников Oh, corrected the typo now, thanks! – SeaBean May 16 '21 at 16:45
  • @SeaBean--"We commonly prefer to use df.apply() than using list comprehension"--not sure if this is correct when you consider that apply is one of the slower methods in comparison to list comprehension & zip as demonstrated by [How to iterate over rows in a DataFrame in Pandas](https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas/55557758#55557758) – DarrylG May 16 '21 at 16:47
  • Oh, I was probably too sleepy when answering this question, so I made multiple typos. My argument was that Pandas functions can handle `NaN` values better. We often see people posting questions about `TypeError: 'float' object is not subscriptable`, which is actually caused by `NaN` values not being handled properly when working on a dataframe that contains them. – SeaBean May 16 '21 at 17:34
  • The OP's question essentially aims at the same goal as [How to apply a function to two columns of Pandas dataframe](https://stackoverflow.com/questions/13331698/how-to-apply-a-function-to-two-columns-of-pandas-dataframe/), but it was asked in terms of a list comprehension that tried to access the row values of two columns incorrectly. To keep this a quick guide, I didn't go into detail comparing the performance of the various approaches. – SeaBean May 16 '21 at 17:35