
I'm trying to create a new column that is calculated from two existing columns. When I need to do this with only one column I use .apply(), but I don't know how to do it with two parameters.

With one column I use the following code:

from pandas import read_csv, DataFrame

df = read_csv('results.csv')

def myFunc(x):
  x = x + 5
  return x

df['new'] = df['colA'].apply(myFunc)

df.head()

With two columns I thought it would look like the following, but it doesn't work.

from pandas import read_csv, DataFrame

df = read_csv('results.csv')

def myFunc(x,y):
  x = x + y
  return x

df['new'] = df[['colA','colB']].apply(myFunc)

df.head()

I see some people use lambda, but I don't understand it, and I think there has to be an easier way.

Thank you very much!

Lleims
  • Does this answer your question? [How to apply a function to two columns of Pandas dataframe](https://stackoverflow.com/questions/13331698/how-to-apply-a-function-to-two-columns-of-pandas-dataframe) – Mayank Porwal Nov 15 '20 at 17:07

3 Answers

2

Disclaimer: avoid apply if possible. With that in mind, you are looking for axis=1, which passes each row to the function, so the call needs to be rewritten like this:

df['new'] = df.apply(lambda x: myFunc(x['colA'], x['colB']), 
                     axis=1)

which is essentially equivalent to:

df['new'] = [myFunc(x,y) for x,y in zip(df['colA'], df['colB'])]
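
As a minimal sketch of the "avoid apply" advice, the example below compares the two forms above with a plain column-wise call. The small DataFrame is a hypothetical stand-in for results.csv, and the names via_apply, via_zip and via_columns are only for illustration. It assumes myFunc uses operations that work on whole columns, which holds for a simple addition:

```python
import pandas as pd

# Hypothetical stand-in for results.csv, so the example runs on its own.
df = pd.DataFrame({'colA': [1, 2, 3], 'colB': [10, 20, 30]})

def myFunc(x, y):
    return x + y

# Row-wise apply: the lambda receives one row (a Series) at a time.
via_apply = df.apply(lambda row: myFunc(row['colA'], row['colB']), axis=1)

# Plain Python loop over the two columns in parallel.
via_zip = [myFunc(x, y) for x, y in zip(df['colA'], df['colB'])]

# Column-wise call: myFunc receives both whole columns, no apply needed.
via_columns = myFunc(df['colA'], df['colB'])

df['new'] = via_columns
```

All three produce the same values here. The column-wise call is usually the one to prefer when the function is built from arithmetic that pandas already supports on whole columns.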
Quang Hoang
1

You can use axis=1 and access the columns inside the function, like below:

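def myFunc(row):
    # With axis=1, row is one row of the DataFrame (a Series),
    # so its values can be looked up by column name.
    return row['colA'] + row['colB']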

and you apply it as

df['new'] = df.apply(myFunc, axis=1)
A.B
1

You can learn how lambda works here (a lambda function is a single expression):

https://realpython.com/python-lambda/

In function definitions, the special syntax *args is used to accept a variable number of arguments. In a call such as myFunc(*col), the same * unpacks the values of col into separate positional arguments:

https://www.geeksforgeeks.org/args-kwargs-python/

from pandas import read_csv, DataFrame

df = read_csv('results.csv')

def myFunc(x,y):
  return x + y

df['new'] = df[['colA','colB']].apply(lambda col: myFunc(*col), axis=1)

df.head()
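
To see why axis=1 matters here, the sketch below runs the call from the question and then the unpacking version on a small hypothetical DataFrame standing in for results.csv. The lambda's argument is named row in this sketch because, with axis=1, it holds one row at a time. The names failed and unpacked are only for illustration:

```python
import pandas as pd

# Hypothetical stand-in for results.csv, so the example runs on its own.
df = pd.DataFrame({'colA': [1, 2], 'colB': [10, 20]})

def myFunc(x, y):
    return x + y

# Default axis=0: each column is passed to myFunc as a single argument,
# so the second parameter y is never supplied.
try:
    df[['colA', 'colB']].apply(myFunc)
    failed = False
except TypeError:
    failed = True

# axis=1: each row is passed as a Series, and *row unpacks its values
# in column order, so colA becomes x and colB becomes y.
unpacked = df[['colA', 'colB']].apply(lambda row: myFunc(*row), axis=1)
```

Note that the unpacking depends on the column order in df[['colA','colB']] matching the parameter order of myFunc.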
Yuvaraja