
I have a pandas DataFrame 'df' with two columns, 'A' and 'B', and a function that takes two arguments:

def myfunction(B, A):
    # do something here to get the result
    return result

I would like to apply it row by row to df using the 'apply' function:

df['C'] = df['B'].apply(myfunction, args=(df['A'],))

but I get the error

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What's happening here? It seems that apply receives df['A'] as the whole Series, not just the entry from that Series for the current row, as I need.

Runner Bean

2 Answers


I think you need:

import pandas as pd
df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6]})

print (df)
   A  B
0  1  4
1  2  5
2  3  6

def myfunction(B, A):
    # some stuff
    result = B + A
    # do something here to get the result
    return result

df['C'] = df.apply(lambda x: myfunction(x.B, x.A), axis=1)
print (df)
   A  B  C
0  1  4  5
1  2  5  7
2  3  6  9

Or:

def myfunction(x):
    # x is a single row of df, passed in as a Series
    result = x.B + x.A
    # do something here to get the result
    return result

df['C'] = df.apply(myfunction, axis=1)
print (df)
   A  B  C
0  1  4  5
1  2  5  7
2  3  6  9
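
To see why the original `args=(df['A'],)` call fails, here is a minimal sketch. It assumes a hypothetical `myfunction` body that compares its two arguments, since the question does not show the real one. Anything placed in `args` is forwarded unchanged to every call, so `A` is the whole Series each time.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6]})

# Hypothetical body: the question does not show the real one.
def myfunction(B, A):
    # `B > A` is a plain bool only when both arguments are scalars
    if B > A:
        return B - A
    return A - B

# Series.apply passes each element of df['B'] as B, but everything in
# `args` is forwarded as-is, so A is the entire df['A'] Series
try:
    df['B'].apply(myfunction, args=(df['A'],))
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Row-wise version: each call receives two scalars
df['C'] = df.apply(lambda row: myfunction(row['B'], row['A']), axis=1)
```

In the failing call, `B > A` compares a scalar with a Series and yields a boolean Series, which `if` cannot reduce to a single truth value.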
jezrael
  • Why can't I use the 'args' argument of the apply function? – Runner Bean Oct 02 '16 at 06:30
  • @RunnerBean you can pass arguments just fine. `apply` accepts `kwargs`, so you can pass arguments like this: `df['B'].apply(myfunction, A=df['A'])`. But in this case it's a bad idea, as you would be passing the entire Series to the function on every call instead of one value per row. – piRSquared Oct 02 '16 at 07:12
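
As a small sketch of the point made in the comments: extra positional `args` or keyword arguments are forwarded unchanged to every call, so they suit constants rather than per-row values. The `add_offset` function and its `offset` parameter are hypothetical names used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6]})

# Hypothetical helper: `offset` has the same value on every call
def add_offset(b, offset):
    return b + offset

# Extra positional argument passed through `args`
with_args = df['B'].apply(add_offset, args=(10,))

# The same extra argument passed by keyword
with_kwargs = df['B'].apply(add_offset, offset=10)
```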

I would add one more method, which can be useful when you need to pass all columns to the function.

According to the pandas.DataFrame.apply docs, the objects passed to the function are Series objects. So we convert each row's Series to a list and unpack it with the * operator when calling our function. Note that the values are unpacked in the DataFrame's column order, not matched to the parameter names.

import pandas as pd
df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6]})

print (df)
   A  B
0  1  4
1  2  5
2  3  6

def myfunction(B, A):
    # some stuff
    result = B + A
    # do something here to get the result
    return result

df['C'] = df.apply(lambda x: myfunction(*x.to_list()), axis=1)
print (df)
   A  B  C
0  1  4  5
1  2  5  7
2  3  6  9
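
Because positional unpacking follows column order (here A, then B), the result only matches the named parameters when the function is symmetric, as B + A is. A sketch of two ways to make the order explicit, using a hypothetical non-commutative `myfunction` so the difference is visible:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6]})

# Hypothetical non-commutative function so argument order matters
def myfunction(B, A):
    return B - A

# Unpacks in column order (A, B), so parameter B receives column A's value
by_position = df.apply(lambda x: myfunction(*x.to_list()), axis=1)

# Select the columns in the order the function expects before unpacking
explicit = df[['B', 'A']].apply(lambda x: myfunction(*x.to_list()), axis=1)

# Or unpack the row as keyword arguments keyed by column name
by_name = df.apply(lambda x: myfunction(**x.to_dict()), axis=1)
```

The keyword form relies on the column names matching the function's parameter names exactly.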
Egor B Eremeev