I have a custom function that I want to apply to each row of a DataFrame, but how do I pass the extra parameter that it needs? I have given an example below.

# Using df.apply
df = pd.DataFrame({"A": [1,2,3]})
sum_A = np.sum(df.A)

def calc_weight(row, total):
    row["weights"] = row["A"]/total

df.apply(calc_weight(row, sum_A), axis = 1)
# Gives NameError: name 'row' is not defined

df.apply(calc_weight, axis = 1)
# TypeError: calc_weight() missing 1 required positional argument: 'total'

The output that I want is something like:

   A  weights
0  1    0.166
1  2    0.333
2  3    0.500

I've looked online but can't seem to find anything. Do I have to fall back to a for loop to do something like this?

YellowPillow

1 Answer

Pass the extra argument through the args parameter of apply, as below:

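import pandas as pd
import numpy as np

df = pd.DataFrame({"A": [1,2,3]})
sum_A = np.sum(df.A)

def f(a, total):
    # a is a single value from column A; total is the extra argument
    return float(a)/total

# args must be a tuple, hence the trailing comma
df['weight'] = df['A'].apply(f, args=(sum_A,))
print(df)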

Output:

   A    weight
0  1  0.166667
1  2  0.333333
2  3  0.500000
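
If you would rather keep the row-wise function from the question, here is a rough sketch of how that might look. It assumes calc_weight is changed to return the value instead of assigning into the row, since apply collects return values. The weights_vectorized column is just a name I made up to show the plain column division for comparison.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})
total = df["A"].sum()

def calc_weight(row, total):
    # Return the weight; apply gathers the return values into a Series.
    return row["A"] / total

# With axis=1 each row is passed as a Series; extra positional
# arguments are forwarded through args (a tuple).
df["weights"] = df.apply(calc_weight, axis=1, args=(total,))

# Keyword arguments are forwarded to the function as well.
weights_kw = df.apply(calc_weight, axis=1, total=total)

# For a simple division, no apply is needed at all (hypothetical column name).
df["weights_vectorized"] = df["A"] / total

print(df)
```

For something this simple the vectorized division is probably the better choice, and apply with args is worth keeping for functions that cannot be written as column arithmetic.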

linpingta