
I am trying to apply a function to the rows of a DataFrame using apply's args argument. I have seen several similar questions, but following their solutions does not work. I have created a sample example.

Here I divide each row of my dataframe by its row sum:

import numpy as np
import pandas as pd

pij = pd.DataFrame(np.random.randn(500, 2))
pij.divide(pij.sum(1), axis=0).head()
          0         1
0  1.077353 -0.690463
1  0.608302  0.583209
2 -0.724272 -1.665318
3 -0.735404 -0.606744
4 -0.033409 -0.162695

I know how to get the same result with apply and a lambda:

def lambda_divide(row):
    return row / row.sum(0)
pij.apply(lambda row: lambda_divide(row), axis=1).head()
          0         1
0  1.077353 -0.690463
1  0.608302  0.583209
2 -0.724272 -1.665318
3 -0.735404 -0.606744
4 -0.033409 -0.162695

However, when I try to pass the denominator through apply's args argument, it does not work:

pij.apply(np.divide,args=(pij.sum(1)))
Bobe Kryant

1 Answer


The full traceback suggests this is due to pandas special-casing ufuncs:

   4045
   4046         if isinstance(f, np.ufunc):
-> 4047             results = f(self.values)
   4048             return self._constructor(data=results, index=self.index,
   4049                                      columns=self.columns, copy=False)

ValueError: invalid number of arguments

This looks like a bug!
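One possible workaround (my own sketch, not part of the original answer) is to wrap the ufunc in a plain Python function, which sidesteps pandas' ufunc fast path so args is actually passed through row by row. The helper name divide_by and the seed are hypothetical:

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # hypothetical seed, just for a reproducible pij
pij = pd.DataFrame(np.random.randn(500, 2))

def divide_by(row, denom):
    # row.name is the row's index label; denom holds the precomputed row sums
    return np.divide(row, denom[row.name])

# Because divide_by is not itself a np.ufunc, pandas does not take the
# special-cased path, and args is forwarded to each row as intended.
result = pij.apply(divide_by, axis=1, args=(pij.sum(1),))
```

This should produce the same frame as pij.div(pij.sum(1), axis=0).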


In this specific case you can use div:

In [11]: df.div(df.sum(1), axis=0)
Out[11]:
          0         1
0  2.784649 -1.784649
1  0.510530  0.489470
2  0.303095  0.696905
3  0.547931  0.452069
4  0.170364  0.829636
Andy Hayden