python def pandas apply

Question

Code:

import pandas as pd
football = pd.read_csv("https://www.football-data.co.uk/mmz4281/2021/E0.csv")
Avgfootball = football[['AvgH', 'AvgD', 'AvgA']]

def implieddds(AvgH, AvgD, AvgA):
  impliedH = (1/AvgH)/((1/AvgH) + (1/AvgD) + (1/AvgA))
  impliedD = (1/AvgD)/((1/AvgH) + (1/AvgD) + (1/AvgA))
  impliedA = (1/AvgA)/((1/AvgH) + (1/AvgD) + (1/AvgA))
  Mar = ((1/AvgH) + (1/AvgD) + (1/AvgA))

  return impliedH,impliedD,impliedA,Mar

new = Avgfootball.apply(implieddds)

Error:

TypeError                                 Traceback (most recent call last)
<ipython-input-2-e74b2ed53649> in <module>()
     11   return impliedH,impliedD,impliedA,Mar
     12
---> 13 new = Avgfootball.apply(implieddds)

3 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/apply.py in apply_series_generator(self)
    303                 for i, v in enumerate(series_gen):
    304                     # ignore SettingWithCopy here in case the user mutates
--> 305                     results[i] = self.f(v)
    306                     if isinstance(results[i], ABCSeries):
    307                         # If we have a view on v, we need to make a copy because

TypeError: implieddds() missing 2 required positional arguments: 'AvgD' and 'AvgA'

score 0 · Answer 1 · answered Feb 12 '21 at 13:14

Check the documentation of the DataFrame.apply method.

The solution is to set the axis parameter to 1, so that each row is passed to the function as a pandas.Series.

Be careful: your function returns a tuple, so new will be a pandas.Series whose values are tuples.

def implieddds(x):
    AvgH = x['AvgH']
    AvgD = x['AvgD']
    AvgA = x['AvgA']

    impliedH = (1/AvgH)/((1/AvgH) + (1/AvgD) + (1/AvgA))
    impliedD = (1/AvgD)/((1/AvgH) + (1/AvgD) + (1/AvgA))
    impliedA = (1/AvgA)/((1/AvgH) + (1/AvgD) + (1/AvgA))
    Mar = ((1/AvgH) + (1/AvgD) + (1/AvgA))

    return impliedH,impliedD,impliedA,Mar

new = Avgfootball.apply(implieddds, axis=1)
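
If separate columns are more convenient than a Series of tuples, one option is the result_type="expand" argument of DataFrame.apply. The following is a minimal sketch, not part of the original answer: the odds frame holds made-up values standing in for the CSV data, the function body is a condensed version of the one above, and the output column names are only a suggestion.

```python
import pandas as pd

# Hypothetical odds; the real values come from the CSV in the question.
odds = pd.DataFrame({"AvgH": [2.0, 1.5], "AvgD": [4.0, 4.0], "AvgA": [4.0, 6.0]})

def implieddds(x):
    # x is one row of the DataFrame, passed as a pandas.Series
    margin = 1 / x["AvgH"] + 1 / x["AvgD"] + 1 / x["AvgA"]
    return (1 / x["AvgH"]) / margin, (1 / x["AvgD"]) / margin, (1 / x["AvgA"]) / margin, margin

# Default behaviour: one tuple per row, collected in a Series
as_tuples = odds.apply(implieddds, axis=1)

# result_type="expand" spreads each returned tuple across columns
as_columns = odds.apply(implieddds, axis=1, result_type="expand")
as_columns.columns = ["impliedH", "impliedD", "impliedA", "Mar"]  # assumed names
```

The expanded columns are numbered 0 to 3 until they are renamed, which is why the last line assigns labels explicitly.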

SeaBean · Answer 2 · 2021-02-12T17:25:10.413

Your function implieddds needs the column values of one row at a time, so you can use either apply(..., axis=1) or list(map(...)). Without axis=1, apply() works column-wise: it passes each whole column to the function as a single Series indexed by the row labels, so the function never receives the 'AvgH', 'AvgD' and 'AvgA' values of a row together. That is why the call fails with two missing positional arguments.
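
To see what apply() actually hands to the function in each mode, a small probe can help. This is an illustrative sketch only: the odds frame uses made-up values and describe is a hypothetical helper that reports the name and index of whatever Series it receives.

```python
import pandas as pd

# Hypothetical odds standing in for the CSV data
odds = pd.DataFrame({"AvgH": [2.0, 1.5], "AvgD": [4.0, 4.0], "AvgA": [4.0, 6.0]})

def describe(series):
    # Report the label of the Series received and the labels it is indexed by
    return f"{series.name}: {list(series.index)}"

# Default axis=0: the function is called once per column
by_column = odds.apply(describe)

# axis=1: the function is called once per row
by_row = odds.apply(describe, axis=1)

print(by_column["AvgH"])  # AvgH: [0, 1]
print(by_row[0])          # 0: ['AvgH', 'AvgD', 'AvgA']
```

In the column-wise case each call sees one column indexed by row number; in the row-wise case each call sees one row indexed by column name, which is what implieddds needs.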

Solution 1: apply(..., axis=1) [Relatively slow for large amounts of data]

new = Avgfootball.apply(lambda x: implieddds(x['AvgH'], x['AvgD'], x['AvgA']), axis=1)

With the lambda, you call your function just like an ordinary Python function. This way you can keep the implieddds definition unchanged, so it works both with pandas and as a plain Python function.

Solution 2: list(map(...)) [Faster execution, good for both large and small amounts of data]

new = list(map(implieddds, Avgfootball['AvgH'], Avgfootball['AvgD'], Avgfootball['AvgA']))

list(map(...)) is most often around 3 times faster than the apply(..., axis=1) counterpart. [See the link below for an execution time comparison.]

I would recommend using list(map(...)) instead of apply(..., axis=1) for row-wise operations.
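
Note that list(map(...)) returns a plain list of tuples rather than a pandas object. A minimal sketch of turning that list back into a DataFrame follows; the odds frame holds made-up values in place of the CSV data, the function is a condensed version of the one in the question, and the output column names are an assumption.

```python
import pandas as pd

# Hypothetical odds standing in for the CSV data
odds = pd.DataFrame({"AvgH": [2.0, 1.5], "AvgD": [4.0, 4.0], "AvgA": [4.0, 6.0]})

def implieddds(AvgH, AvgD, AvgA):
    # Same calculation as in the question, with the margin computed once
    margin = 1 / AvgH + 1 / AvgD + 1 / AvgA
    return (1 / AvgH) / margin, (1 / AvgD) / margin, (1 / AvgA) / margin, margin

# map() walks the three columns in parallel, one row's values per call
rows = list(map(implieddds, odds["AvgH"], odds["AvgD"], odds["AvgA"]))

# Rebuild a DataFrame, reusing the original index so rows stay aligned
new = pd.DataFrame(rows, columns=["impliedH", "impliedD", "impliedA", "Mar"], index=odds.index)
```

Passing index=odds.index matters when the source frame has been filtered or re-indexed, because the list itself carries no row labels.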

For more explanation of column-wise versus row-wise DataFrame.apply() and a real-case execution time comparison, you can refer to this post.
