How to enter parameters into a function when using pandas apply

Question

First time posting here - have decided to try and learn how to use python whilst on Covid-19 forced holidays.

I'm trying to summarise some data from a pretty simple database and have been using the value_counts function.

Rather than running it on every column individually, I'd like to loop it over each one and return a summary table. I can do this using df.apply(pd.value_counts), but I can't work out how to pass parameters to value_counts, as I want dropna=False.

Basic example of data I have:

# Import libraries 
import pandas as pd 
import numpy as np

# create list of winners and runnerup
data = [['john', 'barry'], ['john','barry'], [np.nan,'barry'], ['barry','john'],['john',np.nan],['linda','frank']] 

# Create the pandas DataFrame 
df = pd.DataFrame(data, columns = ['winner', 'runnerup']) 

# print dataframe. 
df

How I was doing the value counts for each column:

#Who won the most?
df['winner'].value_counts(dropna=False)

Output:
john     3
linda    1
barry    1
NaN      1
Name: winner, dtype: int64

How can I pass dropna=False when using the apply function? I like the table it outputs below, but I want NaN to appear in the list.

#value counts table
df.apply(pd.value_counts)
      winner    runnerup
barry   1.0       3.0
frank   NaN       1.0
john    3.0       1.0
linda   1.0       NaN

#value that is missing from list
#NaN    1.0       1.0

Any help would be appreciated!!

Does this answer your question? [python pandas: apply a function with arguments to a series](https://stackoverflow.com/questions/12182744/python-pandas-apply-a-function-with-arguments-to-a-series) — wwii, Apr 03 '20 at 23:04

score 0 · Accepted Answer · edited Apr 03 '20 at 23:50

You can use df.apply, like this:

df.apply(pd.value_counts, dropna=False)
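A fuller sketch of the same idea, using the question's data: any keyword argument that apply does not recognise itself is forwarded to the function it calls for each column. This version uses pd.Series.value_counts rather than the top-level pd.value_counts, on the assumption that you may be on a newer pandas release where the top-level function is deprecated; the variable name table is just for illustration.

```python
import numpy as np
import pandas as pd

# Data from the question: winners and runners-up, with some missing values
data = [['john', 'barry'], ['john', 'barry'], [np.nan, 'barry'],
        ['barry', 'john'], ['john', np.nan], ['linda', 'frank']]
df = pd.DataFrame(data, columns=['winner', 'runnerup'])

# apply calls the function once per column and forwards dropna=False to it,
# so each call is effectively column.value_counts(dropna=False)
table = df.apply(pd.Series.value_counts, dropna=False)
print(table)
```

The resulting table should now include a NaN row alongside the names.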

Joe Mayo

answered Apr 03 '20 at 23:06

Thanks! That worked for me. I was trying to put the 'dropna' inside the value_counts call, e.g. df.apply(pd.value_counts(dropna=False)). – MichaelH Apr 07 '20 at 10:40
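The attempt described in the comment above fails because writing pd.value_counts(dropna=False) calls the function immediately instead of handing it to apply. As a minimal sketch (assuming the question's data, with illustrative variable names), two other ways to bind the keyword while still passing a callable are a lambda and functools.partial:

```python
from functools import partial

import numpy as np
import pandas as pd

data = [['john', 'barry'], ['john', 'barry'], [np.nan, 'barry'],
        ['barry', 'john'], ['john', np.nan], ['linda', 'frank']]
df = pd.DataFrame(data, columns=['winner', 'runnerup'])

# Option 1: wrap the call in a lambda; apply passes each column in as col
via_lambda = df.apply(lambda col: col.value_counts(dropna=False))

# Option 2: pre-bind the keyword with partial; the result is still a callable
count_with_nan = partial(pd.Series.value_counts, dropna=False)
via_partial = df.apply(count_with_nan)

print(via_lambda)
```

Both produce the same table as passing dropna=False directly to apply; which one to use is mostly a matter of taste.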

score 0 · Answer 2 · answered Apr 03 '20 at 23:35

With the pandas apply function, if the function you pass takes a single parameter, you simply do:

.apply(func_name)

The parameter is each value when apply is called on a Series, or each column (itself a Series) when it is called on a DataFrame. This works exactly the same way for pandas built-in functions and for user-defined functions (UDFs).

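For a UDF with more than one parameter, pass the extra positional arguments through args (keyword arguments such as dropna=False can be given to apply directly):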

.apply(func_name, args=(arg1, arg2, arg3, ...))

See: this link
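As a rough illustration of the difference between args and keyword arguments, here is a small sketch. The function add_suffix and its parameters suffix and sep are hypothetical, made up for this example only.

```python
import pandas as pd

# Hypothetical UDF: the first parameter receives each value from the Series
def add_suffix(value, suffix, sep="_"):
    return f"{value}{sep}{suffix}"

names = pd.Series(["john", "barry"])

# Extra positional arguments go in the args tuple (note the trailing comma);
# keyword arguments are passed to apply directly and forwarded to the UDF
result = names.apply(add_suffix, args=("w",), sep="-")
print(result.tolist())
```

Here "w" fills the suffix parameter by position, while sep is matched by name.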

zafrin

Thanks for the explanation, appreciate your quick response. I didn't quite understand the documentation the first time I reviewed it. I was thinking I had to put the 'dropna=False' in as an 'arg', but now I understand (I think) that dropna is a keyword parameter and the args are related to positions. Will have to do some more reading! – MichaelH Apr 07 '20 at 10:15
Glad it helped. If you have not done much object-oriented programming (OOP), I would encourage you to look at at least some of the basics. This will make you more comfortable with passing self as an argument. – zafrin Apr 07 '20 at 14:54
