I need to apply a function to each column of a pandas DataFrame, where the function uses the count of NaN values in that column. Say I have this DataFrame:
import pandas as pd
df = pd.DataFrame({'Baseball': [3, 1, 2], 'Soccer': [1, 6, 7], 'Rugby': [8, 7, None]})
   Baseball  Soccer  Rugby
0         3       1    8.0
1         1       6    7.0
2         2       7    NaN
I can get the count of NaN in each column with:
df.isnull().sum()
Baseball    0
Soccer      0
Rugby       1
But I can't figure out how to use that result in a function applied to each column. Just as an example, say I want to add the number of NaN values in a column to each element of that column, to get:
   Baseball  Soccer  Rugby
0         3       1    9.0
1         1       6    8.0
2         2       7    NaN
(My actual function is more complex.) I tried:
def f(x, y):
    return x + y
df2 = df.apply(lambda x: f(x, df.isnull().sum()))
and I get the thoroughly mangled:
          Baseball  Soccer  Rugby
0              NaN     NaN    NaN
1              NaN     NaN    NaN
2              NaN     NaN    NaN
Baseball       NaN     NaN    NaN
Rugby          NaN     NaN    NaN
Soccer         NaN     NaN    NaN
Any idea how to use the count of NaN in each column in a function applied to each column?
Thanks in advance!
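
A likely cause of the mangled result: inside df.apply, x is a single column (a Series indexed 0, 1, 2), while df.isnull().sum() is a Series indexed by column name. Adding the two aligns them on the union of their labels, and since none of the labels match, every entry comes out as NaN. One possible fix, sketched below, is to count the NaN values of the column inside the lambda, so that f receives a scalar. The names col, df2 and df3 are only illustrative, and f stands in for the more complex real function.

```python
import pandas as pd

df = pd.DataFrame({'Baseball': [3, 1, 2], 'Soccer': [1, 6, 7], 'Rugby': [8, 7, None]})

def f(x, y):
    return x + y

# Each column arrives as a Series; count its own NaN values and
# pass that scalar to f, so no label alignment is involved.
df2 = df.apply(lambda col: f(col, col.isnull().sum()))

# For plain addition only: the per-column counts can also be aligned
# with the column labels and broadcast down the rows.
df3 = df.add(df.isnull().sum(), axis='columns')

print(df2)
```

The first form should carry over to a more complex function, as long as that function accepts a column and a scalar count. The second form only applies when the operation is one pandas can broadcast directly, such as add, sub, or mul.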