How to use apply for multiple Pandas dataset columns?

Question

I am trying hard to fill the NaN values in some columns, chosen from a predefined list, with the column mean. The code always takes the else path and never makes the intended modifications...

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': [0.0, np.nan, np.nan, 100],
                    'C': [20, 0.0002, 10000, np.nan],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])

num_cols = ['B', 'C']
fill_mean = lambda col: col.fillna(col.mean()) if col.name in num_cols else col
df1.apply(fill_mean, axis=1)

possible duplicate of https://stackoverflow.com/questions/13331698/how-to-apply-a-function-to-two-columns-of-pandas-dataframe ? — Regressor, Oct 02 '20 at 16:14
df1.fillna(df1.mean()) would fill NA in the columns based on mean of that column so you don't really require LC here — Vaishali, Oct 02 '20 at 16:17

score 1 · Answer 1 · answered Oct 02 '20 at 16:19

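You can do this much more simply using

df1.fillna(df1.mean(numeric_only=True))

This fills the NaN values in the numeric columns with each column's mean (numeric_only=True skips the string columns A and D, which df1.mean() would fail on in pandas 2.0 and later):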

    A      B             C   D
0  A0    0.0     20.000000  D0
1  A1   50.0      0.000200  D1
2  A2   50.0  10000.000000  D2
3  A3  100.0   3340.000067  D3
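
As for why the original lambda always hit the else branch, here is a minimal sketch, assuming the df1 and num_cols from the question. With axis=1, apply passes each row to the function, so col.name is the row label (0, 1, 2, 3) and never 'B' or 'C'. With axis=0 (the default) each column is passed instead. The variable names names_by_row, names_by_col, filled_apply and filled_direct are only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': [0.0, np.nan, np.nan, 100],
                    'C': [20, 0.0002, 10000, np.nan],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])

num_cols = ['B', 'C']

# What does the function actually receive? Collect the .name of each Series.
names_by_row = df1.apply(lambda s: s.name, axis=1).tolist()  # row labels
names_by_col = df1.apply(lambda s: s.name, axis=0).tolist()  # column labels

# Same lambda as in the question, but applied column-wise (axis=0).
fill_mean = lambda col: col.fillna(col.mean()) if col.name in num_cols else col
filled_apply = df1.apply(fill_mean, axis=0)

# Alternative without apply: fill only the selected columns.
filled_direct = df1.copy()
filled_direct[num_cols] = df1[num_cols].fillna(df1[num_cols].mean())

print(names_by_row)
print(names_by_col)
print(filled_apply)
```

The second variant may be preferable when only some numeric columns should be filled, since it never touches columns outside num_cols.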

score 0 · Answer 2 · answered Oct 02 '20 at 16:27

I am not sure whether your desired output is just the mean of each column (a single row). If that is the case, maybe the solution below could help.

df = df1.select_dtypes(include='float').mean().to_frame().T
df = pd.concat([df, df.reindex(columns = df1.select_dtypes(exclude='float').columns)], axis=1, sort=False)
print(df)
     B            C   A   D
0  50.0  3340.000067 NaN NaN
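
A more compact way to get a similar one-row frame, sketched here under the assumption that df1 is the frame from the question: compute the numeric means with numeric_only=True and reindex against the original columns. Note that this keeps the original column order (A, B, C, D) rather than moving the numeric columns to the front, and the name summary is only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': [0.0, np.nan, np.nan, 100],
                    'C': [20, 0.0002, 10000, np.nan],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])

# Means of the numeric columns only (a Series indexed by 'B' and 'C').
means = df1.mean(numeric_only=True)

# Turn the Series into a single row, then add the non-numeric columns back as NaN.
summary = means.to_frame().T.reindex(columns=df1.columns)
print(summary)
```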
