How to fill missing values in a few columns at the same time

Question

I need to fill missing values in a few columns. I wrote this to do it one column at a time:

df2['A'].fillna(df1['A'].mean(), inplace=True)
df2['B'].fillna(df1['B'].mean(), inplace=True)
df2['C'].fillna(df1['C'].mean(), inplace=True)

Is there a way to fill them all in one line of code?

Does [this](https://stackoverflow.com/questions/18689823/pandas-dataframe-replace-nan-values-with-average-of-columns) answer your question? — Sreyas, Feb 18 '23 at 23:35

score 1 · Accepted Answer · answered Feb 18 '23 at 23:35

You can use a single instruction:

cols = ['A', 'B', 'C']
df[cols] = df[cols].fillna(df[cols].mean())

Or, to apply it to all numeric columns, use select_dtypes:

cols = df.select_dtypes('number').columns
df[cols] = df[cols].fillna(df[cols].mean())

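Note: I strongly discourage using the inplace parameter. It was not removed in pandas 2, but calling it on a selected column, as in df2['A'].fillna(..., inplace=True), is chained assignment: recent pandas versions warn about it, and with Copy-on-Write enabled it no longer updates the original DataFrame.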
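For the setup in the question, where the means come from df1 and the gaps are in df2, the same pattern could look like the sketch below. The sample values are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: df1 supplies the means, df2 has the gaps.
df1 = pd.DataFrame({"A": [1.0, 3.0], "B": [10.0, 20.0], "C": [5.0, 7.0]})
df2 = pd.DataFrame({"A": [np.nan, 4.0], "B": [1.0, np.nan], "C": [np.nan, np.nan]})

cols = ["A", "B", "C"]
# fillna with a Series matches the Series index labels to df2's column names,
# so each column is filled with the mean of the same-named column in df1.
df2[cols] = df2[cols].fillna(df1[cols].mean())
print(df2)
```

This assigns the result back instead of using inplace, so it does not rely on chained assignment.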

score 0 · Answer 2 · answered Feb 18 '23 at 23:47

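# Build a {column: mean from df1} dict and fill df2 in one statement (assumes every column of df2 is numeric and also exists in df1)
df2 = df2.fillna({c: df1[c].mean() for c in df2.columns})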

Laurent B.

Pedro Rocha · Answer 3 · 2023-02-18T23:56:54.037

There are a few options for working with NaNs in a DataFrame. I'll explain some of them.

Given this example df:

	A	B	C
0	1	5	10
1	2	nan	11
2	nan	nan	12
3	4	8	nan
4	nan	9	14
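To run the examples below, the sample df can be rebuilt like this. It is a minimal sketch that assumes the nan cells are NaN floats and all three columns are numeric.

```python
import numpy as np
import pandas as pd

# Recreate the example df shown above; np.nan marks the missing cells.
df = pd.DataFrame({
    "A": [1, 2, np.nan, 4, np.nan],
    "B": [5, np.nan, np.nan, 8, 9],
    "C": [10, 11, 12, np.nan, 14],
})

# Count the missing values per column as a quick sanity check.
print(df.isna().sum())
```

Each example starts from this original df, not from the result of the previous example.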

Example 1: fill all columns with mean

df = df.fillna(df.mean())

Result:

	A	B	C
0	1	5	10
1	2	7.33333	11
2	2.33333	7.33333	12
3	4	8	11.75
4	2.33333	9	14

Example 2: fill some columns with median

df[["A","B"]] = df[["A","B"]].fillna(df.median())

Result:

	A	B	C
0	1	5	10
1	2	8	11
2	2	8	12
3	4	8	nan
4	2	9	14

Example 3: fill all columns using ffill()

Explanation: Each missing value is replaced with the most recent valid value above it in the same column. A missing value in the first row has nothing above it and stays NaN. Passing method='ffill' to fillna is deprecated in recent pandas versions, so call ffill() directly.

df = df.ffill()

Result:

	A	B	C
0	1	5	10
1	2	5	11
2	2	5	12
3	4	8	12
4	4	9	14

Example 4: fill all columns using bfill()

Explanation: Each missing value is replaced with the next valid value below it in the same column, so values propagate from the bottom to the top. A missing value in the last row has nothing below it and stays NaN, as in column A here. Passing method='bfill' to fillna is deprecated in recent pandas versions, so call bfill() directly.

df = df.bfill()

Result:

	A	B	C
0	1	5	10
1	2	8	11
2	4	8	12
3	4	8	14
4	nan	9	14
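A backward fill leaves trailing NaNs and a forward fill leaves leading NaNs. One possible way to cover both, sketched here on the same example df, is to chain the two calls. Whether that is appropriate depends on the data; this is only an illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, np.nan, 4, np.nan],
    "B": [5, np.nan, np.nan, 8, 9],
    "C": [10, 11, 12, np.nan, 14],
})

# bfill first, then ffill to fill whatever bfill could not reach
# (here: the trailing NaN in the last row of column A).
filled = df.bfill().ffill()
print(filled)
```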

If you want to DROP (not fill) the missing values, you can do this:

Option 1: remove rows with one or more missing values

df = df.dropna(how="any")

Result:

	A	B	C
0	1	5	10

Option 2: remove only the rows in which every value is missing

df = df.dropna(how="all")
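In the example df no row is entirely missing, so how="all" leaves it unchanged. To check only a few columns when dropping, dropna also accepts a subset argument. A small sketch on the same example df (the variable names are just for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, np.nan, 4, np.nan],
    "B": [5, np.nan, np.nan, 8, 9],
    "C": [10, 11, 12, np.nan, 14],
})

# No row has all values missing, so nothing is removed here.
all_dropped = df.dropna(how="all")

# Drop rows that have a missing value in column A or B; column C is ignored.
subset_dropped = df.dropna(subset=["A", "B"])
print(subset_dropped)
```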
