Replace NaN with the average of the last 5 values - Pandas

Question

I want to know how I can replace the NaN values in my dataset with the average of the last 5 values.

Column A	Column B
1	2
2	5
3	5
4	2
5	2
NaN	2
NaN	2
1	2
1	2
1	2
1	NaN
1	2
1	2

For example, in this case the first NaN will be the average of (1, 2, 3, 4, 5), and the second NaN will be the average of (2, 3, 4, 5, and the value filled in for the first NaN).

I have tried

df.fillna(df.mean())

This seems the same as: https://stackoverflow.com/questions/49172914/how-to-fill-nan-values-with-rolling-mean-in-pandas – Glauco, Nov 03 '22 at 15:26
@Glauco maybe, but when I use 'rolling_mean' I get an error: AttributeError: module 'pandas' has no attribute 'rolling_mean' – Vitor Fernandes, Nov 03 '22 at 15:44
I tried: dadosteste = dadosteste.fillna(dadosteste.rolling(6, min_periods=1).mean()) – Vitor Fernandes, Nov 03 '22 at 15:49
It seems rolling_mean was deprecated. Here is the new recipe: https://stackoverflow.com/questions/36274447/pd-rolling-mean-becoming-deprecated-alternatives-for-ndarrays – Glauco, Nov 03 '22 at 16:07

score 0 · Accepted Answer · answered Nov 03 '22 at 16:15

As mentioned in the comments, this has already been answered, but the version updated for recent pandas releases is as follows:


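import numpy as np
import pandas as pd

data = {'col1': [1, 2, 3, 4, 5, np.nan, np.nan, 1, 1, 1, 1, 1, 1],
        'col2': [2, 5, 5, 2, 2, 2, 2, 2, 2, 2, np.nan, 2, 2]}
df = pd.DataFrame(data)

window_size = 5
# The window spans the current row plus the 5 rows before it. mean() skips NaN,
# so at a NaN row the result is the mean of the non-NaN values before it.
df = df.fillna(df.rolling(window_size + 1, min_periods=1).mean())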

outputs:

    col1  col2
0    1.0   2.0
1    2.0   5.0
2    3.0   5.0
3    4.0   2.0
4    5.0   2.0
5    3.0   2.0
6    3.5   2.0
7    1.0   2.0
8    1.0   2.0
9    1.0   2.0
10   1.0   2.0
11   1.0   2.0
12   1.0   2.0
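
Note that the rolling mean above is computed from the original column, so row 6 becomes 3.5, the mean of (2, 3, 4, 5), rather than 3.4, the mean of (2, 3, 4, 5, 3.0) that the question describes. If earlier filled values should feed into later fills, one option is a plain loop over each column. The following is only a sketch of that idea. The helper name `fill_with_trailing_mean` is made up for illustration and is not a pandas function.

```python
import numpy as np
import pandas as pd


def fill_with_trailing_mean(series: pd.Series, window: int = 5) -> pd.Series:
    """Fill each NaN with the mean of the `window` values before it,
    including values that were themselves filled earlier in the pass.
    (Hypothetical helper, not part of pandas.)"""
    values = series.astype(float).to_numpy(copy=True)
    for i in range(len(values)):
        if np.isnan(values[i]):
            previous = values[max(0, i - window):i]
            previous = previous[~np.isnan(previous)]
            if previous.size:  # nothing usable before this row: leave it as NaN
                values[i] = previous.mean()
    return pd.Series(values, index=series.index, name=series.name)


df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5, np.nan, np.nan, 1, 1, 1, 1, 1, 1],
    "col2": [2, 5, 5, 2, 2, 2, 2, 2, 2, 2, np.nan, 2, 2],
})

# Apply the helper column by column; `window` is passed through to the helper.
filled = df.apply(fill_with_trailing_mean, window=5)
```

With this version, row 5 of col1 is filled with 3.0 and row 6 with 3.4. The loop is slower than the vectorized `rolling` call, so it is mainly worth it when the sequential behaviour matters.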

answered Nov 03 '22 at 16:15 by Mat.B

What can I do if the np.nan is in the first row? – Vitor Fernandes Nov 03 '22 at 17:06
Well, I'm not sure... what would the strategy be in that case? Take the mean of the next 5 rows? If the dataset is "large", I would handle the first rows "by hand", I guess. I cannot see any general method that would take care of all the cases. – Mat.B Nov 03 '22 at 17:32
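
One possible way to handle a NaN in the first row, assuming the "mean of the next 5 rows" strategy floated in the comment above is acceptable, is to run a second `fillna` with a forward-looking rolling mean. Reversing the frame, taking the rolling mean, and reversing back gives that forward-looking window. This is a sketch of one fallback and not a general solution. The sample data is invented for the example.

```python
import numpy as np
import pandas as pd

window_size = 5
# Invented sample data with a NaN in the very first row.
df = pd.DataFrame({"col1": [np.nan, 2, 4, 6, 8, 10, 12]})

# Step 1: trailing rolling mean, as in the answer. Row 0 has no earlier
# values, so it is still NaN after this step.
trailing = df.rolling(window_size + 1, min_periods=1).mean()
step1 = df.fillna(trailing)

# Step 2 (assumed fallback): mean of the current row and the next 5 rows,
# obtained by rolling over the reversed frame and restoring the row order.
leading = df[::-1].rolling(window_size + 1, min_periods=1).mean()[::-1]
result = step1.fillna(leading)
```

Here row 0 ends up as 6.0, the mean of (2, 4, 6, 8, 10). Both `fillna` calls align on the index, so the reversal does not mix up rows.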
