Given this DataFrame df:

df = pd.DataFrame({'distance': [0,1,2,np.nan,3,4,5,np.nan,np.nan,6]})

   distance
0       0.0
1       1.0
2       2.0
3       NaN
4       3.0
5       4.0
6       5.0
7       NaN
8       NaN
9       6.0

I want to replace each NaN with the mean of the nearest valid values before and after it.

Expected output:

   distance
0       0.0
1       1.0
2       2.0
3       2.5
4       3.0
5       4.0
6       5.0
7       5.5
8       5.5
9       6.0

I have seen this answer, but it deals with grouping, which isn't my case, and I couldn't find anything else.

Kenan

2 Answers

If you don't want to use df.interpolate, you can compute the mean of the surrounding values manually with df.ffill and df.bfill:

(df.ffill() + df.bfill()) / 2

Out:

   distance
0       0.0
1       1.0
2       2.0
3       2.5
4       3.0
5       4.0
6       5.0
7       5.5
8       5.5
9       6.0
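
A self-contained sketch of the approach above, for anyone who wants to run it directly. Forward-fill carries the previous valid value into each gap, back-fill carries the next one, and averaging the two gives the in-between mean. The imports and the small `edge` frame are additions for illustration, not part of the original question. A NaN at the very start or end has a valid neighbour on only one side, so it stays NaN with this method.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'distance': [0, 1, 2, np.nan, 3, 4, 5, np.nan, np.nan, 6]})

# ffill() propagates the last valid value forward,
# bfill() propagates the next valid value backward;
# their average is the mean of the two surrounding values.
filled = (df.ffill() + df.bfill()) / 2
print(filled)

# Hypothetical edge case (not in the question): NaN at the start and end.
# One of the two fills is still NaN there, so the average is NaN too.
edge = pd.DataFrame({'distance': [np.nan, 1, np.nan, 3, np.nan]})
edge_filled = (edge.ffill() + edge.bfill()) / 2
print(edge_filled)
```

If the boundary NaNs should be filled as well, one option is to chain `.ffill().bfill()` onto the result, which copies the nearest valid value outward.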
Michael Szczesny

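How about using linear interpolation? Note that a run of consecutive NaNs is filled with evenly spaced values (rows 7 and 8 below) rather than with a repeated mean.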

print(df.distance.interpolate())

0    0.000000
1    1.000000
2    2.000000
3    2.500000
4    3.000000
5    4.000000
6    5.000000
7    5.333333
8    5.666667
9    6.000000
Name: distance, dtype: float64
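
To see where the two approaches differ, here is a small side-by-side sketch. The variable names (`linear`, `mean_fill`, `comparison`, `differs`) are made up for illustration. It assumes the default `method='linear'` of `interpolate`, which treats the rows as equally spaced.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'distance': [0, 1, 2, np.nan, 3, 4, 5, np.nan, np.nan, 6]})

# Linear interpolation: evenly spaced steps across each gap
linear = df['distance'].interpolate()

# Mean of the surrounding valid values: constant across each gap
mean_fill = (df['distance'].ffill() + df['distance'].bfill()) / 2

comparison = pd.DataFrame({'linear': linear, 'mean_fill': mean_fill})
print(comparison)

# Row labels where the two methods disagree
differs = comparison.index[
    ~np.isclose(comparison['linear'], comparison['mean_fill'])
].tolist()
print(differs)
```

For a single isolated NaN the two methods agree, since the midpoint of a straight line between two values is their mean. They only diverge on gaps of two or more consecutive NaNs, as in rows 7 and 8 here.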
robertwest