I have a pandas DataFrame with a column that contains some missing data. I want to try an imputation strategy in which each NaN is replaced with a value drawn from a distribution centered at a weighted average of the previous two and next two non-NaN data points. I am not sure how to get those four values in pandas, ideally as an np.ndarray.

For example, if I had this DataFrame:

0 12
1 33
2 NaN
3 22
4 NaN
5 7

I would like a function that returns [12, 33, 22, 7] when called on row 2, and [33, 22, 7, None] when called on row 4.
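To make the desired behavior concrete, here is a rough sketch of what I have in mind. The function name `neighbor_values` and the parameter `k` are made up for illustration, and it assumes the column is numeric and that rows are addressed by integer position. It also pads with NaN rather than None so the result stays a float array. This may well not be the idiomatic pandas way.

```python
import numpy as np
import pandas as pd


def neighbor_values(s, pos, k=2):
    """Return the k previous and k next non-NaN values around position `pos`.

    Hypothetical helper: missing neighbors are padded with NaN (not None).
    """
    values = s.to_numpy(dtype=float)

    # Split the column around the target position.
    before = values[:pos]
    after = values[pos + 1:]

    # Keep only non-NaN entries, then take the k closest on each side.
    before = before[~np.isnan(before)][-k:]
    after = after[~np.isnan(after)][:k]

    # Pad with NaN when fewer than k neighbors exist on a side.
    left_pad = np.full(k - before.size, np.nan)
    right_pad = np.full(k - after.size, np.nan)
    return np.concatenate([left_pad, before, after, right_pad])


s = pd.Series([12, 33, np.nan, 22, np.nan, 7])
print(neighbor_values(s, 2).tolist())  # [12.0, 33.0, 22.0, 7.0]
print(neighbor_values(s, 4).tolist())  # [33.0, 22.0, 7.0, nan]
```

This slices and filters the column on every call, so I suspect it would be slow when applied to many rows of a large DataFrame.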

This post deals with a similar problem, but not exactly what I am looking for.

taurus

0 Answers