I am trying to fill in the missing values of a time series like the one below. I am using Python 3.

Week Rainfall(cm)
1    1 
2    NaN
3    9
4    10
5    11
6    NaN
7    NaN
8    14

I do not want to fill the missing values with the mean. If I were going in by hand and filling in the NaN values, I would probably guess that the rainfall in week 2 would be 5cm and the rainfall in weeks 6 and 7 would be 12cm and 13cm, respectively.

For week 2, I want the value to be the average of week 1 (1 cm rainfall) and week 3 (9 cm rainfall), i.e. week 2 would have 5 cm of rainfall.

This gets a little more complicated though...

In weeks 6 and 7, I want the NaN values filled with 12 and 13, because if you drew a straight line between week 5 (11 cm rainfall) and week 8 (14 cm rainfall), you would expect 12 cm and 13 cm of rainfall for weeks 6 and 7.

Can anybody think of a way to fill the NaN values in the fashion I described above? I've been googling this question for the past few hours and can't seem to find anything.


  • Possible duplicate of [Python linear interpolation of values in dataframe](https://stackoverflow.com/questions/34362629/python-linear-interpolation-of-values-in-dataframe) – Brad Solomon Dec 14 '17 at 16:04

1 Answer

You seem to be describing linear interpolation, which is the default method of the pandas interpolate() method. If rf is your DataFrame:

rf.interpolate()

   Week  Rainfall(cm)
0     1           1.0
1     2           5.0
2     3           9.0
3     4          10.0
4     5          11.0
5     6          12.0
6     7          13.0
7     8          14.0
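
For completeness, here is a minimal self-contained sketch of the same call. It assumes the data is held in a pandas DataFrame named rf with the column names from the question. The variable names filled and by_week are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example data from the question (NaN marks the missing weeks).
rf = pd.DataFrame({
    "Week": range(1, 9),
    "Rainfall(cm)": [1, np.nan, 9, 10, 11, np.nan, np.nan, 14],
})

# The default method, "linear", treats consecutive rows as equally spaced
# and fills each NaN on the straight line between its known neighbours.
filled = rf.interpolate(method="linear")
print(filled)

# If some weeks were absent from the table entirely, row position would no
# longer match elapsed time. In that case, interpolating against the Week
# values themselves (method="index") is probably closer to what is wanted.
by_week = rf.set_index("Week")["Rainfall(cm)"].interpolate(method="index")
print(by_week)
```

With the data above, both calls should reproduce the filled values shown in the output. The two only differ when the weeks are not evenly spaced.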
Brad Solomon