I have a DataFrame in which missing values (single or consecutive) can occur within each group. I would like to fill each run of NaNs as follows: calculate the average of the 3 values to the left of the first NaN in the run, calculate the average of the 3 values to the right of the last NaN in the run, and then linearly interpolate the NaNs between these two averages. A run at the very start or end of a group has only one neighbouring average, and should simply be filled with that average (see the expected output below).
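To make the rule concrete, here is a small worked example for a single gap. It is only an illustration of the arithmetic, not the solution I am looking for; the numbers are taken from the first gap of group 1 in the sample data below, and the variable names are made up for this sketch.

```python
import numpy as np

# Values around the first gap of group 1 in the sample data.
left_values = [2.0, 3.0, 4.0]    # three values before the first NaN
right_values = [3.0, 6.0, 4.0]   # three values after the last NaN
n_missing = 2                    # length of the NaN run

left_mean = np.mean(left_values)     # 3.0
right_mean = np.mean(right_values)   # 4.333...

# Take n_missing + 2 evenly spaced points from left_mean to right_mean
# and drop the two endpoints, which are the averages themselves.
filled = np.linspace(left_mean, right_mean, n_missing + 2)[1:-1]
print(filled)  # approximately 3.444 and 3.889
```

These two numbers are the values expected at rows 5 and 6.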
+-------+-------+
| group | value |
+-------+-------+
|     1 |     1 |
|     1 |     1 |
|     1 |     2 |
|     1 |     3 |
|     1 |     4 |
|     1 |   NaN |
|     1 |   NaN |
|     1 |     3 |
|     1 |     6 |
|     1 |     4 |
|     1 |     3 |
|     1 |   NaN |
|     2 |   NaN |
|     2 |   NaN |
|     2 |     1 |
|     2 |     2 |
|     2 |     3 |
|     2 |     4 |
|     2 |   NaN |
|     2 |   NaN |
|     2 |   NaN |
|     2 |     6 |
|     2 |     8 |
|     2 |     9 |
+-------+-------+
Code to reproduce the DataFrame above:
import numpy as np
import pandas as pd

nan = np.nan
d = {'group': {0: 1,
1: 1,
2: 1,
3: 1,
4: 1,
5: 1,
6: 1,
7: 1,
8: 1,
9: 1,
10: 1,
11: 1,
12: 2,
13: 2,
14: 2,
15: 2,
16: 2,
17: 2,
18: 2,
19: 2,
20: 2,
21: 2,
22: 2,
23: 2},
'value': {0: 1.0,
1: 1.0,
2: 2.0,
3: 3.0,
4: 4.0,
5: nan,
6: nan,
7: 3.0,
8: 6.0,
9: 4.0,
10: 3.0,
11: nan,
12: nan,
13: nan,
14: 1.0,
15: 2.0,
16: 3.0,
17: 4.0,
18: nan,
19: nan,
20: nan,
21: 6.0,
22: 8.0,
23: 9.0}}
df = pd.DataFrame(d)
Expected output:
d = {'group': {0: 1,
1: 1,
2: 1,
3: 1,
4: 1,
5: 1,
6: 1,
7: 1,
8: 1,
9: 1,
10: 1,
11: 1,
12: 2,
13: 2,
14: 2,
15: 2,
16: 2,
17: 2,
18: 2,
19: 2,
20: 2,
21: 2,
22: 2,
23: 2},
'value': {0: 1.0,
1: 1.0,
2: 2.0,
3: 3.0,
4: 4.0,
5: 3.44444444,
6: 3.88888889,
7: 3.0,
8: 6.0,
9: 4.0,
10: 3.0,
11: 4.333333,
12: 2.0,
13: 2.0,
14: 1.0,
15: 2.0,
16: 3.0,
17: 4.0,
18: 4.166667,
19: 5.333333,
20: 6.500000,
21: 6.0,
22: 8.0,
23: 9.0}}
Is it possible to do this in pandas without an explicit loop over the rows?
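For reference, here is a rough sketch of one direction that seems to work on the sample data. It is not a definitive answer. The helper name `fill_gaps_with_window_means` and the `filled` column are hypothetical, and it assumes that when fewer than 3 non-NaN values are available inside the window, the average of whatever is there is acceptable (`min_periods=1`). Note that `groupby(...).transform` still iterates over the groups internally, so this only avoids an explicit Python loop over rows.

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame({
    "group": [1] * 12 + [2] * 12,
    "value": [1, 1, 2, 3, 4, nan, nan, 3, 6, 4, 3, nan,
              nan, nan, 1, 2, 3, 4, nan, nan, nan, 6, 8, 9],
})


def fill_gaps_with_window_means(s, window=3):
    """Hypothetical helper: fill NaN runs in one group's Series."""
    isna = s.isna()
    m = isna.astype(int)

    # Mean of up to `window` values before / after each row.
    left = s.rolling(window, min_periods=1).mean().shift(1)
    right = s[::-1].rolling(window, min_periods=1).mean().shift(1)[::-1]

    # Label consecutive runs; mark the first and last NaN of each run.
    run_id = m.diff().ne(0).cumsum()
    start = isna & ~isna.shift(1, fill_value=False)
    end = isna & ~isna.shift(-1, fill_value=False)

    # Broadcast each run's left and right anchor to all rows of the run.
    lo = left.where(start).groupby(run_id).transform("first")
    hi = right.where(end).groupby(run_id).transform("last")

    # Leading or trailing runs have one anchor only: reuse it on both sides.
    lo = lo.fillna(hi)
    hi = hi.fillna(lo)

    # Position inside the run (1..n) and run length n.
    k = m.groupby(run_id).cumsum()
    n = m.groupby(run_id).transform("sum")

    interpolated = lo + (hi - lo) * k / (n + 1)
    return s.where(~isna, interpolated)


df["filled"] = df.groupby("group")["value"].transform(fill_gaps_with_window_means)
print(df)
```

On the sample data the `filled` column agrees with the expected output above up to rounding. I have not checked how it behaves when two NaN runs are separated by fewer than 3 values.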