14

Modified from this example:

import io
import pandas as pd
import matplotlib.pyplot as plt

data = io.StringIO('''\
Values
1992-08-27 07:46:48,1
1992-08-27 08:00:48,2
1992-08-27 08:33:48,4
1992-08-27 08:43:48,3
1992-08-27 08:48:48,1
1992-08-27 08:51:48,5
1992-08-27 08:53:48,4
1992-08-27 08:56:48,2
1992-08-27 09:03:48,1
''')
# read_csv's squeeze=True was removed in pandas 2.0; squeeze the single column instead
s = pd.read_csv(data).squeeze('columns')
s.index = pd.to_datetime(s.index)

res = s.resample('4s').interpolate('linear')
print(res)
plt.plot(res, '.-')
plt.plot(s, 'o')
plt.grid(True)

It works as expected:

1992-08-27 07:46:48    1.000000
1992-08-27 07:46:52    1.004762
1992-08-27 07:46:56    1.009524
1992-08-27 07:47:00    1.014286
1992-08-27 07:47:04    1.019048
1992-08-27 07:47:08    1.023810
1992-08-27 07:47:12    1.028571
....

But if I change the resample frequency to '5s', it produces only NaNs:

1992-08-27 07:46:45   NaN
1992-08-27 07:46:50   NaN
1992-08-27 07:46:55   NaN
1992-08-27 07:47:00   NaN
1992-08-27 07:47:05   NaN
1992-08-27 07:47:10   NaN
1992-08-27 07:47:15   NaN
....

Why?

endolith

Just came across this issue [here](https://stackoverflow.com/q/66967998/10197418). It gets even more confusing if `resample` leaves you with *some* data (not all NaN). – FObersteiner Apr 06 '21 at 13:03

2 Answers

30

Option 1
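That's because '4s' aligns perfectly with your existing index: every timestamp ends in :48, which is a multiple of 4 but not of 5. When you resample to '4s', your original values stay on the new grid and there is something to interpolate between. With '5s', none of the original timestamps land on the grid, so the upsampled series is all NaN before interpolation even starts. What you want to do is create an index that is the union of the old index and a new index, interpolate on that, and then reindex to the new index only.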

oidx = s.index
nidx = pd.date_range(oidx.min(), oidx.max(), freq='5s')
res = s.reindex(oidx.union(nidx)).interpolate('index').reindex(nidx)
res.plot(style='.-')
s.plot(style='o')
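
To check the alignment claim and the result of Option 1, here is a small self-contained sketch. It builds the series by hand instead of going through read_csv, and the helper `on_grid` is a name made up for this illustration. It counts how many original timestamps fall on each grid, then compares the union/reindex result against plain `np.interp` over elapsed seconds.

```python
import numpy as np
import pandas as pd

# Same data as in the question, built directly as a Series.
idx = pd.to_datetime([
    "1992-08-27 07:46:48", "1992-08-27 08:00:48", "1992-08-27 08:33:48",
    "1992-08-27 08:43:48", "1992-08-27 08:48:48", "1992-08-27 08:51:48",
    "1992-08-27 08:53:48", "1992-08-27 08:56:48", "1992-08-27 09:03:48",
])
s = pd.Series([1, 2, 4, 3, 1, 5, 4, 2, 1], index=idx, name="Values")


def on_grid(index, freq):
    """Return the timestamps of `index` that lie exactly on a `freq` grid.

    floor() snaps to multiples of `freq`; for 4s and 5s these are the same
    grid points that resample's default bins start on.
    """
    return index[index == index.floor(freq)]


print(len(on_grid(s.index, "4s")), "samples on the 4s grid")
print(len(on_grid(s.index, "5s")), "samples on the 5s grid")

# Option 1: interpolate on the union of old and new index, keep the new one.
oidx = s.index
nidx = pd.date_range(oidx.min(), oidx.max(), freq="5s")
res = s.reindex(oidx.union(nidx)).interpolate(method="index").reindex(nidx)

# Independent check: linear interpolation over elapsed seconds.
x_old = np.asarray((oidx - oidx[0]).total_seconds())
x_new = np.asarray((nidx - oidx[0]).total_seconds())
expected = np.interp(x_new, x_old, s.to_numpy(dtype=float))
print("matches np.interp:", np.allclose(res.to_numpy(), expected))
```

All nine samples sit on the 4s grid and none on the 5s grid, which is why only the '5s' resample has nothing to interpolate from.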


Option 2A
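If you are willing to forgo some accuracy, you can ffill with a limit of 1. Each original value is carried forward to the next grid point (so it is shifted by up to 5 seconds), and interpolate then fills the gaps between those points.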

res = s.resample('5s').ffill(limit=1).interpolate()
res.plot(style='.-')
s.plot(style='o')


Option 2B
Same thing with bfill, which moves each original value back to the preceding grid point instead.

res = s.resample('5s').bfill(limit=1).interpolate()
res.plot(style='.-')
s.plot(style='o')
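
To get a feel for how much accuracy Options 2A and 2B give up, the sketch below compares both against exact linear interpolation in time, evaluated on the same 5s grid. The reference uses `np.interp` over elapsed seconds; that choice, and the variable names, are assumptions made for this illustration, not part of the original approach.

```python
import numpy as np
import pandas as pd

# Same data as in the question, built directly as a Series.
idx = pd.to_datetime([
    "1992-08-27 07:46:48", "1992-08-27 08:00:48", "1992-08-27 08:33:48",
    "1992-08-27 08:43:48", "1992-08-27 08:48:48", "1992-08-27 08:51:48",
    "1992-08-27 08:53:48", "1992-08-27 08:56:48", "1992-08-27 09:03:48",
])
s = pd.Series([1, 2, 4, 3, 1, 5, 4, 2, 1], index=idx, name="Values")

# Options 2A and 2B on resample's own 5s grid.
res_ffill = s.resample("5s").ffill(limit=1).interpolate()
res_bfill = s.resample("5s").bfill(limit=1).interpolate()

# Reference: exact linear interpolation in time on the same grid.
# np.interp holds the end values constant outside the sampled range.
grid = res_ffill.index
t0 = s.index[0]
x_old = np.asarray((s.index - t0).total_seconds())
x_new = np.asarray((grid - t0).total_seconds())
exact = np.interp(x_new, x_old, s.to_numpy(dtype=float))

# nanmax skips grid points that are still NaN after interpolation.
err_ffill = np.nanmax(np.abs(res_ffill.to_numpy() - exact))
err_bfill = np.nanmax(np.abs(res_bfill.to_numpy() - exact))
print(f"max abs error, ffill variant: {err_ffill:.3f}")
print(f"max abs error, bfill variant: {err_bfill:.3f}")
```

On this data the ffill variant has no value for the first grid point, since nothing precedes it. The final sample at 09:03:48 also falls after the last grid point, so the tail is held flat instead of sloping down. Both are edge effects worth checking on your own data.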


Option 3
Intermediate complexity and accuracy

nidx = pd.date_range(oidx.min(), oidx.max(), freq='5s')
res = s.reindex(nidx, method='nearest', limit=1).interpolate()
res.plot(style='.-')
s.plot(style='o')

piRSquared
0

I had to add astype(float) to make it work; otherwise it produced NaN values:

oidx = s.index
nidx = pd.date_range(oidx.min(), oidx.max(), freq='2min')
res = s.reindex(oidx.union(nidx)).astype(float).interpolate('index').reindex(nidx)
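
One plausible reason for needing `astype(float)` is a non-numeric dtype such as object, for example numbers that were read in as strings. This is an assumption, since the answer does not say what the data looked like. An integer Series is upcast to float by `reindex` on its own, so the cast would not matter there. A sketch with made-up data:

```python
import pandas as pd

# Hypothetical data: numeric values stored as strings (object dtype).
idx = pd.to_datetime([
    "2021-01-01 00:00:30", "2021-01-01 00:03:30", "2021-01-01 00:06:30",
])
s_obj = pd.Series(["1", "4", "7"], index=idx, dtype=object)
s_int = pd.Series([1, 4, 7], index=idx)

oidx = idx
nidx = pd.date_range(oidx.min(), oidx.max(), freq="2min")
full = oidx.union(nidx)

# Integers become float64 as soon as reindex introduces NaN ...
print(s_int.reindex(full).dtype)
# ... but object stays object, so it needs an explicit cast before interpolating.
print(s_obj.reindex(full).dtype)

res = (
    s_obj.reindex(full)
    .astype(float)
    .interpolate(method="index")
    .reindex(nidx)
)
print(res)
```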
shmulik90