I have a dataframe df with entries for every minute. I need to calculate something for every 6-minute window, but with a 3-minute shift.
For simplicity, let's say I need to calculate the mean, and the data spans one day. Here is an example dataframe with random values; the full code is included for reproducibility.
import numpy as np
import pandas as pd

np.random.seed(42)
index = pd.date_range('2019-01-01 00:00:00','2019-01-02 00:00:00', freq='min')
df = pd.DataFrame(np.round(np.random.rand(len(index))*100), index=index, columns=["counts"])
df
counts
2019-01-01 00:00:00 37.0
2019-01-01 00:01:00 95.0
2019-01-01 00:02:00 73.0
2019-01-01 00:03:00 60.0
2019-01-01 00:04:00 16.0
2019-01-01 00:05:00 16.0
2019-01-01 00:06:00 6.0
2019-01-01 00:07:00 87.0
2019-01-01 00:08:00 60.0
2019-01-01 00:09:00 71.0
2019-01-01 00:10:00 2.0
2019-01-01 00:11:00 97.0
2019-01-01 00:12:00 83.0
2019-01-01 00:13:00 21.0
2019-01-01 00:14:00 18.0
2019-01-01 00:15:00 18.0
2019-01-01 00:16:00 30.0
2019-01-01 00:17:00 52.0
When I simply resample, the windows start at minute zero.
df.resample("6min").mean()
counts
2019-01-01 00:00:00 49.500000
2019-01-01 00:06:00 53.833333
2019-01-01 00:12:00 37.000000
....
What I need on top of this are the results starting at minute 3, again resampled every 6 minutes, e.g.
df.magicfunction.mean()
counts
2019-01-01 00:03:00 40.833333
2019-01-01 00:09:00 48.666667
....
Is there a way to set the starting point of the resampling window?
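To make the question concrete, here is a minimal sketch of what such a call might look like. It assumes pandas 1.1 or later, where `resample` accepts an `offset` argument (older versions used `base` instead); the name `shifted` is only illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
index = pd.date_range("2019-01-01 00:00:00", "2019-01-02 00:00:00", freq="min")
df = pd.DataFrame(np.round(np.random.rand(len(index)) * 100), index=index, columns=["counts"])

# Assumption: pandas >= 1.1. offset="3min" moves the bin edges to minute 3, 9, 15, ...
shifted = df.resample("6min", offset="3min").mean()

# The first bin is labelled 2018-12-31 23:57 and only holds minutes 00:00-00:02,
# so it is dropped here. The last bin of the day is partial as well.
shifted = shifted.loc["2019-01-01 00:03:00":]
print(shifted.head(2))
```

The first two rows should correspond to the windows starting at 00:03 and 00:09 in the desired output above.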
Alternatively, this is similar to a time-shifted window, which apparently is not supported yet in pandas. Are there alternatives?
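One possible workaround, sketched under the assumption that the data has exactly one row per minute with no gaps: compute a plain 6-row rolling mean, relabel each window by its first minute, and keep every third row. The names `rolled` and `windows` are illustrative, and this returns both the windows starting at minute 0, 6, 12, ... and those starting at minute 3, 9, 15, ... in one series.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
index = pd.date_range("2019-01-01 00:00:00", "2019-01-02 00:00:00", freq="min")
df = pd.DataFrame(np.round(np.random.rand(len(index)) * 100), index=index, columns=["counts"])

# Trailing 6-row rolling mean; each value is labelled with the window's last minute.
rolled = df["counts"].rolling(6).mean()

# shift(-5) relabels each window with its first minute;
# [::3] keeps the windows starting at minute 0, 3, 6, 9, ...;
# dropna() removes the incomplete windows at the end of the day.
windows = rolled.shift(-5).iloc[::3].dropna()
print(windows.head(4))
```

This relies on row counts rather than timestamps, so it would give wrong windows if any minute were missing from the index.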