I have a DataFrame `df` with one entry per minute. I need to calculate something for every 6-minute window, but with the windows shifted by 3 minutes.

For simplicity, let's say I need to calculate the mean, and the data spans one day. Here is a full, reproducible example DataFrame with random values.

import numpy as np
import pandas as pd

np.random.seed(42)
index = pd.date_range('2019-01-01 00:00:00','2019-01-02 00:00:00', freq='min')
df = pd.DataFrame(np.round(np.random.rand(len(index))*100), index=index, columns=["counts"])
df
                      counts
2019-01-01 00:00:00     37.0
2019-01-01 00:01:00     95.0
2019-01-01 00:02:00     73.0
2019-01-01 00:03:00     60.0
2019-01-01 00:04:00     16.0
2019-01-01 00:05:00     16.0
2019-01-01 00:06:00      6.0
2019-01-01 00:07:00     87.0
2019-01-01 00:08:00     60.0
2019-01-01 00:09:00     71.0
2019-01-01 00:10:00      2.0
2019-01-01 00:11:00     97.0
2019-01-01 00:12:00     83.0
2019-01-01 00:13:00     21.0
2019-01-01 00:14:00     18.0
2019-01-01 00:15:00     18.0
2019-01-01 00:16:00     30.0
2019-01-01 00:17:00     52.0

When I simply resample, I get results for windows starting at minute zero.

df.resample("6min").mean()
                           counts
2019-01-01 00:00:00     49.500000
2019-01-01 00:06:00     53.833333
2019-01-01 00:12:00     37.000000
....

What I need on top of this are the results for windows starting at minute 3, again resampled every 6 minutes, e.g.

df.magicfunction.mean()
                           counts
2019-01-01 00:03:00     40.833333
2019-01-01 00:09:00     48.666667
....

Is there a way to set the starting point of the resampling window?

Alternatively, this is similar to a time-shift window which apparently is not working yet in pandas. Are there alternatives?
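One possible alternative, sketched here under the assumption that the index is strictly regular (exactly one row per minute, no gaps): compute a rolling mean over 6 rows, relabel each window by its start, and keep every third window. The names `rolled` and `every_3` are only illustrative and are not part of the original example.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
np.random.seed(42)
index = pd.date_range("2019-01-01 00:00:00", "2019-01-02 00:00:00", freq="min")
df = pd.DataFrame(np.round(np.random.rand(len(index)) * 100), index=index, columns=["counts"])

# Mean over 6 consecutive rows; by default the result is labelled
# at the LAST row of each window.
rolled = df["counts"].rolling(window=6).mean()

# Shift by 5 rows so each value is labelled at the FIRST row of its window.
# Assumption: one row per minute with no gaps, so 6 rows == 6 minutes.
rolled = rolled.shift(-5)

# Keep only the windows that start every 3 minutes (00:00, 00:03, 00:06, ...).
every_3 = rolled.iloc[::3]

print(every_3.head(4))
```

This yields the windows starting at minute 0 and at minute 3 in one series. The last rows are NaN because no full 6-row window starts there.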

ABot
  • `loffset` only adjusts the resampled time labels, not the resampling time window. It can be used to assign a different label to the time window, e.g. the midpoint. (Or so I thought) – ABot Oct 15 '19 at 14:49
  • Without a `seed` it is hard to reproduce the results. – Quant Christo Oct 15 '19 at 14:55
  • The results are irrelevant. I will not be using the mean() function, I'm just using it for simplicity. I'll adjust my question, though. Just a minute. – ABot Oct 15 '19 at 15:10
  • I think the `base` argument is what you are looking for – Benoit de Menthière Oct 15 '19 at 15:10
  • Duplicate of https://stackoverflow.com/questions/20374736/resample-daily-pandas-timeseries-with-start-at-time-other-than-midnight – Benoit de Menthière Oct 15 '19 at 15:12
  • Yes, that's it! Thank you very much @BenoitdeMenthière. I apologize for the duplicate. I was looking for an answer, but apparently not with the right words. – ABot Oct 15 '19 at 15:18
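
For reference, a minimal sketch of the shift that the `base` comment points to, applied to the example frame from the question. It assumes pandas 1.1 or later, where `resample` accepts an `offset` argument; older versions used `base=3` for the same purpose, and `base` was deprecated in 1.1 and removed in 2.0. The variable name `shifted` is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
np.random.seed(42)
index = pd.date_range("2019-01-01 00:00:00", "2019-01-02 00:00:00", freq="min")
df = pd.DataFrame(np.round(np.random.rand(len(index)) * 100), index=index, columns=["counts"])

# offset="3min" moves the bin edges themselves (not just the labels),
# so the 6-minute windows start at minute 3, 9, 15, ...
# Assumption: pandas >= 1.1; on older versions use df.resample("6min", base=3).
shifted = df.resample("6min", offset="3min").mean()

print(shifted.head(3))
```

With the default origin, the first bin should start before the first timestamp (at 23:57 of the previous day) and cover only the first three rows, so it may need to be dropped before further use.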
