There's a base argument for resample or pd.Grouper that is meant to handle this situation. There are several equivalent ways to specify it; pick whichever you find clearest.
- '1D' frequency with base=0.25
- '24h' frequency with base=6
- '1440min' frequency with base=360
Code
import pandas as pd

df = pd.DataFrame({'timestamp': pd.date_range('2010-01-01', freq='10min', periods=200)})
# Bins run from 6AM to 6AM the next day; the commented-out lines are equivalent.
df.resample(on='timestamp', rule='1D', base=0.25).timestamp.agg(['min', 'max'])
#df.resample(on='timestamp', rule='24h', base=6).timestamp.agg(['min', 'max'])
#df.resample(on='timestamp', rule=f'{60*24}min', base=60*6).timestamp.agg(['min', 'max'])
                                    min                 max
timestamp
2009-12-31 06:00:00 2010-01-01 00:00:00 2010-01-01 05:50:00  #[Dec31 6AM - Jan1 6AM)
2010-01-01 06:00:00 2010-01-01 06:00:00 2010-01-02 05:50:00  #[Jan1 6AM - Jan2 6AM)
2010-01-02 06:00:00 2010-01-02 06:00:00 2010-01-02 09:10:00  #[Jan2 6AM - Jan3 6AM)
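Note that base was deprecated in pandas 1.1 and removed in 2.0 in favor of offset (and origin). If you are on a newer version, something along these lines should give the same 6AM-to-6AM bins. This is only a sketch: it assumes a fixed 6-hour shift is what you want, and it puts the timestamps on the index of a Series (the name `s` is just for illustration) rather than using on=.

```python
import pandas as pd

df = pd.DataFrame({'timestamp': pd.date_range('2010-01-01', freq='10min', periods=200)})

# Illustrative helper Series: timestamps as both index and values,
# so min/max per bin report the first and last timestamp in that bin.
s = pd.Series(df['timestamp'].values, index=df['timestamp'])

# offset shifts the bin edges by 6 hours: each bin is [6AM, 6AM next day)
out = s.resample('24h', offset=pd.Timedelta(hours=6)).agg(['min', 'max'])
print(out)
```

The offset can also be given as a string such as '6h'; a Timedelta is used here only to sidestep alias differences between versions.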
For completeness, resample is a convenience method that groups the same way as groupby with a pd.Grouper. If for some reason you absolutely cannot use resample, you could do:
for dt, gp in df.groupby(pd.Grouper(key='timestamp', freq='24h', base=6)):
    ...
which is equivalent to
for dt, gp in df.resample(on='timestamp', rule='24h', base=6):
    ...
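If you want to convince yourself that the two loops walk over the same groups, a quick check like the one below may help. It is a sketch that assumes a pandas version where offset is available (1.1 or later), so it uses offset instead of base; the names `six_hours`, `via_groupby` and `via_resample` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'timestamp': pd.date_range('2010-01-01', freq='10min', periods=200)})
six_hours = pd.Timedelta(hours=6)  # stands in for base=6 with a '24h' frequency

# Collect (bin label, number of rows) for each group from both approaches
via_groupby = [(dt, len(gp)) for dt, gp in
               df.groupby(pd.Grouper(key='timestamp', freq='24h', offset=six_hours))]
via_resample = [(dt, len(gp)) for dt, gp in
                df.resample(on='timestamp', rule='24h', offset=six_hours)]

print(via_groupby == via_resample)
```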