18

I have a df with the usual timestamps as an index:

    2011-04-01 09:30:00
    2011-04-01 09:30:10
    ...
    2011-04-01 09:36:20
    ...
    2011-04-01 09:37:30

How can I create a column to this dataframe with the same timestamp but rounded to the nearest 5th minute interval? Like this:

    index                 new_col
    2011-04-01 09:30:00   2011-04-01 09:35:00        
    2011-04-01 09:30:10   2011-04-01 09:35:00
    2011-04-01 09:36:20   2011-04-01 09:40:00
    2011-04-01 09:37:30   2011-04-01 09:40:00
Plug4
  • 3,838
  • 9
  • 51
  • 79
  • 1
    Many of these answers are the more consise and you need only change `5mins` in place of `15min` https://stackoverflow.com/questions/32344533/how-do-i-round-datetime-column-to-nearest-quarter-hour – Little Bobby Tables Jun 26 '17 at 15:30

4 Answers4

22

One could easily use the round function of pandas

df["timestamp_column"].dt.round("5min")

Check here for more details

Guido
  • 6,182
  • 1
  • 29
  • 50
20

The round_to_5min(t) solution using timedelta arithmetic is correct but complicated and very slow. Instead make use of the nice Timstamp in pandas:

import numpy as np
import pandas as pd

ns5min=5*60*1000000000   # 5 minutes in nanoseconds 
pd.to_datetime(((df.index.astype(np.int64) // ns5min + 1 ) * ns5min))

Let's compare the speed:

rng = pd.date_range('1/1/2014', '1/2/2014', freq='S')

print len(rng)
# 86401

# ipython %timeit 
%timeit pd.to_datetime(((rng.astype(np.int64) // ns5min + 1 ) * ns5min))
# 1000 loops, best of 3: 1.01 ms per loop

%timeit rng.map(round_to_5min)
# 1 loops, best of 3: 1.03 s per loop

Just about 1000 times faster!

Community
  • 1
  • 1
cronos
  • 2,268
  • 16
  • 17
  • 1
    Be careful when you are dealing with daylight saving time: not all days have the same length, and your approach may fail. I recommend using offsets: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#dateoffset-objects – Marius Ion Oct 07 '15 at 13:57
  • 2
    @Marius Ion: No need to be careful here. The length of the day is irrelevant. If the DatetimeIndex has a timezone, the `astype` operation converts to UTC. Just convert the result back to your timezone then (e.g. with `tz_localize("UTC").tz_convert("US/Eastern")`) – cronos Oct 08 '15 at 13:56
  • Note that to round to the nearest 5min instead of rounding up, it is slightly more complicated - maybe worth adding it to your answer ? I paste my code here but obviously it will look nicer in the answer as comments dont have formatting :( ****** pd.to_datetime( ((rng.astype(np.int64) // ns5min) * ns5min) # round down + (((rng.astype(np.int64) % ns5min) > (ns5min / 2)) * ns5min) # add 5min if needed ) – smarie Dec 14 '20 at 09:47
6

You can try something like this:

def round_to_5min(t):
    delta = datetime.timedelta(minutes=t.minute%5, 
                               seconds=t.second, 
                               microseconds=t.microsecond)
    t -= delta
    if delta > datetime.timedelta(0):
        t += datetime.timedelta(minutes=5)
    return t

df['new_col'] = df.index.map(round_to_5min)
dustyrockpyle
  • 3,184
  • 17
  • 12
  • that looks good but when I test it on my data I see: `2011-04-01 09:32:10 -> 2011-04-01 09:30:00` `2011-04-01 09:32:20 -> 2011-04-01 09:30:00` `2011-04-01 09:32:30 -> 2011-04-01 09:35:00` `2011-04-01 09:32:40 -> 2011-04-01 09:35:00` they should all be 2011-04-01 09:35:00 – Plug4 Jun 29 '14 at 20:17
  • 1
    Ah, in your question you mentioned round to the 'nearest' 5 minutes, but I see from your data you want to round up to the next 5 minutes. I've fixed the answer, thanks. I'm assuming you want 9:30:00:00 to round to 9:30:00:00 rather than 9:35:00:00, otherwise you can just always add the 5 minutes. – dustyrockpyle Jun 29 '14 at 20:27
  • would you how I could select the hours and minutes from my new_col? For instance I am trying to do this: `datetime.datetime.hour(df.new_col)` so that I can have a new colum just with the hour and that's not working – Plug4 Jun 29 '14 at 21:09
  • 1
    Same sort of operation: `df['hour'] = df.new_col.map(lambda t: t.hour)` – dustyrockpyle Jun 29 '14 at 21:12
  • Great thanks! I really need to learn all the possibilities with map – Plug4 Jun 29 '14 at 21:14
2

I had the same problem but with datetime64p[ns] timestamps.

I used:

def round_to_5min(t):
    """ This function rounds a timedelta timestamp to the nearest 5-min mark"""
    t = datetime.datetime(t.year, t.month, t.day, t.hour, t.minute - t.minute%5, 0)  
    return t

followed by the the 'map' function

ShaharA
  • 860
  • 8
  • 19