I want to create a DatetimeIndex at 1-minute intervals based on a start and end timestamp (given in microseconds since epoch) with pd.date_range(). To do this, I need to round the starting timestamp up and the ending timestamp down. Here is what I have so far:

import pandas as pd
start = 1406507532491431
end = 1406535228420914

start_ts = pd.to_datetime(start, unit='us') # Timestamp('2014-07-28 00:32:12.491431')
end_ts = pd.to_datetime(end, unit='us') # Timestamp('2014-07-28 08:13:48.420914')

I want to round:

start_ts to Timestamp('2014-07-28 00:32') and

end_ts to Timestamp('2014-07-28 08:14').

How can I do this?

mchangun

6 Answers

As of version 0.18, Pandas has built-in datetime-like rounding functionality:

start_ts.round('min')  # Timestamp('2014-07-28 00:32:00')
end_ts.round('min')    # Timestamp('2014-07-28 08:14:00')

You can also use .ceil or .floor if you need to force the rounding up or down.
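To make the difference concrete, here is a minimal sketch applying .floor and .ceil to the two timestamps from the question. The variable names start_floor, start_ceil, end_floor and end_ceil are only illustrative.

```python
import pandas as pd

# Timestamps from the question (microseconds since epoch)
start_ts = pd.to_datetime(1406507532491431, unit='us')  # 2014-07-28 00:32:12.491431
end_ts = pd.to_datetime(1406535228420914, unit='us')    # 2014-07-28 08:13:48.420914

# floor always goes down to the start of the minute, ceil always goes up
start_floor = start_ts.floor('min')  # 2014-07-28 00:32:00
start_ceil = start_ts.ceil('min')    # 2014-07-28 00:33:00
end_floor = end_ts.floor('min')      # 2014-07-28 08:13:00
end_ceil = end_ts.ceil('min')        # 2014-07-28 08:14:00
```

For the values the OP asked for, .round('min') on both timestamps (or .floor on the start and .ceil on the end) gives 00:32 and 08:14.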


EDIT: The above code works with raw pd.Timestamp, as asked by the OP. In case you are working with a pd.Series, use the dt accessor:

s = pd.Series(pd.to_datetime([1406507532491431000, 1406535228420914000]))
s.dt.round('min')

Output:

0   2014-07-28 00:32:00
1   2014-07-28 08:14:00
dtype: datetime64[ns]
Gustavo Bezerra

Doing this with a simple built-in method is currently an outstanding issue.

In [22]: start = 1406507532491431

In [23]: end = 1406535228420914

In [26]: dti = pd.to_datetime([start, end], unit='us')

In [27]: dti
Out[27]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2014-07-28 00:32:12.491431, 2014-07-28 08:13:48.420914]
Length: 2, Freq: None, Timezone: None

In [29]: pd.DatetimeIndex(((dti.asi8/(1e9*60)).round()*1e9*60).astype(np.int64))
Out[29]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2014-07-28 00:32:00, 2014-07-28 08:14:00]
Length: 2, Freq: None, Timezone: None

Nevertheless, it's quite straightforward.

Pull requests to implement this are welcome.
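The same idea can be sketched with pure integer arithmetic, which avoids the float division above. This is only an illustration, not the library's implementation: the names ns_per_min, ns, rounded and result are made up here, and ties are rounded half up, whereas the float-based .round() above rounds ties to even.

```python
import numpy as np
import pandas as pd

dti = pd.to_datetime([1406507532491431, 1406535228420914], unit='us')

ns_per_min = 60 * 10**9  # nanoseconds in one minute

# Nanoseconds since epoch as plain int64 values
ns = dti.values.astype('datetime64[ns]').astype(np.int64)

# Add half a minute, then drop the remainder: rounds to the nearest minute
rounded = (ns + ns_per_min // 2) // ns_per_min * ns_per_min

result = pd.DatetimeIndex(rounded.astype('datetime64[ns]'))
# result holds 2014-07-28 00:32:00 and 2014-07-28 08:14:00
```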

Jeff

I had a similar problem, wanting to round off to the day. It turns out there's an easy way (it works for Y[ear], M[onth], D[ay], h[our], m[inute], s[econd]). Assuming df is a pandas DataFrame with a column 'datecol':

df['datecol'] = df['datecol'].values.astype('<M8[m]')

This truncates the values to the m[inute] (the cast drops the finer units rather than rounding to the nearest minute). Since I found this question first, I'm linking back to the answer I got, as it seems relevant:

More efficient way to round to day timestamps using pandas
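A small sketch of what the cast does to a single value, using the end timestamp from the question. It uses a plain NumPy datetime64 rather than a DataFrame column, and the names t, truncated and rounded are only illustrative.

```python
import numpy as np
import pandas as pd

t = np.datetime64('2014-07-28T08:13:48.420914')

# Casting to a coarser unit drops the seconds and below
truncated = t.astype('datetime64[m]')   # 2014-07-28T08:13

# Rounding to the nearest minute gives a different result here
rounded = pd.Timestamp(t).round('min')  # 2014-07-28 08:14:00
```

So the cast behaves like a floor for this value, which matters if the nearest minute is what you want.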

user3735204

As @user3735204 stated, it is possible to round off a column with:

df['datecol'] = df['datecol'].astype('datetime64[m]')

where the unit in the square brackets could be:

Y[ear], M[onth], D[ay], h[our], m[inute], s[econd]

It is also possible to round to the nearest unit by making the column the index and applying the round method (available in pandas 0.19.0):

df.index = pd.to_datetime(df['datecol'])
df.index = df.index.round("S")

Example:

df = pd.DataFrame(data = tmpdata)
df['datecol'] = df['datecol'].astype('datetime64[s]')
print(df['datecol'])

0   2016-10-05 05:37:42
1   2016-10-05 05:37:43
Name: datecol, dtype: datetime64[ns]

df.index = pd.to_datetime(df['datecol'])
df.index = df.index.round("S")

print(df.index)

DatetimeIndex(['2016-10-05 05:37:43', '2016-10-05 05:37:43'], dtype='datetime64[ns]', name=u'timestamp', freq=None)
aitorhh

data.index.round('60S')

This just rounds to the nearest 60 seconds.

DataYoda

import pandas as pd
new_index = pd.date_range(start=start_ts.strftime('%Y-%m-%d %H:%M'), end=end_ts.strftime('%Y-%m-%d %H:%M'), freq='1min')
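Note that formatting with '%Y-%m-%d %H:%M' drops the seconds, so both endpoints are truncated. If the endpoints from the question are wanted (00:32 and 08:14), a variant that rounds first could look like the sketch below. It assumes start_ts and end_ts are built as in the question.

```python
import pandas as pd

start_ts = pd.to_datetime(1406507532491431, unit='us')
end_ts = pd.to_datetime(1406535228420914, unit='us')

# Round both endpoints to the nearest minute, then build the 1-minute index
new_index = pd.date_range(start=start_ts.round('min'),
                          end=end_ts.round('min'),
                          freq='min')
# new_index runs from 2014-07-28 00:32:00 to 2014-07-28 08:14:00
```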
Kirubaharan J