16

I am struggling to round off timestamps using pandas.

The timestamps look like this:

datetime.datetime(2017,06,25,00,31,53,993000)
datetime.datetime(2017,06,25,00,32,31,224000)
datetime.datetime(2017,06,25,00,33,11,223000)
datetime.datetime(2017,06,25,00,33,53,876000)
datetime.datetime(2017,06,25,00,34,31,219000)
datetime.datetime(2017,06,25,00,35,12,634000)

How do I round off to the nearest second?

Previously iv tried suggestions in this post but they didn't work: Rounding time off to the nearest second - Python

My code looks like this so far:

import pandas as pd
filename = 'data.csv'
readcsv = pd.read_csv(filename)

Import data according to file header info

log_date = readcsv.date
log_time = readcsv.time
log_lon = readcsv.lon
log_lat = readcsv.lat
log_heading = readcsv.heading

readcsv['date'] = pd.to_datetime(readcsv['date']).dt.date
readcsv['time'] = pd.to_datetime(readcsv['time']).dt.time

Combine date and time into one variable

timestamp = [datetime.datetime.combine(log_date[i],log_time[i]) for i in range(len(log_date))]

Create dataframe

data = {'timestamp':timestamp,'log_lon':log_lon,'log_lat':log_lat,'log_heading':log_heading}
log_data = pd.DataFrame(data,columns=['timestamp','log_lon','log_lat','log_heading'])
log_data.index = log_data['timestamp']

I am still quite new to python so please excuse my ignorance

Jetman
  • 765
  • 4
  • 14
  • 30

2 Answers2

12

You can use first read_csv with parameter parse_dates for create datetimes from column date and time and then dt.round for round datetimes:

import pandas as pd

temp=u"""date,time,lon,lat,heading
2017-06-25,00:31:53.993000,48.1254,17.1458,a
2017-06-25,00:32:31.224000,48.1254,17.1458,a
2017-06-25,00:33:11.223000,48.1254,17.1458,a
2017-06-25,00:33:53.876000,48.1254,17.1458,a
2017-06-25,00:34:31.219000,48.1254,17.1458,a
2017-06-25,00:35:12.634000,48.1254,17.1458,a"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename.csv'
df = pd.read_csv(pd.compat.StringIO(temp), parse_dates={'timestamp':['date','time']})

print (df)
                timestamp      lon      lat heading
0 2017-06-25 00:31:53.993  48.1254  17.1458       a
1 2017-06-25 00:32:31.224  48.1254  17.1458       a
2 2017-06-25 00:33:11.223  48.1254  17.1458       a
3 2017-06-25 00:33:53.876  48.1254  17.1458       a
4 2017-06-25 00:34:31.219  48.1254  17.1458       a
5 2017-06-25 00:35:12.634  48.1254  17.1458       a

print (df.dtypes)
timestamp    datetime64[ns]
lon                 float64
lat                 float64
heading              object
dtype: object

df['timestamp'] = df['timestamp'].dt.round('1s')

print (df)
            timestamp      lon      lat heading
0 2017-06-25 00:31:54  48.1254  17.1458       a
1 2017-06-25 00:32:31  48.1254  17.1458       a
2 2017-06-25 00:33:11  48.1254  17.1458       a
3 2017-06-25 00:33:54  48.1254  17.1458       a
4 2017-06-25 00:34:31  48.1254  17.1458       a
5 2017-06-25 00:35:13  48.1254  17.1458       a

EDIT:

If you want set column with datetimes to index also:

import pandas as pd

temp=u"""date,time,lon,lat,heading
2017-06-25,00:31:53.993000,48.1254,17.1458,a
2017-06-25,00:32:31.224000,48.1254,17.1458,a
2017-06-25,00:33:11.223000,48.1254,17.1458,a
2017-06-25,00:33:53.876000,48.1254,17.1458,a
2017-06-25,00:34:31.219000,48.1254,17.1458,a
2017-06-25,00:35:12.634000,48.1254,17.1458,a"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename.csv'
df = pd.read_csv(pd.compat.StringIO(temp), parse_dates={'timestamp':['date','time']}, index_col=['timestamp'])
print (df)
                             lon      lat heading
timestamp                                        
2017-06-25 00:31:53.993  48.1254  17.1458       a
2017-06-25 00:32:31.224  48.1254  17.1458       a
2017-06-25 00:33:11.223  48.1254  17.1458       a
2017-06-25 00:33:53.876  48.1254  17.1458       a
2017-06-25 00:34:31.219  48.1254  17.1458       a
2017-06-25 00:35:12.634  48.1254  17.1458       a

print (df.index)
DatetimeIndex(['2017-06-25 00:31:53.993000', '2017-06-25 00:32:31.224000',
               '2017-06-25 00:33:11.223000', '2017-06-25 00:33:53.876000',
               '2017-06-25 00:34:31.219000', '2017-06-25 00:35:12.634000'],
              dtype='datetime64[ns]', name='timestamp', freq=None)


df.index = df.index.round('1s')
print (df)
                         lon      lat heading
timestamp                                    
2017-06-25 00:31:54  48.1254  17.1458       a
2017-06-25 00:32:31  48.1254  17.1458       a
2017-06-25 00:33:11  48.1254  17.1458       a
2017-06-25 00:33:54  48.1254  17.1458       a
2017-06-25 00:34:31  48.1254  17.1458       a
2017-06-25 00:35:13  48.1254  17.1458       a
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • i'm getting an "AttributeError: 'DatetimeProperties' object has no attribute 'round'" – Jetman Dec 21 '17 at 07:17
  • 2 questions - what is `print (df.dtypes)` ? and what is your pandas version `print (pd.show_versions())` ? – jezrael Dec 21 '17 at 07:18
  • `df.types` is exactly the same as what you had and Iv got pandas: 0.17.1 – Jetman Dec 21 '17 at 07:21
  • Hmmm, it is a bit old, is possible upgrade? – jezrael Dec 21 '17 at 07:22
  • Because this function is implemented in `pandas 0.18` version (check [this](http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#datetimelike-rounding)) and last version of pandas is `0.21.1` – jezrael Dec 21 '17 at 07:25
  • I see, thats probably been the issue all along, i'll try upgrade now. Thanks very much! – Jetman Dec 21 '17 at 07:25
  • Well that's embarrassing but its finally Sorted thanks very much!! – Jetman Dec 21 '17 at 19:03
4

dt.round is what you are looking for. I'll just create a smaller version of your DataFrame, please comment if you can't modify it to fully fit your case, I can help with that also.

import datetime
import pandas as pd

ts1 = datetime.datetime(2017,06,25,00,31,53,993000)
ts2 = datetime.datetime(2017,06,25,00,32,31,224000)
ts3 = datetime.datetime(2017,06,25,00,33,11,223000)
df = pd.DataFrame({'timestamp':[ts1, ts2, ts3]})

df.timestamp.dt.round('1s')

Gives you the following:

Out[89]: 
0   2017-06-25 00:31:54
1   2017-06-25 00:32:31
2   2017-06-25 00:33:11
Name: timestamp, dtype: datetime64[ns]
FatihAkici
  • 4,679
  • 2
  • 31
  • 48