I can't seem to apply to_datetime to a pandas DataFrame column, although I've done it dozens of times in the past. After running the code below, an arbitrary value in the "Date Time" column is still a string rather than a timestamp. errors="coerce" should convert any parsing errors to NaT, but instead I still have '2015-10-10 12:31:04' as a string.

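import pandas as pd

df = pd.read_csv(...)
# Convert the column; values that cannot be parsed should become NaT
df["Date Time"] = pd.to_datetime(df["Date Time"], errors="coerce")
# Print the type of one arbitrary element (row label 9)
print(str(type(df["Date Time"][9])) + " 1")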

Why would pandas neither raise an error nor convert parsing errors to NaT?

Here are a few rows of the CSV. The real file has a million rows coming from different sources, so the date formatting may not be uniform. In that case, however, I would expect to_datetime to return NaT or raise an error, depending on the errors argument.

Accuracy,Activity,Altitude,Bearing,Date Time,Date(GMT),Description,Distance,Latitude,Longitude,Name,Speed,_FileNames,datenum
,,null,,,,,,sj,,,,C:/Users/Alexis/Dropbox/Location/Path Tracking Lite/aacy.csv,17054710926
0.0,,0.0,0.0,,,,0.00292115,50.67713796,4.61960233,,4.5,C:/Users/Alexis/Dropbox/Location/Path Tracking Lite/aars.csv,17054710926
0.0,,0.0,0.0,2015-01-31 15:10:,,,0.00404488,39.91572515,116.43714731,,5.4,C:/Users/Alexis/Dropbox/Location/Path Tracking Lite/abch.csv,17054710926
0.0,Walk/Run,0.0,0.0,2015-01-11 10:36:22,,,0,39.94002308,116.43548671,tfdeddd,0.0,C:/Users/Alexis/Dropbox/Location/Path Tracking Lite/abbj.csv,20150111
0.0,Walk/Run,0.0,0.0,2015-01-11 10:36:24,,,0.00968132,39.93998097,116.43558673,,2.7,C:/Users/Alexis/Dropbox/Location/Path Tracking Lite/abbj.csv,20150111
0.0,Walk/Run,0.0,0.0,2015-01-11 10:36:26,,,0.00768588,39.94003147,116.43552386,,4.5,C:/Users/Alexis/Dropbox/Location/Path Tracking Lite/abbj.csv,20150111
0.0,Walk/Run,0.0,0.0,2015-01-11 10:36:28,,,0.00239565,39.94007265,116.43551403,,3.6,C:/Users/Alexis/Dropbox/Location/Path Tracking Lite/abbj.csv,20150111
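
A minimal sketch of how the conversion could be checked on these sample rows. It looks at the dtype of the whole column instead of the type of a single element. Two things here are assumptions for illustration and not part of the original code: the data is cut down to the "Date Time" and "Latitude" columns, and an explicit format="%Y-%m-%d %H:%M:%S" is passed on the guess that this is the intended layout. The names csv_text and parsed are hypothetical.

```python
import io

import pandas as pd

# Two columns taken from the sample rows above (reduced for brevity).
csv_text = (
    "Date Time,Latitude\n"
    ",sj\n"
    ",50.67713796\n"
    "2015-01-31 15:10:,39.91572515\n"
    "2015-01-11 10:36:22,39.94002308\n"
    "2015-01-11 10:36:24,39.93998097\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Assumed format; values that do not match it are coerced to NaT.
parsed = pd.to_datetime(
    df["Date Time"], format="%Y-%m-%d %H:%M:%S", errors="coerce"
)

# A datetime64 dtype means the column as a whole was converted;
# an object dtype would mean the values were left as they were.
print(parsed.dtype)

# Number of rows that ended up as NaT (empty or not matching the format).
print(parsed.isna().sum())
```

If the printed dtype is still object, the conversion did not take effect for the column, which would match the symptom described above.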
Alexis Eggermont
  • Can you give a few example rows of the CSV? – Andy Hayden Dec 20 '15 at 01:25
  • `pd.to_datetime(df['Date Time'])` works fine for me, producing `NaT` for the first two rows and `dtype: datetime64[ns]`. – Stefan Dec 20 '15 at 01:37
  • Yep, perhaps this is fixed in pandas 0.17.1. I'm assuming you can see this with this 5-line example? Which pandas version are you using? – Andy Hayden Dec 20 '15 at 01:58
  • @AndyHayden 0.16.2. I have fixed the problem by removing some dates that weren't formatted properly, but it still looks like a bug to me since those lines should have given 'NaT'. I'll check if it's solved in 0.17.1. – Alexis Eggermont Dec 20 '15 at 02:11
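
The workaround in the last comment, removing dates that are not formatted properly, relies on finding those rows first. One possible way to do that, as a sketch only: compare the raw strings with the coerced result and keep the values that were present but came back as NaT. The explicit format and the names raw, parsed, failed and bad_values are assumptions made for this illustration, and the values are taken from the sample rows in the question.

```python
import pandas as pd

# A few "Date Time" values from the sample rows, including one missing
# value and one truncated timestamp.
raw = pd.Series(
    [None, "2015-01-31 15:10:", "2015-01-11 10:36:22", "2015-01-11 10:36:24"],
    name="Date Time",
)

# Assumed format; anything that does not match it becomes NaT.
parsed = pd.to_datetime(raw, format="%Y-%m-%d %H:%M:%S", errors="coerce")

# Rows that had a value in the raw data but failed to parse.
failed = raw.notna() & parsed.isna()
bad_values = raw[failed]

print(bad_values.tolist())

# Keep only the rows that parsed to a real timestamp.
clean = parsed[parsed.notna()]
print(len(clean))
```

Listing bad_values before dropping anything shows which formats actually occur in the file, which helps decide whether to drop those rows or repair them.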

0 Answers