2

I have been stumped for the past few hours trying to solve the following.

In a large data set I have from an automated system, there is a DATE_TIME value, which for rows at midnight has values that dont have a the full hour like: 12-MAY-2017 0:16:20

When I try convert this to a date (so that its usable for conversions) as follows:

df['DATE_TIME'].astype('datetime64[ns]')

I get the following error:

Error parsing datetime string "12-MAY-2017  0:16:20" at position 3

I tried writing some REGEX to pull out each piece but couldnt get anything working given the hour could be either 1 or two characters respectively. It also doesn't seem like an ideal solution to write regex for each peice.

Any ideas on this?

NickP
  • 1,354
  • 1
  • 21
  • 51

2 Answers2

3

Try to use pandas.to_datetime() method:

df['DATE_TIME'] = pd.to_datetime(df['DATE_TIME'], errors='coerce')

Parameter errors='coerce' will take care of those strings that can't be converted to datatime dtype

MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419
3

I think you need pandas.to_datetime only:

df  = pd.DataFrame({'DATE_TIME':['12-MAY-2017 0:16:20','12-MAY-2017 0:16:20']})
print (df)
             DATE_TIME
0  12-MAY-2017 0:16:20
1  12-MAY-2017 0:16:20

df['DATE_TIME'] = pd.to_datetime(df['DATE_TIME'])
print (df)
            DATE_TIME
0 2017-05-12 00:16:20
1 2017-05-12 00:16:20

Convert in numpy by astype seems problematic, because need strings in ISO 8601 date or datetime format:

df['DATE_TIME'].astype('datetime64[ns]')

ValueError: Error parsing datetime string "12-MAY-2017 0:16:20" at position 3

EDIT:

If datetimes are broken (some strings or ints) then use MaxU answer.

jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252