0

I have a pandas dataframe with a column containing strings like follows

01-May-2012  16:44:55.113
01-Jun-2012  18:49:57.466
01-May-2012  14:64:45.119
01-May-2012  14:23:55.113

and I want to convert it to following format.

2012-05-01 16:44:55.113
2012-06-01 18:49:57.466
2012-05-01 14:64:45.119
2012-05-01 14:23:55.113

I tried using pandas.to_datetime(df['date time']) but I get the error unknown string value. I also tried splitting the string into two columns and convert only the date part but faced similar errors

splash58
  • 26,043
  • 3
  • 22
  • 34
Viky N
  • 13
  • 1
  • 3
  • No that didn't solve my problem. While I tried using that I ended with Blank cells – Viky N Jan 01 '18 at 15:56
  • @AntonvBR - I tried exploring all the related links but couldn't find the answer. If its a dupe, could you link it here – Viky N Jan 01 '18 at 15:58

1 Answers1

6

Possible dupe of this: Convert Pandas Column to DateTime. It is old and the accepted answer specifies the format which is not needed in most cases.

However, have a look at this:

import pandas as pd

data = dict(dates=['01-May-2012  16:44:55.113',
                   '01-Jun-2012  18:49:57.466',
                   '01-May-2012  14:64:45.119',  # <--- Error, minute can't be more than 60
                   '01-May-2012  14:23:55.113'])

df = pd.DataFrame(data)
df.dates = pd.to_datetime(df.dates, errors='coerce')

print(df)

You get this (thanks to @3novak for suggesting the errors='coerce'):

                    dates
0 2012-05-01 16:44:55.113
1 2012-06-01 18:49:57.466
2                     NaT
3 2012-05-01 14:23:55.113
Anton vBR
  • 18,287
  • 5
  • 40
  • 46
  • 1
    If you care to specify the format explicitly, you could use `pd.to_datetime(df['dates'], format='%d-%b-%Y %H:%M:%S.%f')` – 3novak Jan 01 '18 at 16:01
  • @3novak Yes you could, but interestingly here there is error in the data. – Anton vBR Jan 01 '18 at 16:02
  • 1
    Yeah, you could use the `errors='coerce'` option in the `pd.to_datetime()` call as well to get around that issue, but it would return an `NaT` value in those positions. – 3novak Jan 01 '18 at 16:03
  • Apologies, the error was my typo. There is no error in the data. Thats 14:54:45.119 – Viky N Jan 01 '18 at 16:03
  • @VikyN Well in that case. This is a dupe. Simply use pd.to_datetime() works for me. Please update the question with data that doesn't work, if that is the case. – Anton vBR Jan 01 '18 at 16:04
  • Yes you are right, adding the error = 'coerce'. solved my problem, there were few data errors in the dataset. Since it was a huge dataset I couldn't spot it initialy. Thanks for the help – Viky N Jan 01 '18 at 16:19