0

I have a dataframe that has a date time string but is not in traditional date time format. I would like to separate out the date from the time into two separate columns. And then eventually also separate out the month.

This is what the date/time string looks like: 2019-03-20T16:55:52.981-06:00

>>> df.head()
Date                             Score
2019-03-20T16:55:52.981-06:00    10
2019-03-07T06:16:52.174-07:00    9
2019-06-17T04:32:09.749-06:003   1

I tried this but got a type error:

df['Month'] = pd.DatetimeIndex(df['Date']).month
  • Is your `Date` column a string or a datetime object? – carousallie Aug 30 '19 at 17:55
  • It works for me. Could you post the error please? – Danny Aug 30 '19 at 18:01
  • 1
    Um, that *is* the traditional date time format. It is [the ISO 8601 Extended Date and Time Format](https://en.wikipedia.org/wiki/ISO_8601), also covered by [RFC 3339 - Date and Time on the Internet: Timestamps](https://tools.ietf.org/html/rfc3339). – Matt Johnson-Pint Aug 30 '19 at 18:05
  • That is, except for that extra `3` in the last row. Is that a typo? or does your string actually contain `-06:003`? If so, what does it *mean*? – Matt Johnson-Pint Aug 30 '19 at 18:06
  • See also: https://stackoverflow.com/questions/127803/how-do-i-parse-an-iso-8601-formatted-date – Matt Johnson-Pint Aug 30 '19 at 18:07
  • Apologies...That last row was a typo with the extra 3. I hadn't seen the string in that format before, but it looks like the answer below resolved it. Thank you all! – Lindsay Renkel Aug 30 '19 at 18:43

1 Answers1

2

This can be done just using pandas itself. You can first convert the Date column to datetime by passing utc = True:

df['Date'] = pd.to_datetime(df['Date'], utc = True)

And then just extract the month using dt.month:

df['Month'] = df['Date'].dt.month

Output:

                              Date  Score  Month
0 2019-03-20 22:55:52.981000+00:00     10      3
1 2019-03-07 13:16:52.174000+00:00      9      3
2 2019-06-17 10:32:09.749000+00:00      1      6


From the documentation of pd.to_datetime you can see a parameter:

utc : boolean, default None

Return UTC DatetimeIndex if True (converting any tz-aware datetime.datetime objects as well).

Ankur Sinha
  • 6,473
  • 7
  • 42
  • 73