0

I am working with an Excel file in Pandas where I am trying to deal with a Date column where the Date is listed in ISO 8601 format. I want to take this column and store the date and time in two different columns.The values in these two columns need to be stored in Eastern Daylight Savings. This is what they are supposed to look like

Date                    Date (New)  Time (New)
1999-01-01T00:00:29.75  12/31/1998  6:59:58 PM
1999-01-01T00:00:30.00  12/31/1998  6:59:59 PM
1999-01-01T00:00:32.25  12/31/1998  7:00:00 PM
1999-01-01T00:00:30.50  12/31/1998  6:59:58 PM

I have achieved this, partially. I have converted the values to Eastern Daylight savings time and successfully stored the Date value correctly. However, I want the time value to be stored in the 12 hours format and not in the 24 hours format as it is being right now?

This is what my output looks like so far.

Date                  Date (New)    Time (New)
1999-01-01T00:00:29.75  1998-12-31  19:00:30
1999-01-01T00:00:30.00  1998-12-31  19:00:30
1999-01-01T00:00:32.25  1998-12-31  19:00:32
1999-01-01T00:00:30.50  1998-12-31  19:00:31

Does anyone have any idea what i can do for this?

from pytz import timezone

import dateutil.parser

from pytz import UTC

import datetime as dt

df3['Day']=pd.to_datetime(df['Date'], format='%Y-%m-%d %H:%M:    %S.%f',errors='coerce').dt.tz_localize('UTC')

df3['Day']= df3['Day'].dt.tz_convert('US/Eastern')
df3['Date(New)'], df3['Time(New)'] = zip(*[(d.date(), d.time()) for d in  df3['Day']])
ALollz
  • 57,915
  • 7
  • 66
  • 89
sanster9292
  • 1,146
  • 2
  • 9
  • 25
  • why do you think that your time is "stored in the 24 hours format" ? it is stored as datetime.time ... this is just a display issue – Patrick Artner Jan 03 '19 at 22:37

3 Answers3

3

You should use d.time().strftime("%I:%M:%S %p") which will format the date as requested.

strftime() and strptime() Behavior

Nick Perkins
  • 1,327
  • 2
  • 12
  • 25
  • @StevenJohnson You were not quite right but on the right track. See my comment on your answer. – Nick Perkins Jan 03 '19 at 22:45
  • this would set the output format of all datetime.time() colums? – Patrick Artner Jan 03 '19 at 22:48
  • @PatrickArtner in this case there is only one datetime.time() column and would only work on that column. – Nick Perkins Jan 03 '19 at 22:53
  • the dataframe is df3. d is just a variable i am using for list comprehension – sanster9292 Jan 03 '19 at 23:08
  • Nice answer, nice comments. – Ami Tavory Jan 03 '19 at 23:30
  • @sanster9292 now I got it. You probably need one to two more df-columns with special formatted strings - changing your list comp to `df['Date(New)'], df['Time(New)'] = zip(*[(d.date(), d.time().strftime("%I:%M:%S %p")) for d in df['Day']])` will kill any chances to do something "time()" based with that column. You can create a special stringified columun by `df['Time(asstring)'] = df["Day"].dt.strftime("%I:%M:%S %p")` – Patrick Artner Jan 03 '19 at 23:45
  • Patrick!! That works!! However, there is just one thing. The values seem to be 32 seconds ahead of what is the expected value. This is the first value I got: 07:00:29 PM when it should be 6:59:58 PM. Any Idea why this is? – sanster9292 Jan 04 '19 at 01:58
1

You can set the time format used for outputting - the time value itself is (and should be) stored as datetime.time() - if you want a specific string representation you can create a string-type column in the format you want:

from pytz import timezone 
import pandas as pd
import datetime as dt

df= pd.DataFrame([{"Date":dt.datetime.now()}]) 

df['Day']=pd.to_datetime( df['Date'], format='%Y-%m-%d %H:%M:    %S.%f',
                          errors='coerce').dt.tz_localize('UTC')

df['Day']= df['Day'].dt.tz_convert('US/Eastern')
df['Date(New)'], df['Time(New)'] = zip(*[(d.date(), d.time()) for d in  df['Day']]) 

# create strings with specific formatting
df['Date(asstring)'] = df['Day'].dt.strftime("%Y-%m-%d")
df['Time(asstring)'] = df["Day"].dt.strftime("%I:%M:%S %p") 

# show resulting column / cell types
print(df.dtypes)
print(df.applymap(type))
# show df
print(df)

Output:

# df.dtypes
Date                          datetime64[ns]
Day               datetime64[ns, US/Eastern]
Date(New)                             object
Time(New)                             object
Date(asstring)                        object
Time(asstring)                        object

# from df.applymap(type)
Date            <class 'pandas._libs.tslib.Timestamp'>
Day             <class 'pandas._libs.tslib.Timestamp'>  
Date(New)       <class 'datetime.date'>
Time(New)       <class 'datetime.time'>
Date(asstring)  <class 'str'>
Time(asstring)  <class 'str'>

# from print(df)
                        Date                              Day   Date(New)        Time(New) 
0 2019-01-04 00:40:02.802606 2019-01-03 19:40:02.802606-05:00  2019-01-03  19:40:02.802606 

Date(asstring) Time(asstring)
    2019-01-03    07:40:02 PM
Patrick Artner
  • 50,409
  • 9
  • 43
  • 69
  • Patrick, I can't thank you enough. I am a a bit of a novice when it comes to handling date times. I am hoping to get better at it!! Now, just a bit more of a help, if you could!!! – sanster9292 Jan 04 '19 at 02:48
  • The Time(asstring) column is about 32 seconds ahead from what I would expect. I am getting 07:00:29 PM when I should be getting 6:59:58 PM. Do you have any idea why that might be? – sanster9292 Jan 04 '19 at 02:49
  • @sanster9292 I used datetime.datetime.now as input - and over _all_ times that I create the time is the same : `00:40:02.802606 19:40:02.802606-05:00 19:40:02.802606 07:40:02 PM` -no 30s difference between those. Does my code give you differences or does your data give you differences? 5h ago was 4 am for me .. there is weired stuff going on if you go to some special dates or curious timezones in the past ... link: [weird-timezone-issue-with-pytz](https://stackoverflow.com/questions/11473721/weird-timezone-issue-with-pytz) – Patrick Artner Jan 04 '19 at 08:12
  • Yes I think it is just the data itself. I double and triple checked the logic. I think it also might be that i am in a weird time zone. – sanster9292 Jan 04 '19 at 13:59
0

It looks like you are very close. %H is the 24 hour format. You should use %I instead.

How can I account for period (AM/PM) with datetime.strptime?

S. J.
  • 76
  • 8
  • 1
    That format is correct because it determines how to read the string date in from the dataframe. It's actually when creating the new dataframe that the formatting needs to change. – Nick Perkins Jan 03 '19 at 22:44