I have strings in YMD hms format that had the timezone stripped. But I know they are in Eastern time with daylight savings time.

I am trying to convert them into epoch timestamps for UTC time.

I wrote the following function:

def ymdhms_timezone_dst_to_epoch(input_str,  tz="US/Eastern"):
    print(input_str)
    dt = datetime.datetime.fromtimestamp(time.mktime(time.strptime(input_str,'%Y-%m-%d %H:%M:%S')))
    local_dt = pytz.timezone(tz).localize(dt)
    print(local_dt.strftime('%Y-%m-%d %H:%M:%S %Z%z'))
    utc_dt = local_dt.astimezone(pytz.utc)
    print(utc_dt.strftime('%Y-%m-%d %H:%M:%S %Z%z'))    
    e = int(utc_dt.strftime("%s"))
    print(e)
    return e

Given string `2015-04-20 21:12:07` this prints:

    2015-04-20 21:12:07
    2015-04-20 21:12:07 EDT-0400 #<- so far so good?
    2015-04-21 01:12:07 UTC+0000 #<- so far so good?
    1429596727

This looks OK up to the epoch timestamp, but http://www.epochconverter.com/epoch/timezones.php?epoch=1429596727 says that this timestamp maps to Greenwich Mean Time Apr 21 2015 06:12:07 UTC.

What is wrong?

Tommy

3 Answers

4

> I have strings in YMD hms format that had the timezone stripped. But I know they are in Eastern time with daylight savings time.

A portable way is to use pytz:

#!/usr/bin/env python
from datetime import datetime
import pytz # $ pip install pytz

naive_dt = datetime.strptime('2015-04-20 21:12:07', '%Y-%m-%d %H:%M:%S')
tz = pytz.timezone('US/Eastern')
eastern_dt = tz.normalize(tz.localize(naive_dt))
print(eastern_dt)
# -> 2015-04-20 21:12:07-04:00

> I am trying to convert them into epoch timestamps for UTC time.

timestamp = (eastern_dt - datetime(1970, 1, 1, tzinfo=pytz.utc)).total_seconds()
# -> 1429578727.0

See Converting datetime.date to UTC timestamp in Python.


There are multiple issues in your code:

  • time.mktime() may return a wrong result for an ambiguous input time (50% chance), e.g., during the "fall back" DST transition in the fall.

  • time.mktime() and datetime.fromtimestamp() may fail for past/future dates if they have no access to a historical timezone database on the system (notably, on Windows).

  • localize(dt) may return a wrong result for an ambiguous or non-existent time, i.e., during DST transitions. If you know that the time corresponds to the summer time, use is_dst=True. tz.normalize() is necessary here to adjust possible non-existent times in the input.

  • utc_dt.strftime("%s") is not portable and does not respect the tzinfo object. It interprets the input as a local time, i.e., it returns a wrong result unless your local timezone is UTC.
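Taken together, the points above suggest a rewrite of the question's function along the following lines. This is a minimal sketch, not a definitive implementation: the function name `local_str_to_epoch` and the `EPOCH` constant are made up for illustration, and the `is_dst=False` default simply mirrors pytz's own default.

```python
from datetime import datetime

import pytz  # third-party: pip install pytz

# Timezone-aware epoch, so the subtraction below never depends on the
# machine's local timezone (unlike strftime("%s")).
EPOCH = datetime(1970, 1, 1, tzinfo=pytz.utc)


def local_str_to_epoch(input_str, tz_name="US/Eastern", is_dst=False):
    """Hypothetical helper: parse a naive local time string and return
    the corresponding POSIX timestamp as an int."""
    # Parse directly into a naive datetime; no time.mktime() round trip.
    naive_dt = datetime.strptime(input_str, "%Y-%m-%d %H:%M:%S")
    tz = pytz.timezone(tz_name)
    # localize() attaches the zone; is_dst only matters for ambiguous or
    # non-existent times. normalize() adjusts non-existent times.
    local_dt = tz.normalize(tz.localize(naive_dt, is_dst=is_dst))
    return int((local_dt - EPOCH).total_seconds())


print(local_str_to_epoch("2015-04-20 21:12:07"))
# -> 1429578727
```

Passing `is_dst=None` instead makes pytz raise an exception for ambiguous or non-existent input, which may be preferable when silently guessing is not acceptable.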


> Can I just always set is_dst=True?

You can, if you don't mind getting imprecise results for ambiguous or non-existent times, e.g., there is a DST transition in the fall in the America/New_York time zone:

>>> from datetime import datetime
>>> import pytz # $ pip install pytz
>>> tz = pytz.timezone('America/New_York')
>>> ambiguous_time = datetime(2015, 11, 1, 1, 30)
>>> time_fmt = '%Y-%m-%d %H:%M:%S%z (%Z)'
>>> tz.localize(ambiguous_time).strftime(time_fmt)
'2015-11-01 01:30:00-0500 (EST)'
>>> tz.localize(ambiguous_time, is_dst=False).strftime(time_fmt) # same
'2015-11-01 01:30:00-0500 (EST)'
>>> tz.localize(ambiguous_time, is_dst=True).strftime(time_fmt) # different
'2015-11-01 01:30:00-0400 (EDT)'
>>> tz.localize(ambiguous_time, is_dst=None).strftime(time_fmt) 
Traceback (most recent call last):
...
pytz.exceptions.AmbiguousTimeError: 2015-11-01 01:30:00

The clocks are turned back at 2 a.m. on the first Sunday in November:

[Image: clocks are turned back]

The is_dst disambiguation flag may have three values:

  • False -- default, assume the winter time
  • True -- assume the summer time
  • None -- raise an exception for ambiguous/non-existent times.

is_dst value is ignored for existing unique local times.

Here's a plot from PEP 0495 -- Local Time Disambiguation that illustrates the DST transition:

[Image: UTC vs. local time in the fold]

The local time repeats itself twice in the fold (summer time -- before the fold, winter time -- after).
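The repetition can be checked with the standard library alone. The sketch below assumes Python 3 and uses fixed-offset `datetime.timezone` objects (the `EDT` and `EST` names are defined here for illustration) rather than pytz, so it only demonstrates the arithmetic of the fold, not how a timezone database resolves it.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for the two sides of the fold.
EDT = timezone(timedelta(hours=-4), "EDT")  # summer time, before the fold
EST = timezone(timedelta(hours=-5), "EST")  # winter time, after the fold

# The same wall-clock reading, 01:30 on 2015-11-01, occurs twice.
first = datetime(2015, 11, 1, 1, 30, tzinfo=EDT)
second = datetime(2015, 11, 1, 1, 30, tzinfo=EST)

print(first.astimezone(timezone.utc))    # -> 2015-11-01 05:30:00+00:00
print(second.astimezone(timezone.utc))   # -> 2015-11-01 06:30:00+00:00
print((second - first).total_seconds())  # -> 3600.0
```

The two readings are one hour apart in UTC, which is the error you risk when the wrong side of the fold is assumed.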

To be able to disambiguate the local time automatically, you need some additional info e.g., if you read a series of local times then it may help if you know that they are sorted: Parsing of Ordered Timestamps in Local Time (to UTC) While Observing Daylight Saving Time.

jfs
  • I am now using this answer. However, can you please elaborate on ". If you know that the time corresponds to the summer time then use is_dst=True"? I want a generic function that converts all times as shown in my question. Can I just always set `is_dst=True`? – Tommy Sep 23 '15 at 13:18
  • @Tommy: I've elaborated on `is_dst` flag. – jfs Sep 23 '15 at 13:45
  • just to make sure I understand: we set our clocks back at 11pm Eastern on date X to 10pm. So all times in [X: 10pm, X:10:59:59] actually happen twice. So a timestamp in this interval is "ambiguous". the `is_dst` flag determines how we resolve this ambiguity within this exact interval. – Tommy Sep 23 '15 at 14:46
  • @Tommy: at 2 a.m. (as shown in the picture) the clocks are turned back by an hour and therefore the time between 1am and 2am occurs twice (the code example use 1:30am -- in the middle of the interval). – jfs Sep 23 '15 at 14:52
2

First of all, '%s' is not supported on all platforms. It works for you only because your platform's C library strftime() function (which Python calls) supports it. This function is most probably what causes the issue: I am guessing it is not timezone-aware, so when it takes the difference from the epoch it uses your local timezone, which is most probably EST(?).

Instead of relying on '%s', which works on only a few platforms (Linux, I believe), you should manually subtract the epoch (1970/1/1 00:00:00) from the datetime you got to get the actual seconds since the epoch. Example -

e = (utc_dt - datetime.datetime(1970,1,1,0,0,0,tzinfo=pytz.utc)).total_seconds()

Demo -

>>> (utc_dt - datetime.datetime(1970,1,1,0,0,0,tzinfo=pytz.utc)).total_seconds()
1429578727.0

This correctly corresponds to the date-time you get.
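The size of the error in the question is consistent with this explanation. The following sketch assumes Python 3 and uses the standard library's `timezone.utc` in place of `pytz.utc`; the variable names are illustrative only.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# The UTC time printed by the question's function.
utc_dt = datetime(2015, 4, 21, 1, 12, 7, tzinfo=timezone.utc)

correct = int((utc_dt - EPOCH).total_seconds())
reported = 1429596727  # the value the question's function printed

print(correct)             # -> 1429578727
print(reported - correct)  # -> 18000
```

18000 seconds is 5 hours, which matches the UTC-5 standard-time offset of US/Eastern. That suggests, though it does not prove, that '%s' treated the UTC wall-clock time as local standard time.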

Anand S Kumar
  • i did `return (utc_dt - datetime.datetime.utcfromtimestamp(0)).total_seconds()`, is this the same? – Tommy Sep 22 '15 at 18:56
  • If that is working for you, for some reason, only this works for me - `(utc_dt - datetime.datetime.utcfromtimestamp(0).replace(tzinfo=pytz.utc)).total_seconds()` . you need the `.replace()` because `utcfromtimestamp()` returns a datetime with tzinfo set to `None` (as given in [documentation](https://docs.python.org/2/library/datetime.html#datetime.datetime.utcfromtimestamp) ) – Anand S Kumar Sep 22 '15 at 18:57
  • @Tommy: [`strftime('%s')` is not the only issue in your code](http://stackoverflow.com/a/32727761/4279) – jfs Sep 22 '15 at 22:20
0

I don't exactly know why but you have to remove the timezone info from your utc_dt before using %s to print it.

e = int(utc_dt.replace(tzinfo=None).strftime("%s"))
print(e)
return e
Josh J
  • Is your local time UTC? this does not produce any difference for me. – Anand S Kumar Sep 22 '15 at 19:05
  • @AnandSKumar My local time is US/Eastern. It produced the same `1429578727` for me in a python shell on OS X. It is probably my system's `%s` implementation. – Josh J Sep 22 '15 at 19:07