258

I have a CSV dumpfile from a Blackberry IPD backup, created using IPDDump. The date/time strings in here look something like this (where EST is an Australian time-zone):

Tue Jun 22 07:46:22 EST 2010

I need to be able to parse this date in Python. At first, I tried to use the strptime() function from datetime.

>>> datetime.datetime.strptime('Tue Jun 22 12:10:20 2010 EST', '%a %b %d %H:%M:%S %Y %Z')

However, for some reason, the datetime object that comes back doesn't seem to have any tzinfo associated with it.

I read on this page that datetime.strptime apparently discards tzinfo silently. However, I checked the documentation and can't find anything to that effect documented there.

Is there any way to get strptime() to play nicely with timezones?

mkrieger1
victorhooi
  • 1
    Can't you just... convert all dates to GMT? – Robus Jul 22 '10 at 02:48
  • 2
    @Robus: Hmm, I was hoping to do that - but I was assuming that strftime/datetime could somehow do that? Either way, I need to store/parse the fact that the datetimes are in the EST timezone, or whatever timezone they happen to be. The script needs to be able to parse generic datetimes with timezone info (e.g. EST could be any other timezone). – victorhooi Jul 22 '10 at 03:00
  • 4
    EST is also a US timezone abbreviation. (Similarly BST is both a UK and a Brazilian timezone abbrev.) Such abbreviations are just inherently ambiguous. Use offsets relative to UTC/GMT instead. (If you need to support abbreviations, you need to make the mapping locale-dependent and that's a messy rat-hole.) – Donal Fellows Jul 22 '10 at 08:14
  • 1
    [EST timezone abbreviation is ambiguous](http://stackoverflow.com/a/13713813/4279). See also: [Parsing date/time string with timezone abbreviated name in Python?](http://stackoverflow.com/q/1703546/4279) – jfs Aug 21 '14 at 10:42

5 Answers

442

I recommend using python-dateutil. Its parser has been able to parse every date format I've thrown at it so far.

>>> from dateutil import parser
>>> parser.parse("Tue Jun 22 07:46:22 EST 2010")
datetime.datetime(2010, 6, 22, 7, 46, 22, tzinfo=tzlocal())
>>> parser.parse("Fri, 11 Nov 2011 03:18:09 -0400")
datetime.datetime(2011, 11, 11, 3, 18, 9, tzinfo=tzoffset(None, -14400))
>>> parser.parse("Sun")
datetime.datetime(2011, 12, 18, 0, 0)
>>> parser.parse("10-11-08")
datetime.datetime(2008, 10, 11, 0, 0)

and so on. No dealing with strptime() format nonsense... just throw a date at it and it Does The Right Thing.

mkrieger1
Joe Shaw
  • 1
    Given that so many people tend to use python-dateutil, I'd like to point out one limitation of that lib. `>>> parser.parse("Thu, 25 Sep 2003 10:49:41,123 -0300") Traceback (most recent call last): File "", line 1, in File "/Users/wanghq/awscli/lib/python2.7/site-packages/dateutil/parser.py", line 748, in parse return DEFAULTPARSER.parse(timestr, **kwargs) File "/Users/wanghq/awscli/lib/python2.7/site-packages/dateutil/parser.py", line 310, in parse res, skipped_tokens = self._parse(timestr, **kwargs) TypeError: 'NoneType' object is not iterable` – wanghq Apr 29 '14 at 00:16
  • 2
    @wanghq you need to replace the last comma with period. Then `parser.parse("Thu, 25 Sep 2003 10:49:41.123 -0300") returns: datetime.datetime(2003, 9, 25, 10, 49, 41, 123000, tzinfo=tzoffset(None, -10800))` – flyingfoxlee Aug 01 '14 at 08:37
  • 11
    @flyingfoxlee, yes, I understand that. I just want to tell people the limitation of python-dateutil. It does magic things, but sometimes fails to do that. So "just throw a date at it and it Does The Right Thing." is not 100% true. – wanghq Aug 01 '14 at 23:38
  • also, [`dateutil` may fail to represent ambiguous local times](https://github.com/dateutil/dateutil/issues/112). If your application can't tolerate ~1h errors, use `pytz`-based solution when working with timezones in Python. – jfs Dec 04 '15 at 10:11
  • 6
    ```dateutil.parser.parse("10-27-2016 09:06 AM PDT")``` returns: ```datetime.datetime(2016, 10, 27, 9, 6)``` fails to figure out time zone... – HaPsantran Nov 01 '16 at 01:12
  • nice, even parsing a string like `weirddtstring = '04\Nov\2013:16:19:20+0100'` succeeded with `parser.parse(weirddtstring, dayfirst=True, fuzzy=True)` in case someone else comes across such rather uncommon log-entries... – antiplex Jul 03 '18 at 15:35
  • 4
    It depends on one's goal. `dateutil parser` may be simple to use, but `strptime()` is faster. Besides, its formats are quite easy to learn. – rapture Nov 21 '18 at 09:25
  • –1: sure, python-dateutil is a really useful library! But this doesn't actually answer the question at all. And the dateutil parser is an order of magnitude slower. – wim May 15 '19 at 17:04
  • `NameError: name 'tzlocal' is not defined`. – ar2015 Sep 21 '21 at 21:52
97

Since strptime returns a datetime object, which has a tzinfo attribute, we can simply replace it with the desired timezone.

>>> import datetime

>>> date_time_str = '2018-06-29 08:15:27.243860'
>>> date_time_obj = datetime.datetime.strptime(date_time_str, '%Y-%m-%d %H:%M:%S.%f').replace(tzinfo=datetime.timezone.utc)
>>> date_time_obj.tzname()
'UTC'
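
For a string like the one in the question, the same replace() idea can be combined with a lookup table, since strptime() will not resolve the abbreviation itself. A rough sketch, stdlib only: `TZ_OFFSETS` and `parse_with_abbreviation` are made-up names for illustration, the meaning of "EST" (here UTC+10, Australian Eastern Standard Time) is an assumption the caller has to supply, and a fixed offset knows nothing about daylight saving. It also assumes an English/C locale for %a and %b.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table: the caller decides what each abbreviation means.
# Assumption: "EST" is Australian Eastern Standard Time (UTC+10), no DST handling.
TZ_OFFSETS = {
    "EST": timezone(timedelta(hours=10), "EST"),
    "UTC": timezone.utc,
}

def parse_with_abbreviation(text, tz_offsets=TZ_OFFSETS):
    """Parse e.g. 'Tue Jun 22 07:46:22 EST 2010' into an aware datetime."""
    parts = text.split()
    abbreviation = parts.pop(4)  # the token between the time and the year
    naive = datetime.strptime(" ".join(parts), "%a %b %d %H:%M:%S %Y")
    # Raises KeyError for abbreviations that are not in the table.
    return naive.replace(tzinfo=tz_offsets[abbreviation])

dt = parse_with_abbreviation("Tue Jun 22 07:46:22 EST 2010")
print(dt.isoformat())
print(dt.astimezone(timezone.utc).isoformat())
```

Failing loudly on an unknown abbreviation seems safer than guessing, given how ambiguous these abbreviations are.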
Surya Teja
  • 4
    Not all timestamp strings are UTC-based (for example, the one in the question). – Mew Mar 21 '21 at 22:56
  • 2
    This will not work correctly for many timezones. For example, doing this for `Asia/Kolkata` gives an offset of `tzinfo= – Irfanuddin Jul 28 '21 at 12:24
  • 7
    @iudeen what you describe is the result of an incorrectly localized `pytz` timezone object. With pytz, you *must* localize, don't use replace! With Python 3.9 however, you should use [zoneinfo](https://docs.python.org/3/library/zoneinfo.html) instead, which avoids that pitfall altogether. Safe to `replace` there. – FObersteiner Oct 22 '21 at 08:53
92

The datetime module documentation says:

Return a datetime corresponding to date_string, parsed according to format. This is equivalent to datetime(*(time.strptime(date_string, format)[0:6])).

See that [0:6]? That gets you (year, month, day, hour, minute, second). Nothing else. No mention of timezones.
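
To make the slicing concrete, here is a small sketch of the equivalence quoted above (it illustrates the older wording of the docs, not necessarily how current CPython implements strptime internally):

```python
import time
from datetime import datetime

# time.strptime returns a struct_time; its first six fields are
# year, month, day, hour, minute, second.
st = time.strptime("2010-06-22 07:46:22", "%Y-%m-%d %H:%M:%S")

# Building a datetime from only those six fields leaves no room for a timezone.
dt = datetime(*st[0:6])
print(dt, dt.tzinfo)
```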

Interestingly, [Win XP SP2, Python 2.6, 2.7] passing your example to time.strptime doesn't work but if you strip off the " %Z" and the " EST" it does work. Also using "UTC" or "GMT" instead of "EST" works. "PST" and "MEZ" don't work. Puzzling.

It's worth noting this has been updated as of version 3.2 and the same documentation now also states the following:

When the %z directive is provided to the strptime() method, an aware datetime object will be produced. The tzinfo of the result will be set to a timezone instance.

Note that this doesn't work with %Z, so the case is important. See the following example:

In [1]: from datetime import datetime

In [2]: start_time = datetime.strptime('2018-04-18-17-04-30-AEST','%Y-%m-%d-%H-%M-%S-%Z')

In [3]: print("TZ NAME: {tz}".format(tz=start_time.tzname()))
TZ NAME: None

In [4]: start_time = datetime.strptime('2018-04-18-17-04-30-+1000','%Y-%m-%d-%H-%M-%S-%z')

In [5]: print("TZ NAME: {tz}".format(tz=start_time.tzname()))
TZ NAME: UTC+10:00
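
As a follow-up sketch (reusing the +1000 string from the example above), once %z has produced an aware datetime it can be inspected and converted with the usual stdlib calls:

```python
from datetime import datetime, timedelta, timezone

stamp = "2018-04-18-17-04-30-+1000"
aware = datetime.strptime(stamp, "%Y-%m-%d-%H-%M-%S-%z")

# The numeric offset is stored as a fixed-offset timezone instance.
print(aware.utcoffset())

# An aware datetime can be converted to UTC (or any other tzinfo).
as_utc = aware.astimezone(timezone.utc)
print(as_utc.isoformat())
```

Note that the result carries only a fixed offset, not a named zone, so it cannot tell you anything about daylight saving rules.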
mikey
John Machin
  • 19
    Related Python bug: [%Z in strptime doesn't match EST and others](http://bugs.python.org/issue22377) – jfs Sep 29 '14 at 12:11
12

Your time string is similar to the time format in RFC 2822 (the date format used in email and HTTP headers). You could parse it using only the stdlib:

>>> from email.utils import parsedate_tz
>>> parsedate_tz('Tue Jun 22 07:46:22 EST 2010')
(2010, 6, 22, 7, 46, 22, 0, 1, -1, -18000)

See solutions that yield timezone-aware datetime objects for various Python versions: parsing date with timezone from an email.

In this format, EST is semantically equivalent to -0500. In general, though, a timezone abbreviation is not enough to identify a timezone uniquely.
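
One possible way to turn that tuple into an aware datetime with the stdlib alone is sketched below. `parse_rfc2822_like` is a made-up helper name; the last tuple element is the UTC offset in seconds. Keep in mind that email.utils reads EST as -0500 (US Eastern), which may not be what an Australian data source means.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_tz

def parse_rfc2822_like(text):
    """Return a datetime from an RFC 2822-style string (hypothetical helper)."""
    fields = parsedate_tz(text)
    if fields is None:
        raise ValueError("unparseable date: %r" % (text,))
    naive = datetime(*fields[:6])
    if fields[9] is None:
        return naive  # no usable offset in the input, so stay naive
    # fields[9] is the offset from UTC in seconds, e.g. -18000 for -0500.
    return naive.replace(tzinfo=timezone(timedelta(seconds=fields[9])))

aware = parse_rfc2822_like("Tue Jun 22 07:46:22 EST 2010")
print(aware.isoformat())
print(aware.astimezone(timezone.utc).isoformat())
```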

Community
jfs
0

Ran into this exact problem.

What I ended up doing:

# starting with date string
sdt = "20190901"
sdt_format = '%Y%m%d'

# create naive datetime object
from datetime import datetime
dt = datetime.strptime(sdt, sdt_format)

# extract the relevant date time items
dt_formatters = ['%Y','%m','%d']
dt_vals = tuple(map(lambda formatter: int(datetime.strftime(dt,formatter)), dt_formatters))

# set timezone
import pendulum
tz = pendulum.timezone('utc')

dt_tz = datetime(*dt_vals,tzinfo=tz)
Chris