Mon Jul 09 09:20:28 +0000 2012

If I have a date in that format as a string, how can I convert it to a Unix timestamp?

Note: I'm getting this format from Twitter's API:

https://api.twitter.com/1/statuses/user_timeline.json?include_entities=true&include_rts=true&screen_name=twitter

TIMEX
  • You'll need to parse it: http://docs.python.org/library/datetime.html#strftime-strptime-behavior – Blender Jul 09 '12 at 20:16
  • see this http://devpython.com/2011/08/11/convert-utc-date-string-to-unix-timestamp/ – Ashwini Chaudhary Jul 09 '12 at 20:18
  • @Blender: The OP's format seems to be tricky. For the `+0000` he needs `%z` but that results in `ValueError: 'z' is a bad directive in format '%a %b %d %H:%M:%S %z %Y'` – ThiefMaster Jul 09 '12 at 20:22
  • Why can't API creators just stick with ISO 8601? >. – Wayne Werner Jul 09 '12 at 20:27
  • @ThiefMaster, it looks like it works using Python3.2 (at least Portable Python). Apparently it depends on your underlying implementation of C: http://stackoverflow.com/questions/2609259/converting-string-to-datetime-object-in-python – Wayne Werner Jul 09 '12 at 20:35

3 Answers


The best option is dateutil.parser.parse(), which gives you a datetime object with proper timezone information:

>>> import dateutil.parser
>>> dt = dateutil.parser.parse('Mon Jul 09 09:20:28 +0200 2012')
>>> dt
datetime.datetime(2012, 7, 9, 9, 20, 28, tzinfo=tzoffset(None, 7200))

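Now you just need to convert it to a UNIX timestamp. Use calendar.timegm() on the UTC time tuple; time.mktime() would interpret the tuple as local time and ignore the parsed offset:

>>> import calendar
>>> calendar.timegm(dt.utctimetuple())
1341818428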

The format you have can also be easily parsed using email.utils.parsedate_tz:

>>> import datetime
>>> import email.utils
>>> parts = email.utils.parsedate_tz('Mon Jul 09 09:20:28 +0200 2012')
>>> dt = datetime.datetime(*parts[:6]) - datetime.timedelta(seconds=parts[-1])
>>> str(dt)
'2012-07-09 07:20:28'

This is essentially how email.utils.parsedate_to_datetime is implemented in Python 3.3 (if you want to copy and paste this into your project, replace __parsedate_tz with parsedate_tz from email.utils):

def parsedate_to_datetime(data):
    if not data:
        return None
    *dtuple, tz = __parsedate_tz(data)
    if tz is None:
        return datetime.datetime(*dtuple[:6])
    return datetime.datetime(*dtuple[:6],
            tzinfo=datetime.timezone(datetime.timedelta(seconds=tz)))
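
On Python 3.3 or later you can probably skip the copy and paste and call email.utils.parsedate_to_datetime directly. A minimal sketch, assuming the string always carries a numeric offset such as +0000 (the helper name twitter_date_to_timestamp is made up for this example):

```python
import email.utils


def twitter_date_to_timestamp(created_at):
    """Convert a string like 'Mon Jul 09 09:20:28 +0000 2012' to Unix seconds."""
    # With a numeric offset present, the result is a timezone-aware datetime.
    dt = email.utils.parsedate_to_datetime(created_at)
    # timestamp() honours tzinfo on aware datetimes; a naive datetime
    # (no offset in the input) would be treated as local time instead.
    return int(dt.timestamp())


print(twitter_date_to_timestamp('Mon Jul 09 09:20:28 +0000 2012'))  # 1341825628
```

The result does not depend on the machine's local timezone, because the offset parsed from the string is used for the conversion.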
ThiefMaster

If the timezone is known to be always +0000, you can use:

time.strptime('Mon Jul 09 09:20:28 +0000 2012', '%a %b %d %H:%M:%S +0000 %Y')

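This returns a time.struct_time. If you need Unix seconds since the epoch, pass it to calendar.timegm(), which interprets the struct as UTC:

>>> import calendar
>>> import time
>>> calendar.timegm(time.strptime('Mon Jul 09 09:20:28 +0000 2012', '%a %b %d %H:%M:%S +0000 %Y'))
1341825628

Avoid time.mktime() here: it interprets the struct as local time, so the result is only correct on a machine whose local timezone is UTC.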

Andre Blum

datetime.strptime()

Reference: http://docs.python.org/library/time.html#time.strptime
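
For what it's worth, here is a rough sketch of how this could look on Python 3, where `%z` is supported by datetime.strptime(). It assumes an English/C locale (because `%a` and `%b` are locale-dependent), and the names TWITTER_FORMAT and parse_created_at are invented for the example:

```python
from datetime import datetime

# Format of Twitter's created_at field, e.g. 'Mon Jul 09 09:20:28 +0000 2012'.
TWITTER_FORMAT = '%a %b %d %H:%M:%S %z %Y'


def parse_created_at(value):
    """Parse the string into a timezone-aware datetime (Python 3 only)."""
    return datetime.strptime(value, TWITTER_FORMAT)


dt = parse_created_at('Mon Jul 09 09:20:28 +0000 2012')
print(dt.isoformat())       # 2012-07-09T09:20:28+00:00
print(int(dt.timestamp()))  # 1341825628
```

On Python 2, `%z` may raise the ValueError mentioned in the comments on the question, so this only applies to Python 3.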

Daniel Li
  • So, why not include some example code that actually parses the format he used without an error? – ThiefMaster Jul 09 '12 at 20:20
  • then you need to convert to unix timestamp... see http://stackoverflow.com/questions/2775864/python-datetime-to-unix-timestamp @thiefmaster ... I think that hope's answer is sufficient to get him where he wants to be as well as allowing him to figure it out for other formats – Joran Beasley Jul 09 '12 at 20:21
  • @JoranBeasley: See my comment on the question. His format is apparently not *that* trivial to parse. – ThiefMaster Jul 09 '12 at 20:24
  • It's really easy if he removes the offset (with a regex, split, replace, or whatever) and just applies it at the end (or ignores it entirely, especially if it is always +0000) – Joran Beasley Jul 09 '12 at 20:29
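
The "remove the offset and apply it at the end" idea from the last comment could be sketched roughly like this. It assumes the offset is always the fifth whitespace-separated field and has the form +HHMM or -HHMM; the function name to_unix_timestamp is made up for the example:

```python
import calendar
import time


def to_unix_timestamp(value):
    """Convert e.g. 'Mon Jul 09 09:20:28 +0000 2012' to Unix seconds
    without relying on %z support in strptime."""
    parts = value.split()
    offset = parts.pop(4)  # e.g. '+0000'; the remaining fields have no offset
    struct = time.strptime(' '.join(parts), '%a %b %d %H:%M:%S %Y')

    # Turn +HHMM / -HHMM into a signed number of seconds.
    sign = -1 if offset[0] == '-' else 1
    offset_seconds = sign * (int(offset[1:3]) * 3600 + int(offset[3:5]) * 60)

    # timegm() treats the struct as UTC; subtracting the offset converts
    # the wall-clock time in the string to real UTC.
    return calendar.timegm(struct) - offset_seconds


print(to_unix_timestamp('Mon Jul 09 09:20:28 +0000 2012'))  # 1341825628
```

Because it avoids `%z`, this approach should also work on interpreters where strptime rejects that directive.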