
The problem:

I'm parsing a log file in Python 2.6 and ran into problems converting the Common Log Format date string into a time object:

13/Sep/2012:06:27:18 +0200

What I tried already

Use dateutil.parser.parse

I already tried dateutil.parser.parse, but it failed to parse the string, with the following error:

ValueError: unknown string format

Use time.strptime

I tried time.strptime with the format string %d/%b/%Y:%H:%M:%S %z but ran into trouble when parsing the timezone:

ValueError: 'z' is a bad directive in format '%d/%b/%Y:%H:%M:%S %z'

Does anyone know where the error is? Or is this just the wrong approach?

Final solution

In the end I decided to use time.strptime and strip off the timezone information:

time.strptime(datestring[:-6], '%d/%b/%Y:%H:%M:%S')

The reason I don't want to use dateutil is that it is much slower than strptime (which actually calls a C function).
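If the offset still matters, the slicing approach above can be extended to keep it instead of discarding it. The following is only a sketch: the helper name parse_clf_date is hypothetical, and it assumes the timestamp always ends in a five-character offset of the form +HHMM or -HHMM preceded by a single space.

```python
import time
from datetime import datetime, timedelta


def parse_clf_date(datestring):
    """Split a timestamp like '13/Sep/2012:06:27:18 +0200' into
    a naive local datetime and its UTC offset as a timedelta.

    Hypothetical helper; assumes a trailing ' +HHMM' or ' -HHMM'.
    """
    # Everything except the last six characters (space + offset).
    stamp = datestring[:-6]
    # The last five characters, e.g. '+0200'.
    offset = datestring[-5:]

    # strptime handles the part it understands; %z is avoided entirely.
    local = datetime(*time.strptime(stamp, '%d/%b/%Y:%H:%M:%S')[:6])

    sign = -1 if offset[0] == '-' else 1
    delta = sign * timedelta(hours=int(offset[1:3]),
                             minutes=int(offset[3:5]))
    return local, delta


local, delta = parse_clf_date('13/Sep/2012:06:27:18 +0200')
# Subtracting the offset converts the local time to UTC.
utc = local - delta
```

Note that %b matches month abbreviations according to the current locale, so this assumes English month names are in effect.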

msiemens

1 Answer


This is what I see:

  • dateutil dislikes having the time appended to the date
  • the %z directive is not supported by your underlying C implementation (see this question)

A quick and easy solution (though not really elegant):

>>> import dateutil.parser
>>> s = '13/Sep/2012:06:27:18 +0200'
>>> dateutil.parser.parse(s.replace(':', ' ', 1))
datetime.datetime(2012, 9, 13, 6, 27, 18, tzinfo=tzoffset(None, 7200))

As a reminder, the optional third parameter to replace is the maximum replacement count.
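To make the effect of the count argument concrete, here is a small sketch comparing a limited and an unlimited replacement on the same string. The variable names first_only and all_colons are just illustrative.

```python
s = '13/Sep/2012:06:27:18 +0200'

# With a count of 1, only the colon between date and time is replaced,
# so the time part keeps its own colons.
first_only = s.replace(':', ' ', 1)

# Without a count, every colon is replaced, which would also break
# up the time part.
all_colons = s.replace(':', ' ')
```

Here first_only is '13/Sep/2012 06:27:18 +0200', while all_colons is '13/Sep/2012 06 27 18 +0200'.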

icecrime