3

I am attempting to parse the following date strings obtained from email headers:

from dateutil import parser
d1 = parser.parse('Tue, 28 Jun 2011 01:46:52 +0200')
d2 = parser.parse('Mon, 11 Jul 2011 10:01:56 +0200 (CEST)')
d3 = parser.parse('Wed, 13 Jul 2011 02:00:01 +0000 (GMT+00:00)')

The third one fails; am I missing something obvious?

Zero Piraeus
Petter

2 Answers

4

Have you tried parser.parse('...', fuzzy=True)? (I suppose it works :))

phimuemue
  • Yes, it works. The problem is the extra "+00:00" after "GMT", as pointed out below. The "fuzzy" option ignores this. – Petter Jul 19 '11 at 07:55
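As an alternative to fuzzy parsing, the trailing parenthesised comment can be removed before the string is parsed. The sketch below is only an illustration: `strip_trailing_comment` is a hypothetical helper name, the regular expression assumes a single non-nested comment at the end of the value, and the standard library's `email.utils.parsedate_to_datetime` stands in for dateutil so that the example has no third-party dependency.

```python
import re
from email.utils import parsedate_to_datetime


def strip_trailing_comment(header_value):
    """Drop one trailing, non-nested parenthesised comment, e.g. ' (GMT+00:00)'."""
    return re.sub(r"\s*\([^()]*\)\s*$", "", header_value)


raw = 'Wed, 13 Jul 2011 02:00:01 +0000 (GMT+00:00)'

# Remove the comment that trips up the strict parser.
cleaned = strip_trailing_comment(raw)

# Parse the cleaned RFC 2822 style date into a timezone-aware datetime.
d3 = parsedate_to_datetime(cleaned)
print(cleaned)
print(d3.isoformat())
```

The cleaned string should also be acceptable to `parser.parse` from dateutil, since it then has the same shape as the first date in the question.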
2

Give the parsedatetime library a try.

In [16]: import parsedatetime.parsedatetime as pdt

In [17]: p = pdt.Calendar()

In [18]: p.parse("Wed, 13 Jul 2011 02:00:01 +0000 (GMT+00:00)")
Out[18]: ((2011, 7, 20, 0, 0, 0, 2, 201, -1), 3)
Michał Bentkowski
  • But is it correct? I have difficulty interpreting the tuple. Where is the "13", for example? – Petter Jul 19 '11 at 08:08
  • It seems that this parser is confused and thinks the "Wed" refers to tomorrow, July 20, which is the closest Wednesday. – Petter Jul 19 '11 at 08:09
  • Looks like `parsedatetime` always takes future dates. It has a comment in the source code: `# if that day and month have already passed in this year, then increment the year by 1` – warvariuc Jan 10 '12 at 07:37
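To make the tuple from the output above easier to read, it can be wrapped in the standard library's `time.struct_time`, which gives each field a name. This is a small sketch that assumes the nine values follow the usual struct_time layout (year, month, day, hour, minute, second, weekday, day of year, DST flag); the variable names are illustrative only.

```python
import time
from datetime import date

# The first element of the result shown in Out[18] above.
result = time.struct_time((2011, 7, 20, 0, 0, 0, 2, 201, -1))

# The named fields show that the parser returned July 20, not July 13.
print(result.tm_year, result.tm_mon, result.tm_mday)

# tm_wday counts from Monday = 0, so 2 means Wednesday.
day_names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
             'Friday', 'Saturday', 'Sunday']
print(day_names[result.tm_wday])

# The date in the header, 13 July 2011, is also a Wednesday, which fits the
# explanation that only the weekday was used and then moved to a later date.
header_weekday = date(2011, 7, 13).weekday()
print(day_names[header_weekday])
```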