Your best shot is probably the parsedatetime module.
Here's your example:
>>> import parsedatetime
>>> cal = parsedatetime.Calendar()
>>> cal.parse('July 1, 2013, midnight')
((2013, 7, 1, 0, 0, 0, 0, 245, 0), 3)
cal.parse() returns a tuple of two items. The first is the parsed time as a nine-item time tuple (the same layout as time.struct_time), the second is an integer flag, as explained in the docstring of the parse method:
- 0 = not parsed at all
- 1 = parsed as a C{date}
- 2 = parsed as a C{time}
- 3 = parsed as a C{datetime}
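If you need a datetime object rather than the raw time tuple, the first six fields can be passed straight to datetime.datetime. This is only a minimal sketch, assuming the (time tuple, flag) return shape shown above; to_datetime is a hypothetical helper name, not part of parsedatetime:

```python
import datetime


def to_datetime(parse_result):
    """Convert a (time_tuple, flag) pair, as returned by cal.parse(), to a datetime.

    Hypothetical helper, not part of parsedatetime. Returns None when the
    flag is 0, i.e. nothing was parsed.
    """
    time_tuple, flag = parse_result
    if flag == 0:
        return None
    # The first six fields are year, month, day, hour, minute, second.
    return datetime.datetime(*time_tuple[:6])


# The literal below is the output of cal.parse() from the example above.
result = to_datetime(((2013, 7, 1, 0, 0, 0, 0, 245, 0), 3))
```

Checking the flag first matters because a flag of 0 means the string was not parsed at all, so the time tuple should not be trusted in that case.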
A few words on strptime:
strptime won't be able to understand "midnight", but you can replace it with an actual hour, using something like this:
import datetime

def fix_dt(raw_date):
    """Replace 'midnight', 'noon', etc. with the matching hour."""
    return raw_date.replace('midnight', '0').replace('noon', '12')

def parse_dt(raw_date):
    """Parse the fuzzy timestamps."""
    return datetime.datetime.strptime(fix_dt(raw_date), '%B %d, %Y, %H')
Then:
>>> parse_dt('July 1, 2013, midnight')
datetime.datetime(2013, 7, 1, 0, 0)
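Note that str.replace is case-sensitive, so parse_dt would fail on 'Midnight' or 'NOON'. If your input varies in case, a regex substitution is one possible workaround. This is only a sketch: HOUR_WORDS and parse_dt_ci are hypothetical names, and it assumes the same '%B %d, %Y, %H' format as above:

```python
import datetime
import re

# Hypothetical mapping of time words to the hour they stand for.
HOUR_WORDS = {'midnight': '0', 'noon': '12'}
# Match the words only as whole words, ignoring case.
_WORD_RE = re.compile(r'\b(midnight|noon)\b', re.IGNORECASE)


def parse_dt_ci(raw_date):
    """Like parse_dt, but also accepts 'Midnight', 'NOON', etc."""
    fixed = _WORD_RE.sub(lambda m: HOUR_WORDS[m.group(1).lower()], raw_date)
    return datetime.datetime.strptime(fixed, '%B %d, %Y, %H')
```

Timestamps that already contain a numeric hour pass through unchanged, since the substitution only touches the listed words.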
You can experiment on strfti.me to see which format directives match your timestamps.
You should check out this other question. The answers suggest using parsedatetime and pyparsing to parse fuzzy timestamps like the one in your example. Also check this pyparsing wiki page.