I'm making a news aggregator using Python and Scrapy and cannot find an answer for exactly what I'm trying to do.
I am scraping a line of text from an article, a publish time, like so:
item['published'] = hxs.select('//div[@class="date"]/text()').extract()
This is what I'm getting back (the site has no ISO date, unlike some of the other sites I'm scraping for this project):
Last Updated: Tuesday, March 11, 2014
I need to put these dates and times into a common format, one that I can also convert other sources' publish times into, so that I can order the items chronologically later via that key in the JSON feed.
So with a date in that format, how can I convert it to a usable form? In the end I'd like all the ISO dates and the written-out dates converted to something like this:
Published: 2:15 p.m., March 15, 2014.
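To make the goal concrete, here is a rough sketch of the kind of conversion I have in mind, using only the standard library's datetime module. The function names (parse_written_date, sort_key, display) are made up for illustration, and it assumes the scraped string always carries the "Last Updated: " prefix, that month and weekday names are in English (strptime's %A and %B depend on the locale), and that a date with no time can default to midnight. Since .extract() returns a list of strings, I would pass in its first element.

```python
from datetime import datetime

# Assumption: the scraped line always starts with this label.
PREFIX = "Last Updated: "


def parse_written_date(raw):
    """Parse e.g. 'Last Updated: Tuesday, March 11, 2014' into a datetime.

    The source has no time of day, so the result defaults to midnight.
    """
    text = raw.strip()
    if text.startswith(PREFIX):
        text = text[len(PREFIX):]
    # %A = full weekday name, %B = full month name (locale dependent)
    return datetime.strptime(text, "%A, %B %d, %Y")


def sort_key(dt):
    """ISO 8601 string; these sort chronologically as plain text."""
    return dt.isoformat()


def display(dt):
    """Format as 'Published: 2:15 p.m., March 15, 2014.'"""
    hour = dt.hour % 12 or 12  # 12-hour clock without a leading zero
    suffix = "a.m." if dt.hour < 12 else "p.m."
    return "Published: {}:{:02d} {}, {} {}, {}.".format(
        hour, dt.minute, suffix, dt.strftime("%B"), dt.day, dt.year
    )


if __name__ == "__main__":
    dt = parse_written_date("Last Updated: Tuesday, March 11, 2014")
    print(sort_key(dt))
    print(display(dt))
```

The idea would be to store the ISO string as the sort key in the JSON feed and only build the human-readable "Published: ..." text for display, but I'm not sure whether this is the right approach.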