The next could be used for dates with mistakes in month string with python:
"".join((re.compile('(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(\.)?(\w*)?(\.)?(\s*\d{0,2}\s*),(\s*\d{4})', re.S + re.I).findall('Some wrong date is Septeme 28, 2002date') + ['n/a'])[0])
Output is:
'Septeme 28 2002'
1 group is a month star:
(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)
2-4 groups are optional suffixes of a month which could include a dot or alphanumeric characters:
(\.)?(\w*)?(\.)?
It matches .
, t.
tem
in Sep., Sept., Septem
5 group is date number which could be or could not be, so 0 in the expression stands for dates without date number:
(\s*\d{0,2}\s*)
6 group is a year:
(\s*\d{4})
\s*
stands for possible 'empty' characters (spaces, tabs and so on) from 0 to many
[0]
takes the first matching if a few dates tuples in the list
+ ['n/a']
could be added as an additional list element in case if no date matched, so at least 1 element in the list would exist and no 'list index out of range' error appear when [0] element is being taken