I need to find all the dates in a text document. Each date is written in one of two formats: "24th of April" or "December 18th". I wrote code that finds them, but the output is messy.
I tried combining the two regexes with the "|" operator, but then I get a lot of blank entries in the output.
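import re

# `text` holds the contents of the document being searched.
# Raw strings keep sequences such as \s from being read as string escapes.
d1 = r"(January|February|March|April|May|June|July|August|September|October|November|December)\s+([0-9]{1,2})(st|nd|rd|th)"
d2 = r"([0-9]{1,2})(st|nd|rd|th)\s+(of)\s+(January|February|March|April|May|June|July|August|September|October|November|December)"
e1 = re.compile(d1)
e2 = re.compile(d2)
# Each pattern has several capturing groups, so findall returns one tuple per match.
dat1 = e1.findall(text)
dat2 = e2.findall(text)
print("\nList of dates in collection are : " + str(dat1) + str(dat2))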
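For reference, the blank entries from the "|" attempt can be reproduced with a small sketch. The sample sentence and the names `MONTHS`, `combined` and `rows` are made up for illustration; the two patterns are the ones above joined with "|". When a pattern contains capturing groups, `findall` returns one slot per group for every match, and the groups belonging to the branch that did not match come back as empty strings.

```python
import re

# Hypothetical sample text, not taken from the real document.
sample = "Due January 6th, then on the 4th of March."

MONTHS = ("(January|February|March|April|May|June|July|August|"
          "September|October|November|December)")

# The two original patterns joined with "|": seven capturing groups in total.
combined = re.compile(
    MONTHS + r"\s+([0-9]{1,2})(st|nd|rd|th)"
    + "|"
    + r"([0-9]{1,2})(st|nd|rd|th)\s+(of)\s+" + MONTHS
)

rows = combined.findall(sample)
print(rows)
# [('January', '6', 'th', '', '', '', ''), ('', '', '', '4', 'th', 'of', 'March')]
```

Every tuple has seven slots because the combined pattern has seven groups, whichever branch matched.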
Actual Result:
[('January', '6', 'th'), ('January', '2', 'nd')][('4', 'th', 'of', 'March')]
Expected Result:
[('January 6th'), ('January 2nd'), ('4th of March')]
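A minimal sketch of one way to get output in the expected shape, assuming it is acceptable for the dates to come back as plain strings in the order they appear in the text. The sample sentence and the names `MONTHS`, `ORDINAL`, `date_re` and `dates` are hypothetical. The idea is to use non-capturing groups, `(?:...)`, so the pattern has no capturing groups at all and `findall` returns the whole matched text instead of tuples.

```python
import re

# Hypothetical sample text, not taken from the real document.
sample = "Meetings on January 6th and January 2nd, review on the 4th of March."

# (?:...) groups the alternatives without capturing them.
MONTHS = ("(?:January|February|March|April|May|June|July|August|"
          "September|October|November|December)")
ORDINAL = r"[0-9]{1,2}(?:st|nd|rd|th)"

# "December 18th" style first, then "24th of April" style.
date_re = re.compile(
    MONTHS + r"\s+" + ORDINAL
    + "|"
    + ORDINAL + r"\s+of\s+" + MONTHS
)

dates = date_re.findall(sample)
print(dates)
# ['January 6th', 'January 2nd', '4th of March']
```

Because both formats live in one pattern, a single pass over the text is enough and the results stay in document order rather than being split into two lists.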