I am trying to extract the date from a longer string. The date consists of any weekday, a day number with an ordinal suffix (st, nd, rd, th), and the month as a three-letter abbreviation (Jan, Feb, etc.).

This is my attempt, but I'm getting None. What am I missing?

import re

string = 'Times for Saturday 10th Aug'

days = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday')
months = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')
pat = re.compile(r'^(%s) (\d+)(st|nd|rd|th) (%s)$' %
                 ('|'.join(days), '|'.join(months)))

print(re.match(pat, string))
jonboy

2 Answers

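Both the caret ^ in the pattern and re.match anchor the match at the start of the string. Your string starts with 'Times for ', not a weekday, so nothing matches and you get None. Remove the caret and use re.search instead, which scans the whole string for a match.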

pat = re.compile(r'(%s) (\d+)(st|nd|rd|th) (%s)$' %
                 ('|'.join(days), '|'.join(months)))

print(re.search(pat, string))
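
To make the difference concrete, here is a minimal, self-contained sketch that runs both calls against the string from the question. The variable name text is my own choice (used in place of string); everything else comes from the question.

```python
import re

text = 'Times for Saturday 10th Aug'

days = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday')
months = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')

# No leading caret: the date may start anywhere, but must still end the string ($).
pat = re.compile(r'(%s) (\d+)(st|nd|rd|th) (%s)$' %
                 ('|'.join(days), '|'.join(months)))

# match() only tries position 0, where the text is 'Times', so it fails.
print(pat.match(text))   # None

# search() scans forward until the pattern fits.
m = pat.search(text)
print(m.group(0))        # Saturday 10th Aug
print(m.groups())        # ('Saturday', '10', 'th', 'Aug')
```

Calling the methods on the compiled pattern (pat.search(text)) is equivalent to re.search(pat, text) here.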
wjandrea

Remove the start-of-string anchor ^ and the end-of-string anchor $, and the pattern matches anywhere in the string:

>>> re.findall(r'(%s) (\d+)(st|nd|rd|th) (%s)' %
...                  ('|'.join(days), '|'.join(months)), string)
[('Saturday', '10', 'th', 'Aug')]

And please don't name a variable string: that is the name of a standard-library module, and the variable would shadow it if you ever import it.
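
As a rough sketch of both points together, the version below drops the anchors, avoids the name string, and uses named groups so each part of the date can be read by name. The sample sentence, the group names, and the added \b word boundaries are my own assumptions for illustration, not part of the question.

```python
import re

days = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday')
months = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')

# \b (word boundary) stands in for ^ and $: the date can sit anywhere,
# but 'Aug' will not match as the first three letters of a longer word.
date_pat = re.compile(
    r'\b(?P<day>%s) (?P<num>\d+)(?P<suffix>st|nd|rd|th) (?P<month>%s)\b' %
    ('|'.join(days), '|'.join(months)))

# Hypothetical input with two dates and trailing text.
text = 'Times for Saturday 10th Aug and Sunday 11th Aug are below'

# finditer() yields one match object per date found.
found = [m.groupdict() for m in date_pat.finditer(text)]
for parts in found:
    print(parts['day'], parts['num'] + parts['suffix'], parts['month'])
```

With $ left in the pattern, neither date in this sample would match, because the string does not end with a month.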

lenik