
I have to search a string containing dates, so I need to find substrings that match several formats (date_text1, date_text2 and date_text) and normalize them into a conventional format such as 25/04/1955.

Why does the index not exist?

import re 

#date_text1 = '042555'
#date_text2 = '04/25/1955'
date_text = 'April 25, 1955'

date_patterns = (r'(\d{2}) (\d{2}) (\d{2})', r'(\d{2}) / (\d{2}) / (\d{4})', r'([\w\D]+) (\d{2}) , (\d{4})')

scan = True
idx = 0
while scan:
    match = re.fullmatch(date_patterns[idx], date_text)
    if match:
        month, day, year = match.groups()
        scan = False
    else:
        idx += 1

# Adjust for years
if int(year) <= 19:
    year = "20"+year
elif int(year) <= 99:
    year = "19"+year

# Adjust for months
months = {"January":"01", "February":"02", "March":"03", "April":"04", "May":"05", "June":"06","July":"07", "August":"08", "September":"09", "October":"10","November":"11", "December":"12"}
if len(month) > 2:
    month = months[month]

normalized_date = f'{day}/{month}/{year}'
print(normalized_date)
mid

1 Answer


Because none of your patterns ever match, the loop keeps incrementing idx until it runs past the end of the tuple, at which point date_patterns[idx] raises an IndexError.

The patterns never match because of a small error in your last regex pattern: it expects a space before the comma, which 'April 25, 1955' does not have. This:

r'([\w\D]+) (\d{2}) , (\d{4})'

should be:

r'([\w\D]+) (\d{2}), (\d{4})'
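
A quick way to see the difference is to run both patterns against the sample string from the question. This is only a minimal check; the variable names broken and fixed are made up for illustration:

```python
import re

date_text = 'April 25, 1955'

# Original pattern: requires a space before the comma, which the text lacks.
broken = r'([\w\D]+) (\d{2}) , (\d{4})'
# Corrected pattern: the comma directly follows the day.
fixed = r'([\w\D]+) (\d{2}), (\d{4})'

broken_match = re.fullmatch(broken, date_text)
fixed_match = re.fullmatch(fixed, date_text)

print(broken_match)          # None
print(fixed_match.groups())  # ('April', '25', '1955')
```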

After fixing this I'm able to get the correct output from the program:

25/04/1955
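
To avoid running off the end of the tuple when nothing matches, you could iterate over the patterns directly instead of using a manual index. The following is a rough sketch, not a drop-in replacement. The function name normalize_date is hypothetical. It assumes the first two patterns are meant to have no spaces, so that they match '042555' and '04/25/1955', and it swaps [\w\D]+ for the stricter [A-Za-z]+ for the month name.

```python
import re

# Patterns written without stray spaces (an assumption about the intended formats).
DATE_PATTERNS = (
    r'(\d{2})(\d{2})(\d{2})',         # 042555
    r'(\d{2})/(\d{2})/(\d{4})',       # 04/25/1955
    r'([A-Za-z]+) (\d{2}), (\d{4})',  # April 25, 1955
)

MONTHS = {
    "January": "01", "February": "02", "March": "03", "April": "04",
    "May": "05", "June": "06", "July": "07", "August": "08",
    "September": "09", "October": "10", "November": "11", "December": "12",
}


def normalize_date(date_text):
    """Return date_text as DD/MM/YYYY, or raise ValueError if no pattern matches."""
    for pattern in DATE_PATTERNS:
        match = re.fullmatch(pattern, date_text)
        if match:
            month, day, year = match.groups()
            break
    else:
        # The else branch runs only when the loop ends without a break.
        raise ValueError(f'unrecognized date format: {date_text!r}')

    # Expand two-digit years with the question's cutoff (00-19 -> 20xx, else 19xx).
    if len(year) == 2:
        year = ('20' if int(year) <= 19 else '19') + year

    # Month names are mapped to numbers; numeric months pass through unchanged.
    if len(month) > 2:
        month = MONTHS[month]

    return f'{day}/{month}/{year}'


for sample in ('042555', '04/25/1955', 'April 25, 1955'):
    print(normalize_date(sample))  # 25/04/1955 for each sample
```

With this shape, an input that matches none of the patterns fails with a descriptive ValueError rather than an IndexError from the tuple lookup.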
ruohola