import re
pattern = "[0-9]+[st|nd|rd|th]?"
str2 = "1st 1 2 3 4 5th "
a = re.findall(pattern, str2)
print(a)

Expected Output

['1st', '1', '2', '3', '4', '5th']

Actual Output

['1s', '1', '2', '3', '4', '5t']
mkrieger1
Python Newbie

1 Answer

import re
pattern = r'[0-9]+(?:st|nd|rd|th)?'
str2 = "1st 1 2 3 4 5th "
a = re.findall(pattern, str2)
print(a)

Output: ['1st', '1', '2', '3', '4', '5th']

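Your mistake was using square brackets instead of parentheses. Square brackets define a character class, which matches exactly one character from the listed set or range (like 0-9). They do not mean "OR" between whole alternatives, which is what you wanted for the suffix. So [st|nd|rd|th] matches a single one of the characters s, t, n, d, r, h or |, which is why you got '1s' and '5t'. For further reading: What is the difference between square brackets and parentheses in a regex?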
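
To make the difference concrete, here is a small sketch comparing the two constructs on their own. The variable names (char_class, group, single_chars) are only illustrative and are not part of the original question.

```python
import re

char_class = re.compile(r'[st|nd|rd|th]')  # a set of single characters
group = re.compile(r'(?:st|nd|rd|th)')     # alternation between whole suffixes

# The class matches exactly one character from the set, including a literal '|'.
single_chars = [c for c in 'stndrh|x' if char_class.fullmatch(c)]
print(single_chars)  # ['s', 't', 'n', 'd', 'r', 'h', '|']

# The class cannot match a two-character suffix as a whole; the group can.
class_on_st = char_class.fullmatch('st')
group_on_st = group.fullmatch('st')
print(class_on_st)          # None
print(group_on_st.group())  # st
```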

If you only replace the square brackets with parentheses, i.e. use [0-9]+(st|nd|rd|th)?, the output will be as follows:

['st', '', '', '', '', 'th']

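This is still not what you wanted. When a pattern contains capturing groups, re.findall returns the text of the groups instead of the whole match. That is the reason for the second change to the pattern: adding ?: right after the opening parenthesis creates a non-capturing group. It still treats the alternatives as one unit, but it is not reported separately, so findall returns the full match.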
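
As a rough illustration of how groups affect re.findall, the sketch below runs the capturing and non-capturing variants on the string from the question. The finditer variant at the end is an alternative that is not part of the original answer, and the variable names are only illustrative.

```python
import re

text = "1st 1 2 3 4 5th "

# One capturing group: findall returns only the group's text for each match.
capturing = re.findall(r'[0-9]+(st|nd|rd|th)?', text)
print(capturing)      # ['st', '', '', '', '', 'th']

# Non-capturing group: findall returns the whole match.
non_capturing = re.findall(r'[0-9]+(?:st|nd|rd|th)?', text)
print(non_capturing)  # ['1st', '1', '2', '3', '4', '5th']

# Alternative: keep the capturing group and read the whole match via finditer.
whole = [m.group(0) for m in re.finditer(r'[0-9]+(st|nd|rd|th)?', text)]
print(whole)          # ['1st', '1', '2', '3', '4', '5th']
```

The finditer form can be handy when you want both the full match and the suffix, since m.group(1) still gives the suffix (or None when there is none).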

Roim