The alternation operator (|) has the lowest precedence of all regex operators. That is, it tells the regex engine to match either everything to the left of the vertical bar or everything to the right of the bar.
So the regular expression r"\d+\s|-\b" means (one or more digits followed by a whitespace character) OR (a dash followed by a word boundary).
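To see this precedence in action, here is a minimal sketch that runs the ungrouped pattern against a sample string. The sample text and the variable name matches are chosen for illustration only.

```python
import re

txt = "123 4 56-7 maine x1s56"

# Without grouping, the pattern splits into two complete alternatives:
#   \d+\s   (digits followed by whitespace)
#   -\b     (a dash followed by a word boundary)
matches = re.findall(r"\d+\s|-\b", txt)
print(matches)  # ['123 ', '4 ', '-', '7 ']
```

Note that the dash is matched on its own: the digits "56" before it are not part of the match, because \d+ belongs only to the left-hand alternative.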
If you want to limit the reach of the alternation, you need to use parentheses for grouping. Or, since each alternative here matches a single character (a whitespace character or a dash), you can use a character class instead, at the cost of dropping the \b check after the dash.
import re
txt = "123 4 56-7 maine x1s56"
x = re.findall(r"\d+[\s-]", txt)
print(x)
Output:
['123 ', '4 ', '56-', '7 ']
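For comparison, the grouping approach mentioned earlier might look like the sketch below, which assumes the same sample string. A non-capturing group (?:...) is used so that re.findall returns the whole match. With a plain capturing group, findall would return only the group's contents. The names grouped and captured are illustrative.

```python
import re

txt = "123 4 56-7 maine x1s56"

# Non-capturing group: the alternation now applies only to the
# character after the digits, and findall returns the full match.
grouped = re.findall(r"\d+(?:\s|-\b)", txt)
print(grouped)   # ['123 ', '4 ', '56-', '7 ']

# Capturing group: findall returns only what the group matched.
captured = re.findall(r"\d+(\s|-\b)", txt)
print(captured)  # [' ', ' ', '-', ' ']
```

Unlike the character class, the grouped version keeps the \b check, so a dash that is not followed by a word character would not match. For this particular string, both versions give the same result.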