As mentioned in the comments, the issue is that \b only matches at the boundary between a word character and a non-word character. From the docs:
\b
Matches the empty string, but only at the beginning or end of a word.
A word is defined as a sequence of word characters. Note that
formally, \b is defined as the boundary between a \w and a \W
character (or vice versa), or between \w and the beginning/end of the
string.
In the string you gave, the space character (' ') and the less-than character ('<') are both non-word characters, so \b does not match at the position between them.
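A quick check with the re module illustrates this. The sample string is "toot iiii <75%"; the patterns here are only illustrative and are not necessarily the ones from the question.

```python
import re

text = "toot iiii <75%"

# ' ' and '<' are both non-word characters, so there is no \b between them
# and a pattern that demands a boundary before '<' finds nothing.
before_lt = re.search(r'\b<75%', text)
print(before_lt)  # None

# 'toot' is surrounded by the start of the string and a space,
# so \b matches on both sides of it.
around_toot = re.search(r'\btoot\b', text)
print(around_toot.group(0))  # toot

# Every index where \b matches; 10 (between ' ' and '<') is absent.
boundaries = [m.start() for m in re.finditer(r'\b', text)]
print(boundaries)  # [0, 4, 5, 9, 11, 13]
```

The same reasoning applies at the end of the string: '%' is a non-word character followed by the end of the string, so there is no boundary at index 14 either.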
For an alternative way to solve this problem, consider using split() to split the string into words and comparing each word against the replacement patterns, like so:
replacements = {'toot': 'titi-',
                '<75%': 'NONE'}


def clean75Case(text_page):
    # Split on whitespace, swap any word found in the mapping, then rejoin.
    words = text_page.split()
    return ' '.join(replacements.get(w, w) for w in words)


if __name__ == '__main__':
    print(clean75Case("toot iiii <75%"))  # titi- iiii NONE
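If you would rather stay with a regex, one possible sketch is to replace \b with whitespace lookarounds, so a token counts as a "word" whenever it is delimited by whitespace or the string ends. The names pattern and clean_with_regex are hypothetical and used only for illustration; the replacements mapping is assumed to be the same as the one used with split().

```python
import re

# Assumed to be the same mapping as in the split()-based version.
replacements = {'toot': 'titi-', '<75%': 'NONE'}

# (?<!\S) = not preceded by a non-space character,
# (?!\S)  = not followed by a non-space character.
# re.escape keeps characters such as '%' or '<' literal.
pattern = re.compile(
    r'(?<!\S)(?:' + '|'.join(re.escape(k) for k in replacements) + r')(?!\S)'
)


def clean_with_regex(text_page):
    # Look up each matched token in the mapping (hypothetical helper).
    return pattern.sub(lambda m: replacements[m.group(0)], text_page)


print(clean_with_regex("toot  iiii <75%"))  # titi-  iiii NONE
```

Unlike the split() and join() approach, which collapses runs of whitespace into single spaces, this version leaves the original spacing untouched.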