String replace problem with the less-than symbol

Question

I would like to replace the string "<75%" with "NONE". I have written this function, but it doesn't match :(

import re

replacements = {'toot': 'titi-',
                '<75%': 'NONE'}

def replace(match):

    return replacements[match.group(0)]

def clean75Case(text_page):
    return re.sub('|'.join(r'\b%s\b' % re.escape(s) for s in replacements),
           replace, text_page)

if __name__ == '__main__':
    print(clean75Case("toot iiii <75%"))

`\b` matches the boundary between a word character and a non-word character. `<` is a non-word character, as is the space before it in your string, so `\b` doesn't match there. - jasonharper, Jun 05 '19 at 21:34

score 1 · Accepted Answer · answered Jun 05 '19 at 21:41

As mentioned in the comments, the issue is that \b only matches at the boundary between a word character and a non-word character. From the docs:

\b

Matches the empty string, but only at the beginning or end of a word. A word is defined as a sequence of word characters. Note that formally, \b is defined as the boundary between a \w and a \W character (or vice versa), or between \w and the beginning/end of the string.

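In the string you gave, the space character (' ') and the less-than character (<) are both non-word characters, so \b does not match at the position between them. The same applies at the end of the string: % is a non-word character, and \b only matches at the end of the string when the last character is a word character.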
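
To see this behaviour in isolation, here is a small check against the string from the question. The variable names found_toot and found_75 are only illustrative and are not part of the original code:

```python
import re

text = "toot iiii <75%"

# 'toot' starts and ends with word characters, so \b matches on both sides.
found_toot = re.search(r'\btoot\b', text) is not None

# '<' is preceded by a space and '%' is followed by the end of the string,
# so neither \b can match and the search fails.
found_75 = re.search(r'\b<75%\b', text) is not None

print(found_toot, found_75)  # True False
```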

For an alternate way to solve this problem, consider using split() to split the string into words and comparing each word against the replacement patterns like so:

replacements = {'toot': 'titi-',
                '<75%': 'NONE'}

def clean75Case(text_page):
    words = text_page.split()
    return ' '.join(replacements.get(w, w) for w in words)

if __name__ == '__main__':
    print(clean75Case("toot iiii <75%"))
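
If you would rather keep the re.sub approach, one possible sketch is to swap \b for whitespace lookarounds. This assumes your keys are always delimited by whitespace or by the start/end of the string, which may not hold for your real data. The module-level pattern name is introduced here for illustration:

```python
import re

replacements = {'toot': 'titi-',
                '<75%': 'NONE'}

# (?<!\S) means "not preceded by a non-whitespace character" and
# (?!\S) means "not followed by a non-whitespace character", so each key
# must be surrounded by whitespace or by the start/end of the string.
pattern = re.compile(
    '|'.join(r'(?<!\S)%s(?!\S)' % re.escape(s) for s in replacements))

def clean75Case(text_page):
    # Look up each matched key in the replacements dict.
    return pattern.sub(lambda m: replacements[m.group(0)], text_page)

if __name__ == '__main__':
    print(clean75Case("toot iiii <75%"))  # titi- iiii NONE
```

Unlike the split()/join() version, this leaves the original whitespace (tabs, newlines, repeated spaces) untouched, since only the matched keys are rewritten.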
