
In Python, I am trying to do

text = re.sub(r'\b%s\b' % word, "replace_text", text)

to replace a word with some text. I am using re rather than text.replace so that, thanks to \b, the replacement happens only when the whole word matches. The problem arises when word contains characters such as +, ( or [, for example +91xxxxxxxx.

The regex engine treats the + as the "one or more" quantifier and fails with sre_constants.error: nothing to repeat. A ( in word breaks the pattern as well.
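A minimal reproduction of the failure, assuming Python 3 (where the exception is exposed as re.error). The try_compile helper is a hypothetical name used only for this illustration:

```python
import re

word = "+91xxxxxxxx"  # sample value from the question


def try_compile(pattern):
    """Return the error message if the pattern is invalid, else None."""
    try:
        re.compile(pattern)
        return None
    except re.error as exc:
        return str(exc)


# The unescaped + directly follows \b, so there is nothing for it to repeat.
error_message = try_compile(r'\b%s\b' % word)
print(error_message)
```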

I could not find a fix for this after searching around a bit. Is there a way?

Mazdak
pratpor

1 Answer


Just use re.escape(string):

word = re.escape(word)
text = re.sub(r'\b{}\b'.format(word), "replace_text", text)

It escapes every character that has a special meaning in regex patterns (e.g. \+ instead of +), so the word is matched literally.
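One caveat worth checking: \b only matches between a word character and a non-word character, so when the word itself starts or ends with a symbol such as +, \b may not match where you expect. Below is a minimal sketch, assuming lookarounds are an acceptable substitute for \b. The name replace_whole_word is a hypothetical helper, not part of re:

```python
import re


def replace_whole_word(text, word, replacement):
    """Replace word only where it is not glued to other word characters.

    Hypothetical helper for illustration; uses lookarounds instead of \\b.
    """
    # re.escape makes characters like + and ( match literally.
    # (?<!\w) and (?!\w) require that no word character sits directly
    # before or after the match, even when word starts with a symbol.
    pattern = r'(?<!\w)' + re.escape(word) + r'(?!\w)'
    # A function replacement keeps backslashes in `replacement` literal.
    return re.sub(pattern, lambda match: replacement, text)


text = "call +91xxxxxxxx now"
word = "+91xxxxxxxx"

# With \b the pattern compiles, but there is no boundary between
# the space and the +, so nothing is replaced here.
with_boundary = re.sub(r'\b{}\b'.format(re.escape(word)), "replace_text", text)

# The lookaround version replaces the number as intended.
with_lookaround = replace_whole_word(text, word, "replace_text")

print(with_boundary)
print(with_lookaround)
```

For words made only of letters, digits and underscores, both variants should behave the same way.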


Just a sidenote: formatting with the percent (%) operator is not deprecated, but the .format() method of strings (or an f-string) is generally preferred in newer code.
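For comparison, a short sketch of three common ways to build the same pattern string. The f-string form assumes Python 3.6 or later, and the sample word is taken from the question:

```python
import re

word = re.escape("+91xxxxxxxx")

pattern_percent = r'\b%s\b' % word         # printf-style formatting
pattern_format = r'\b{}\b'.format(word)    # str.format()
pattern_fstring = rf'\b{word}\b'           # raw f-string

# All three produce the identical pattern text.
print(pattern_percent == pattern_format == pattern_fstring)
```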

linusg