
Say I have this string:

text = "bla bl4a 12 bla 23 bla"

I want to replace every number that is not part of a word with the token <num>.

I know I can replace all numbers of a string like this:

text = re.sub(r"(\d+)", "<num>", text)

Unfortunately, this also replaces bl4a with bl<num>a. This should be the output:

"bla bl4a <num> bla <num> bla"
sagi

1 Answer


Match a word boundary (\b) on either side of the number, so digits embedded in a word such as bl4a are skipped:

re.sub(r"\b\d+\b", "<num>", text)
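For completeness, here is a minimal runnable sketch of the same call applied to the string from the question. The variable names `result` and `edge_cases` are only illustrative. The second call is an assumption about inputs the question does not mention: since `\b` treats letters, digits and underscores as word characters, a decimal like 4.5 would be split into two tokens, which may or may not be what you want.

```python
import re

text = "bla bl4a 12 bla 23 bla"

# \b matches only where a word character (letter, digit, underscore)
# meets a non-word character or the start/end of the string, so the
# "4" inside "bl4a" has no boundary around it and is left alone.
result = re.sub(r"\b\d+\b", "<num>", text)
print(result)  # bla bl4a <num> bla <num> bla

# Illustrative edge cases (not from the question): digits attached to
# letters or underscores are kept, but "4.5" is treated as two numbers.
edge_cases = re.sub(r"\b\d+\b", "<num>", "x1 2 y_3 4.5")
print(edge_cases)  # x1 <num> y_3 <num>.<num>
```

If decimals should count as a single number, the pattern would need to be extended; the one above only handles runs of digits.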
Iain Shelvington