-1

I am trying to get familiar with regular expressions in Python and am working with a string like this:

import re

string = "<<NAME>><<TIME>> (<<NAME>>) good <<NAME>><<NAME>> luck<<NAME>>START <<NAME>>"
# I tried the following:
output = re.sub(r'\b<<NAME>>\b', "1234", string)

However, the output prints out the exact same thing. I thought \b would isolate the word I am looking for so that it could be substituted. How can I fix this so that each <<NAME>> is replaced by "1234"?

2 Answers

1

The documentation has a definition for \b:

\b is defined as the boundary between a \w and a \W character (or vice versa), or between \w and the beginning/end of the string

Since, for example, '<', '>' and ' ' are all \W characters, there is no boundary between <<TIME>> and the space that follows it. Therefore, \b does not match there.
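As a minimal sketch of this rule, the snippet below counts the matches with and without \b on the string from the question. The variable names `with_boundaries` and `without_boundaries` are only illustrative. With the string exactly as posted, the one occurrence that sits between "luck" and "START" does have a word character on each side, so that is the only place where the \b version can match.

```python
import re

string = "<<NAME>><<TIME>> (<<NAME>>) good <<NAME>><<NAME>> luck<<NAME>>START <<NAME>>"

# \b only matches where a \w character meets a \W character
# (or where a \w character meets the start/end of the string).
with_boundaries = re.findall(r'\b<<NAME>>\b', string)

# Without \b, every literal occurrence is found.
without_boundaries = re.findall(r'<<NAME>>', string)

print(len(with_boundaries))     # occurrences with a word character on both sides
print(len(without_boundaries))  # all occurrences
```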

For your trivial example, try:

string.replace('<<NAME>>', '1234')

If you actually do need a regular expression, just drop the \b:

re.sub('<<NAME>>', '1234', string)
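For comparison, here is a small sketch that runs both suggestions on the string from the question and checks that they agree. Wrapping the pattern in `re.escape` is an addition of mine rather than part of the answer above; it is not required for '<<NAME>>', but it keeps the pattern literal if the placeholder ever contains regex metacharacters.

```python
import re

string = "<<NAME>><<TIME>> (<<NAME>>) good <<NAME>><<NAME>> luck<<NAME>>START <<NAME>>"

# Plain string replacement: no regex involved.
replaced = string.replace('<<NAME>>', '1234')

# Regex replacement without \b; re.escape treats the placeholder literally.
substituted = re.sub(re.escape('<<NAME>>'), '1234', string)

print(replaced)
print(replaced == substituted)
```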
Robᵩ
  • `string.replace` makes a lot more sense in this case. I am just trying to wrap my head around the inner workings of regular expressions. I actually attempted to use `'.*<>.*'` and other similar approaches. How does `re.sub()` know where to find a match in the given `string`? Does it use `<>` as a pattern? Or does it iterate over the given `string` in some way to find a match? – uservictor12350 Oct 28 '17 at 02:46
0

Try this code

import re
new_content = old_content
for word in word_list:
    new_content = re.sub(r"\b" + re.escape(word) + r"\b", " *** ", new_content, flags=re.I)

Hope this will help you.

Note: " *** " is the string that replaces each matched word. If you just want to remove the word, use "" instead.

This code makes sure that only the whole word is replaced, not a substring. For example, if you want to remove the word "is", a plain replacement would also remove the "is" inside "this", which is not desirable.
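To make the whole-word point concrete, here is a short sketch with made-up input. The sample sentence, the one-element `word_list`, and the use of `re.escape` are my own assumptions for illustration. It compares the \b-based substitution with a plain `str.replace`.

```python
import re

old_content = "This is a test; it is what it is."  # hypothetical sample text
word_list = ["is"]                                  # hypothetical words to mask

new_content = old_content
for word in word_list:
    # \b on both sides restricts the match to the standalone word.
    pattern = r"\b" + re.escape(word) + r"\b"
    new_content = re.sub(pattern, "***", new_content, flags=re.I)

# A plain replace also hits the "is" inside "This".
naive_content = old_content.replace("is", "***")

print(new_content)
print(naive_content)
```

This works here because "is" consists of word characters; for tokens that begin or end with non-word characters, such as '<<NAME>>', \b behaves differently.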

Harshad Kavathiya