How can I capture a word wrapped in angle brackets using word boundaries, like this: \b<WORD-IN-ANGLE-BRACKETS>\b? There appears to be some kind of bug: if an angle bracket is touching a word boundary, the pattern doesn't match.

Look at this example:

import re

re.findall(r"\b(<\w+>)\b", "<A> B <C>")            # 1. output: []
re.findall(r"(<\w+>)", "<D> E <F>")                # 2. output: ['<D>', '<F>']
re.findall(r"\b(\w+)\b", "G H I")                  # 3. output: ['G', 'H', 'I']
re.findall(r"\b(z<\w+>z)\b", "z<D>z z<E>z z<F>z")  # 4. output: ['z<D>z', 'z<E>z', 'z<F>z']

As you can see in #4, if I put a word character between the angle bracket and the word boundary, it works, so the failure only happens when the bracket and the boundary are touching.

What is going on here? Why does #4 work when #1 doesn't? How can I make #1 work?
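
For reference, here is a minimal sketch of what may be happening and one possible workaround. It assumes the cause is that \b only matches between a \w character and a non-\w character (or the edge of the string), so there is no boundary between a space and "<". The lookaround-based pattern and the extra test string are my own illustration, not an established fix:

```python
import re

text = "<A> B <C>"

# "<" and ">" are non-word characters, so \b cannot match between
# a space (or the string edge) and a bracket: both sides are non-\w.
print(re.findall(r"\b(<\w+>)\b", text))   # []

# Possible workaround (an assumption about the intent): instead of \b,
# require that no word character touches the brackets on either side.
pattern = re.compile(r"(?<!\w)(<\w+>)(?!\w)")

print(pattern.findall(text))              # ['<A>', '<C>']
print(pattern.findall("x<A> <B>y <C>"))   # ['<C>']
```

If this reading is right, #4 works only because the "z" on each side gives \b a word character to sit next to.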

anthonybell