1

I'm trying to match against the following text:

something something bar<TEST> blah blah </TEST> foo

I want to extract the foo, and I'm using this regex:

(?<=</\w+>)(\s\w)

Why isn't that working? I get an empty list. I also get this error:

sre_constants.error: look-behind requires fixed-width pattern
praks5432
The error means exactly what it says: `\w+` can match a variable number of characters, so you can't use it in a lookbehind. – Charles Duffy Feb 14 '14 at 19:06

2 Answers

1

You can't really use a lookaround here. The ideal choice would be a positive lookbehind that checks for something like </\w+> just before the match. In C# you could write (?<=</\w+>)(\s*\w+), but Python's re module does not support variable-width lookbehinds. What's left is to include the </\w+> in the match itself and use a capture group:

</[^<>]*>\s*(\w+)

regex101 demo.

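Note that [^<>]* is usually safer than \w+ between < and >: it cannot run past the closing bracket, and it still matches tag names that contain non-word characters.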
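
For reference, here is a minimal sketch of how the capture-group pattern above could be used from Python. The variable names (text, pattern, word, words) are only illustrative, and the sample string is the one from the question.

```python
import re

text = "something something bar<TEST> blah blah </TEST> foo"

# Match the closing tag as part of the pattern, then capture the word after it.
pattern = re.compile(r"</[^<>]*>\s*(\w+)")

# search() returns a match object; group(1) is the captured word.
match = pattern.search(text)
word = match.group(1) if match else None
print(word)

# With exactly one capture group, findall() returns just the captured text.
words = pattern.findall(text)
print(words)
```

The closing tag is consumed by the match, but only the group is read back, so the result is the same as what the lookbehind was meant to produce.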

Jerry
0

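You want a lookbehind, but unfortunately not many regex libraries support quantifiers inside one. Your pattern also has a couple of errors: (\s\w) matches exactly one whitespace character followed by one word character, and the whitespace ends up inside the capture group. In an engine that does support variable-width lookbehinds, the corrected pattern would be: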

(?<=</\w+>)\s*(\w+)
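
As a rough check against Python's standard re module, the sketch below shows that the variable-width lookbehind is rejected at compile time, while a fixed-width lookbehind is accepted. Hard-coding the tag name TEST is an assumption made only for this illustration; it works only when the closing tag is known in advance.

```python
import re

text = "something something bar<TEST> blah blah </TEST> foo"

# Variable-width lookbehind: the standard-library re module refuses to compile it.
try:
    re.compile(r"(?<=</\w+>)\s*(\w+)")
    compiled = True
except re.error as exc:
    compiled = False
    print("rejected:", exc)

# Fixed-width lookbehind: accepted, because </TEST> always has the same length.
fixed = re.search(r"(?<=</TEST>)\s*(\w+)", text)
after_tag = fixed.group(1) if fixed else None
print(after_tag)
```

If the tag name is not known up front, matching the tag as part of the pattern and reading a capture group is the more general route.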
tenub