I want to use a regex in Python to match the smallest substring that contains a specific string (a lazy match), but I haven't figured out how to do so.

For instance, in the following example, how do I return just '<tag1>some text<tag2>some other text</tag2></tag1>' and not the whole string?

#!/bin/python3
import re
pattern = r'(<([a-zA-Z0-9]+?)\b[^>]*>.*?<tag2>some other text</tag2>.*?</\2>)'
text = '<root> <tag1>some text<tag2>some other text</tag2></tag1> </root>'
# group(1) is the captured element; groups(0) would return a tuple of all groups
print(re.search(pattern, text, re.DOTALL).group(1))

The code above prints <root> <tag1>some text<tag2>some other text</tag2></tag1> </root>, but I want it to print <tag1>some text<tag2>some other text</tag2></tag1>. All of this assumes that any tag can appear in place of tag1.

Gionikva

1 Answer

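Turns out the solution is quite simple: wrap the original pattern in a greedy .* on each side and read group 1. The leading .* consumes as much of the text as it can before the group starts, so the captured element begins at the last position where the rest of the pattern can still match. Here's the regex that works: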

.*(<([a-zA-Z0-9]+?)\b[^>]*>.*?<tag2>some other text</tag2>.*?</\2>).*
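To make the difference concrete, here is a minimal sketch that runs the question's pattern and the wrapped pattern against the same sample text. The helper name enclosing_element and the variable core are hypothetical, introduced only for this illustration. The re.DOTALL flag is carried over from the question.

```python
import re

text = '<root> <tag1>some text<tag2>some other text</tag2></tag1> </root>'

# The pattern from the question: group 1 is the element, group 2 its tag name.
core = r'(<([a-zA-Z0-9]+?)\b[^>]*>.*?<tag2>some other text</tag2>.*?</\2>)'


def enclosing_element(pattern, text):
    """Return group 1 of the first match, or None if nothing matches.

    Hypothetical helper used only for this illustration.
    """
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1) if match else None


# Without the wrapper, re.search takes the leftmost match, which starts at <root>.
outer = enclosing_element(core, text)

# With a greedy .* on each side, the group starts as far right as possible.
# The added .* parts contain no groups, so the group numbering is unchanged.
inner = enclosing_element('.*' + core + '.*', text)

print(outer)
print(inner)
```

Note that the full match (group 0) of the wrapped pattern still covers the whole string, so it is group 1 that has to be read. If the target string occurred in several places, this approach would pick the candidate that starts last, which may or may not be the one wanted.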

Gionikva