I can't parse it because it is not an HTML file. It is plain text that may sometimes contain valid opening HTML tags, like:
<a href="..." >
but also:
<anytag par1="val1" par2='val2' par3=val3 />
and everything would be nice and easy if not for this possibility:
<anytag param='square < brackets > in value' par2="and < another < such case" >
How can I match this with a regex?
(This is not valid HTML. The tags may appear in an ordinary text file; they are loose, that is, not contained in any proper structure, and are not always closed. The tag headers themselves are of course always closed with >, as in the examples. I'm not interested in what is inside the tag, only in the opening header.)
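
To make the requirement concrete, here is a minimal Python sketch of the kind of pattern that seems to fit. It assumes that every attribute value is either double-quoted, single-quoted, or an unquoted run without whitespace, quotes, or angle brackets, and that a quoted value never contains its own quote character. The names TAG_HEADER and find_tag_headers are made up for illustration; this is not a full HTML tokenizer.

```python
import re

# Hypothetical pattern: matches only the opening header of a tag.
TAG_HEADER = re.compile(
    r"""
    <[A-Za-z][A-Za-z0-9:-]*          # '<' followed by a tag name
    (?:
        \s+[A-Za-z_:][A-Za-z0-9_:.-]*    # attribute name
        (?:
            \s*=\s*
            (?:
                "[^"]*"              # double-quoted value, may contain < and >
              | '[^']*'              # single-quoted value, may contain < and >
              | [^\s"'<>=]+          # unquoted value
            )
        )?                           # the value is optional
    )*
    \s*/?>                           # optional '/', then the closing '>'
    """,
    re.VERBOSE,
)


def find_tag_headers(text):
    """Return every opening tag header found in loose text."""
    return [m.group(0) for m in TAG_HEADER.finditer(text)]
```

Because quoted values are consumed as whole units, a < or > inside quotes cannot end the match early, and a stray < that is not followed by a tag name is simply skipped.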