I need a regular expression to match XML start nodes like the following:
normal cases
<ref>
and
<ref name="gbtribune.files.wordpress.com">
empty attribute
<ref name="gbtribune.files.wordpress.com" name2>
or
<ref name="gbtribune.files.wordpress.com" name2= >
missing quotes
<ref name=gbtribune.files.wordpress.com>
or
<ref name="gbtribune.files.wordpress.com>
or
<ref name=gbtribune.files.wordpress.com">
but I do not want it to match self-closing nodes:
<ref/>
or
<ref name=gbtribune.files.wordpress.com" />
I also want the first group to capture the tag name, and the second group to capture all key-value attribute pairs.
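One thing worth checking about the second group: in Python's `re` (and, as far as I know, most other flavors), a capturing group that is itself repeated keeps only its last iteration, so a `(...)*` group would hold just the final attribute rather than all pairs. A minimal sketch with a deliberately simplified, hypothetical attribute pattern (`\w+=\w+`, not the regex from this post) shows the difference between repeating the group and wrapping the repetition in an outer group:

```python
import re

# Hypothetical, simplified input and patterns, for illustration only.
tag = '<ref a=1 b=2>'

# The capturing group itself is repeated: only the last iteration is kept.
repeated = re.search(r'<(\w+)\s*(\w+=\w+\s*)*>', tag)

# The repetition sits inside one capturing group: all pairs are kept.
wrapped = re.search(r'<(\w+)\s*((?:\w+=\w+\s*)*)>', tag)

print(repeated.groups())  # ('ref', 'b=2')
print(wrapped.groups())   # ('ref', 'a=1 b=2')
```

If the same holds for the flavor selected on regex101, the attribute part would need the outer-group form to capture every pair.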
My current regex is:
<([a-zA-Z]+)\s*([^\/<>"=\s]+=?(?:(?:"(?:[^<>"]*)"?)|(?:[^=<>"\s]*"?))?\s*)*>
You can try it here: https://regex101.com/r/TVwye1/3
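To check the pattern outside regex101, here is a small reproduction sketch. It assumes Python's `re` module, which may differ from the flavor used in the link, and the helper name `report` is made up for illustration. The sample tags are the ones listed above.

```python
import re

# The pattern from this post, copied verbatim.
PATTERN = re.compile(
    r'<([a-zA-Z]+)\s*([^\/<>"=\s]+=?(?:(?:"(?:[^<>"]*)"?)|(?:[^=<>"\s]*"?))?\s*)*>'
)

# Start tags that are supposed to match.
SHOULD_MATCH = [
    '<ref>',
    '<ref name="gbtribune.files.wordpress.com">',
    '<ref name="gbtribune.files.wordpress.com" name2>',
    '<ref name="gbtribune.files.wordpress.com" name2= >',
    '<ref name=gbtribune.files.wordpress.com>',
    '<ref name="gbtribune.files.wordpress.com>',
    '<ref name=gbtribune.files.wordpress.com">',
]

# Self-closing tags: the goal is for these NOT to match.
SHOULD_NOT_MATCH = [
    '<ref/>',
    '<ref name=gbtribune.files.wordpress.com" />',
]


def report(pattern, tags):
    """Map each tag to its captured groups, or None when there is no match."""
    results = {}
    for tag in tags:
        m = pattern.search(tag)
        results[tag] = m.groups() if m else None
    return results


if __name__ == "__main__":
    for tag, groups in report(PATTERN, SHOULD_MATCH + SHOULD_NOT_MATCH).items():
        print(f"{tag} -> {groups}")
```

In Python, `<ref/>` is matched with group 1 equal to `re`: the engine backtracks out of the tag name and consumes `f/` as an attribute, because the unquoted-value class `[^=<>"\s]*` does not exclude `/`.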
It works for cases 1, 2, and 3 (normal, empty attribute, missing quotes), but it also matches the self-closing nodes. How can I exclude self-closing nodes from the matches?
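One direction that might work is a negative lookahead placed right after the opening `<`, rejecting the tag when the next `>` is preceded by `/` (optionally with whitespace in between). The sketch below is an assumption rather than a verified solution: it uses Python's `re`, leaves the attribute part of the pattern untouched, and has only been checked against the sample tags from this post. The names `ATTRS` and `CANDIDATE` are made up for illustration.

```python
import re

# Attribute part of the pattern from this post, unchanged.
ATTRS = r'([^\/<>"=\s]+=?(?:(?:"(?:[^<>"]*)"?)|(?:[^=<>"\s]*"?))?\s*)*'

# Candidate fix (assumption): the lookahead (?![^<>]*\/\s*>) fails the match
# when a "/" followed by optional whitespace sits directly before the next ">".
CANDIDATE = re.compile(r'<(?![^<>]*\/\s*>)([a-zA-Z]+)\s*' + ATTRS + r'>')

start_tags = [
    '<ref>',
    '<ref name="gbtribune.files.wordpress.com">',
    '<ref name="gbtribune.files.wordpress.com" name2>',
    '<ref name="gbtribune.files.wordpress.com" name2= >',
    '<ref name=gbtribune.files.wordpress.com>',
    '<ref name="gbtribune.files.wordpress.com>',
    '<ref name=gbtribune.files.wordpress.com">',
]
self_closing = [
    '<ref/>',
    '<ref name=gbtribune.files.wordpress.com" />',
]

# True where the candidate pattern finds a match.
matched_start = [CANDIDATE.search(t) is not None for t in start_tags]
matched_self_closing = [CANDIDATE.search(t) is not None for t in self_closing]

if __name__ == "__main__":
    print("start tags matched:  ", matched_start)
    print("self-closing matched:", matched_self_closing)
```

A known limitation of this sketch: an unquoted attribute value that ends in `/` immediately before `>` would be treated as self-closing.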