Foreword
I recommend using a parsing engine for this; however, it sounds like you have creative control over the complexity of your HTML. As long as you do not have complex nesting or other odd edge cases, this should work.
Description
(<tag2>.*?</tag2>)|<tag>(?:(?!<tag\s?>).)*

This regular expression will do the following:
- Populate capture group 1 with <tag2>...</tag2>, provided this tag is not already enclosed inside <tag>...</tag>, as in <tag>.<tag2>..</tag2>.</tag>
- Also match every <tag> together with the text that follows it, up to the next <tag> or the end of the line. Where this match occurs, capture group 1 will have no value.
Example
Live Demo
https://regex101.com/r/uQ7xR5/1
Sample text
This <tag2>is a WORD</tag2> --- Match
<TAG><TAG2>xxx</TAG2></TAG> --- Not a match
<TAG>xxxxxxx<TAG2>yyyy</TAG2>xxxxxxx</TAG> --- Not a match
Sample Matches
Note how capture group 1 is only populated by the <tag2>...</tag2> that was not enclosed inside <tag>..</tag>
[0][0] = <tag2>is a WORD</tag2>
[0][1] = <tag2>is a WORD</tag2>
[1][0] = <TAG><TAG2>xxx</TAG2></TAG> --- Not a match
[1][1] =
[2][0] = <TAG>xxxxxxx<TAG2>yyyy</TAG2>xxxxxxx</TAG> --- Not a match
[2][1] =
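To use the result, keep only the matches where capture group 1 has a value. Here is a minimal Python sketch of that filtering step, run against the sample text above. The function name `unenclosed_tag2` is hypothetical, and the case-insensitive flag is an assumption: the sample text uses upper-case <TAG>, so the lower-case expression needs it to reproduce the matches shown.

```python
import re

# Expression from this answer. re.IGNORECASE is an assumption, needed so the
# lower-case pattern also matches the upper-case <TAG> lines in the sample.
PATTERN = re.compile(r"(<tag2>.*?</tag2>)|<tag>(?:(?!<tag\s?>).)*", re.IGNORECASE)

SAMPLE = """This <tag2>is a WORD</tag2> --- Match
<TAG><TAG2>xxx</TAG2></TAG> --- Not a match
<TAG>xxxxxxx<TAG2>yyyy</TAG2>xxxxxxx</TAG> --- Not a match"""


def unenclosed_tag2(text):
    """Return the <tag2> elements that were captured in group 1.

    Matches that start at <tag> leave group 1 unset (None in Python),
    so they are skipped here.
    """
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1) is not None]


# Every match as (whole match, group 1), mirroring the [n][0] / [n][1] listing.
all_matches = [(m.group(0), m.group(1)) for m in PATTERN.finditer(SAMPLE)]

print(unenclosed_tag2(SAMPLE))  # ['<tag2>is a WORD</tag2>']
```

The expression still produces three matches on this input. The filtering on group 1 is what discards the two that begin at <TAG>.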
Explanation
NODE                     EXPLANATION
----------------------------------------------------------------------
(                        group and capture to \1:
----------------------------------------------------------------------
  <tag2>                 '<tag2>'
----------------------------------------------------------------------
  .*?                    any character except \n (0 or more times
                         (matching the least amount possible))
----------------------------------------------------------------------
  </tag2>                '</tag2>'
----------------------------------------------------------------------
)                        end of \1
----------------------------------------------------------------------
|                        OR
----------------------------------------------------------------------
<tag>                    '<tag>'
----------------------------------------------------------------------
(?:                      group, but do not capture (0 or more times
                         (matching the most amount possible)):
----------------------------------------------------------------------
  (?!                    look ahead to see if there is not:
----------------------------------------------------------------------
    <tag                 '<tag'
----------------------------------------------------------------------
    \s?                  whitespace (\n, \r, \t, \f, and " ")
                         (optional (matching the most amount
                         possible))
----------------------------------------------------------------------
    >                    '>'
----------------------------------------------------------------------
  )                      end of look-ahead
----------------------------------------------------------------------
  .                      any character except \n
----------------------------------------------------------------------
)*                       end of grouping
----------------------------------------------------------------------
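The same breakdown can be kept next to the expression itself. The sketch below rewrites it with Python's `re.VERBOSE` flag, so that whitespace in the pattern is ignored and each node carries its own comment. The names `VERBOSE_PATTERN` and `matches_with_group1` are hypothetical, and `re.IGNORECASE` is again an assumption made so that upper-case tags match.

```python
import re

# Same expression as above, spread out with re.VERBOSE so each node is commented.
VERBOSE_PATTERN = re.compile(
    r"""
    ( <tag2> .*? </tag2> )    # group 1: a whole <tag2> element (lazy match)
    |                         # OR
    <tag>                     # a literal opening <tag>
    (?:                       # then, zero or more times:
        (?! <tag \s? > )      #   look ahead: no new <tag> may start here
        .                     #   any character except a newline
    )*
    """,
    re.VERBOSE | re.IGNORECASE,
)


def matches_with_group1(text):
    """Return (whole match, group 1) for every match in text."""
    return [(m.group(0), m.group(1)) for m in VERBOSE_PATTERN.finditer(text)]


print(matches_with_group1("a <tag2>x</tag2> <tag><tag2>y</tag2></tag>"))
```

Only the first match in that call has a value in group 1. The second match starts at <tag> and swallows the enclosed <tag2>, which is why that element never reaches the capture group.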