I'm using a regex to parse some HTML I have the following regex which matches all tags except img and a.
\<(?!img|a)[^\>]+\>
This works well but I also want it to match the closing tags, I've tried the following but it doesn't work:
\</?(?!img|a)[^\>]+\>
What would be the best way to do this?
(Also before there is a plethora of comments saying not to use regexes to parse HTML I'd just like to say that this HTML is generated by a tool and is very uniform.)
EDIT:
<p>So in this</p>
<p>HTML <strong>with nested tags</strong></p>
<p>It should remove <i>everything</i> except <a href="#">This link</a>
and this <img src="#" alt="image" /> but it also needs to kep the textual content</p>