I'm trying to write a regular expression to match content within an HTML document, but I want to exclude matches that fall inside a tag itself. Consider the following:
<p>Here is some sample text for my widgets</p>
<a href="http://mywidgets.nowhere">Click here to view my widgets</a>
I would like to match 'widgets' so that I can replace it with a different string, say 'green box', without replacing the occurrence inside the URL.
Matching 'widgets' on its own is easy, but I'm struggling to add an exclusion for 'widgets' when it appears between the opening '<' and closing '>' of a tag.
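To make the problem concrete, here is a minimal sketch of the naive replacement. Python is only an assumption for illustration (I haven't tied this to a particular regex engine), and the variable names `html` and `naive` are made up for the example. It shows the unwanted side effect: the href gets rewritten along with the visible text.

```python
import re

# The two sample lines from above, joined into one document.
html = (
    "<p>Here is some sample text for my widgets</p>\n"
    '<a href="http://mywidgets.nowhere">Click here to view my widgets</a>'
)

# Naive replacement: every occurrence is rewritten, including the one in href.
naive = re.sub(r"widgets", "green box", html)
print(naive)
```

The link now points at "http://mygreen box.nowhere", which is exactly what I want to avoid.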
My current workings: as a first step, I have started by matching 'widgets' only when it is contained within '<>' (I can then turn this into an exclusion later). However, the pattern below seems to match the whole document, even though I have tried to exclude the closing '>' so that 'widgets' must appear within a tag.
<.*[^>]widgets.*[^<]>+
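Here is a minimal reproduction of what I'm seeing, again assuming Python's `re` module purely for illustration (other engines may differ, and the names `html`, `pattern`, `per_line` and `whole` are hypothetical). Without a dot-matches-newline flag the pattern swallows each line in full, and with that flag enabled it swallows the entire document.

```python
import re

# The two sample lines from above, joined into one document.
html = (
    "<p>Here is some sample text for my widgets</p>\n"
    '<a href="http://mywidgets.nowhere">Click here to view my widgets</a>'
)

pattern = r"<.*[^>]widgets.*[^<]>+"

# Default flags: '.' stops at newlines, so each match is one whole line.
per_line = re.findall(pattern, html)
for m in per_line:
    print(repr(m))

# With DOTALL, '.' also crosses newlines and a single match spans everything.
whole = re.search(pattern, html, re.DOTALL).group(0)
print(whole == html)
```

In both cases the match starts at the first '<' and runs well past the first '>', rather than staying inside a single tag.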
It's probably down to lazy vs. greedy matching, but I can't quite work it out!