I would like to be able to get "Target" out of this block of HTML when it appears in a page:
<h3>
<a href="http://link"> Target
</a> </h3>
I can count on the whitespace always being there. What I can't count on is "Target" always being wrapped in an anchor tag. Sometimes it looks like this:
<h3>
Target
</h3>
I can match the first version and extract "Target" pretty easily with this regex:
/<h3>\s+<a href=.*>\s+(.*)\s+<\/a>\s+<\/h3>/
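For reference, here is a minimal reproduction of that behaviour. Python is only an assumption for illustration (the regex itself is not tied to it), and the names `PATTERN`, `WITH_ANCHOR`, `WITHOUT_ANCHOR`, and `extract` are hypothetical helpers, not part of any real code I have:

```python
import re

# The regex from above, unchanged except that "/" needs no escaping
# inside a Python raw string.
PATTERN = re.compile(r"<h3>\s+<a href=.*>\s+(.*)\s+</a>\s+</h3>")

# The two forms of the block, with the line breaks and spacing as shown.
WITH_ANCHOR = '<h3>\n<a href="http://link"> Target\n</a> </h3>'
WITHOUT_ANCHOR = "<h3>\nTarget\n</h3>"


def extract(html):
    """Return the captured text, or None when the pattern does not match."""
    match = PATTERN.search(html)
    return match.group(1) if match else None


print(extract(WITH_ANCHOR))     # Target
print(extract(WITHOUT_ANCHOR))  # None
```

The second call returns nothing because the pattern requires the literal `<a href=` and `</a>` parts, which the bare version does not have.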
But I'm struggling to write a single regex that matches both forms. Any ideas?