Let's say I have HTML markup like this:

<p>
    <h1>Some header, which I don't want to match</h1>
    Some text - match it.
    <a href="some-file.html">Some link. Don't match neither href nor link text.</a>
    <span>Some word, which needs to be matched</span>
</p>

In short, I want to match a word anywhere in the content except inside given HTML tags (and their attributes). In this example I want to exclude the h1 and a tags.

Expected result after replacing 'Some' with 'Test':

<p>
    <h1>Some header, which I don't want to match</h1>
    Test text - match it.
    <a href="some-file.html">Some link. Don't match neither href nor link text.</a>
    <span>Test word, which needs to be matched</span>
</p>
Piotrycjan

1 Answer

You can use: <(a|h1)[^\>]*?>(some)[^\<]*?<\/\1> with the case-insensitive flag to match lines that contain an a or h1 element whose text starts with some.

Then, for every line that does not match this regex, replace the word some (if present) with your replacement text.
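As a rough illustration, the line-by-line check could be sketched in Python like this. The function name replace_outside_excluded and its parameters are made up for this example, and the sketch assumes that each excluded element sits on a single line, that its text starts with the word, and that the link is closed with a proper closing tag. For anything more irregular an HTML parser is likely the safer tool.

```python
import re

# Sample markup from the question (assumption: the link is closed with </a>).
SAMPLE = """<p>
    <h1>Some header, which I don't want to match</h1>
    Some text - match it.
    <a href="some-file.html">Some link. Don't match neither href nor link text.</a>
    <span>Some word, which needs to be matched</span>
</p>"""


def replace_outside_excluded(html, word="Some", replacement="Test", tags=("a", "h1")):
    """Replace `word` on every line that does not hold an excluded element.

    Hypothetical helper: a line is skipped when it contains one of `tags`
    whose text starts with `word` (the pattern from the answer, applied
    case-insensitively).
    """
    tag_alternatives = "|".join(re.escape(t) for t in tags)
    excluded = re.compile(
        r"<(" + tag_alternatives + r")[^>]*?>(" + re.escape(word) + r")[^<]*?</\1>",
        re.IGNORECASE,
    )
    # Whole-word, case-sensitive match for the actual replacement.
    word_re = re.compile(r"\b" + re.escape(word) + r"\b")

    out = []
    for line in html.splitlines():
        if excluded.search(line):
            out.append(line)  # excluded element on this line: leave untouched
        else:
            out.append(word_re.sub(replacement, line))
    return "\n".join(out)


if __name__ == "__main__":
    print(replace_outside_excluded(SAMPLE))
```

Working per line keeps the href value and the link text untouched without needing lookarounds, at the cost of skipping the whole line whenever an excluded element appears on it.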


Sujith PS