How do I extract the text from an HTML link element only if its URL matches a particular domain?
E.g. extract hello from:
<a href="https://example.com/2018/11/22/ff/">hello</a>
If the URL isn't on example.com, the link should be ignored.
I'm using the regex </?a(|\s+[^>]+)> but it matches links for all domains when it should only match example.com.
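
To make the goal concrete, here is a minimal Python sketch of the behaviour I'm after. It assumes simple markup (a double-quoted href and plain text with no nested tags inside the link); the names LINK_RE and extract_link_text and the second sample link are hypothetical, made up for illustration only.

```python
import re

# Match an <a> tag whose href points at example.com and capture its text.
# Assumptions: the href is double-quoted and the link text has no nested tags.
LINK_RE = re.compile(
    r'<a\s+[^>]*?href="https?://example\.com(?:[/?#][^"]*)?"[^>]*>'
    r'([^<]*)</a>',
    re.IGNORECASE,
)

def extract_link_text(html):
    """Return the text of every link whose href is on example.com."""
    return LINK_RE.findall(html)

sample = (
    '<a href="https://example.com/2018/11/22/ff/">hello</a> '
    '<a href="https://other.org/page">ignored</a>'
)
print(extract_link_text(sample))  # prints ['hello']
```

The domain must be followed by /, ?, # or the closing quote, so a lookalike host such as example.com.evil.org is not matched. For messier real-world HTML, a proper parser would probably be more robust than a regex.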