Regex remove hyperlinks for a specific domain

Question

How do I extract the text from an HTML link element if its URL matches a particular domain?

E.g. extract hello from:

<a href="https://example.com/2018/11/22/ff/">hello</a>

If the URL isn't on example.com, the link should be left untouched.

I'm using the regex </?a(|\s+[^>]+)> but it matches links for all domains when it should only match example.com.

[Don't use r̩̟̳̺͙͜e̩̮̲͓͕g̳̩͎̳ḙ̙x̀ to parse HTML](https://stackoverflow.com/a/1732454/1064767), use [DomDocument](https://secure.php.net/manual/en/class.domdocument.php). Also don't use regex to parse URLs, use [`parse_url()`](https://secure.php.net/manual/en/function.parse-url.php). — Sammitch, Nov 30 '18 at 00:28
Possible duplicate of [RegEx match open tags except XHTML self-contained tags](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) — Jason Armstrong, Nov 30 '18 at 01:54
I'm using a plugin on WordPress that does regex, so I'm tied to regex :) — Mahdi Farhat, Nov 30 '18 at 02:28

score 0 · Answer 1 · answered Nov 30 '18 at 13:55

I am a noob developer, but I think this will work for you!

// The domain whose links you want to neutralize.
// No "g" flag: a global regex keeps state between test() calls and can skip matches.
var regex = /example\.com/;
// Select all anchor elements in the document
var textLink = document.querySelectorAll("a");

// Check every anchor: if its href contains "example.com", replace the href with "#"
textLink.forEach(e => {
    if (regex.test(e.href)) {
        e.href = "#";
    }
});
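
Note that the snippet above only neutralizes the href; it does not strip the tag and keep the text, which is what the question asks for. Since the asker says they are limited to regex, here is a rough sketch of a regex-only replacement that unwraps links for one domain and leaves the others alone. The function name `unwrapLinksForDomain` is made up for illustration, and the pattern assumes a quoted `href` with an http(s) URL and no nested `<a>` tags; it is not a general HTML parser.

```javascript
// Hypothetical helper (not from the original post): replaces
// <a ... href="http(s)://[sub.]domain/...">text</a> with just "text".
function unwrapLinksForDomain(html, domain) {
  // Escape regex metacharacters in the domain, e.g. the dot in "example.com".
  const escaped = domain.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    // Opening tag with an href on the target domain (optional subdomains).
    '<a\\s[^>]*?href=["\']https?://(?:[\\w-]+\\.)*' + escaped +
      // The domain must end here, so "example.com.evil.org" is rejected.
      '(?=[/"\'?#:])[^"\']*["\'][^>]*>' +
      // Capture the link text lazily up to the closing tag.
      '([\\s\\S]*?)</a>',
    "gi"
  );
  // "$1" keeps only the captured link text.
  return html.replace(pattern, "$1");
}

const sample =
  '<a href="https://example.com/2018/11/22/ff/">hello</a> ' +
  '<a href="https://other.org/">world</a>';
console.log(unwrapLinksForDomain(sample, "example.com"));
```

If the plugin takes a search pattern and a replacement string, the same idea would be to search for the pattern built above and replace with the first capture group. Whether the plugin's regex flavor accepts this exact syntax is an assumption worth testing on a copy of the content first.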
