I've got the script below

var els = document.getElementsByTagName("a");
for(var i = 0, l = els.length; i < l; i++) {
  var el = els[i];
  el.innerHTML = el.innerHTML.replace(/link1/gi, 'dead link');
}

However, this searches through the whole page and takes about 20 seconds to do it, as there are LOTS of links.

I only need to target the a's that have a specific href, e.g. http://domain.example/

So ideally I'd like to be able to do this in a similar fashion to jQuery, but without using a framework. So something like

var els = document.getElementsByTagName("a[href='http://domain.example']");

How would I go about doing this so it only searches the objects with that matching href?

Stephen Ostermiller
owenmelbz
  • Which browsers do you want to support? You could try [`document.querySelectorAll`](https://developer.mozilla.org/en/DOM/Document.querySelectorAll) and see if it makes a difference, but this method is not available in IE7 and earlier. Another possibility could be to use CSS3 to change the appearance and/or add some additional text. – Felix Kling May 13 '12 at 15:01
  • @FelixKling could querySelectorAll make such a big difference? It seems that the OP's code is already pretty bare .. unless not all code is shown :) – Ja͢ck May 13 '12 at 15:04
  • I would just write a function to handle onclick for all of your links. Then you can make the change if needed once somebody clicked the link. – Robert Levy May 13 '12 at 15:05
  • @Jack: I don't know how `querySelectorAll` works internally. But that's why the OP should try and test it. – Felix Kling May 13 '12 at 15:05
  • @Jack the code is bare, but it's using a property that's expensive to compute. – Alnitak May 13 '12 at 15:12
  • I think im gunna try do this a bit differently now lol, as I noticed it has to run through approx 420 A's and then 1020 divs so its going very slow! maybe i'll do it in php before it gets to the user! THANKS though – owenmelbz May 13 '12 at 15:35

2 Answers

2016 update

It's been over 4 years since this question was posted and things have progressed quite a bit.

You can't use:

var els = document.getElementsByTagName("a[href='http://domain.example']");

but what you can use is:

var els = document.querySelectorAll("a[href='http://domain.example']");

(Note: see below for browser support)

which would make the code from your question work exactly as you expect:

for (var i = 0, l = els.length; i < l; i++) {
  var el = els[i];
  el.innerHTML = el.innerHTML.replace(/link1/gi, 'dead link');
}

You can even use selectors like a[href^='http://domain.example'] if you want all links that start with 'http://domain.example':

var els = document.querySelectorAll("a[href^='http://domain.example']");

for (var i = 0, l = els.length; i < l; i++) {
  var el = els[i];
  el.innerHTML = el.innerHTML.replace(/link/gi, 'dead link');
}

See: DEMO

Browser support

The browser support according to Can I use as of June 2016 looks pretty good:

See http://caniuse.com/queryselector for up-to-date info.

There is no support in IE6 and IE7 but IE6 is already dead and IE7 soon will be with its 0.68% market share.

IE8 is over 7 years old and it partially supports querySelectorAll - by "partially" I mean that you can use CSS 2.1 selectors like [attr], [attr="val"], [attr~="val"], [attr|="bar"] and a small subset of CSS 3 selectors which luckily include: [attr^=val], [attr$=val], and [attr*=val] so it seems that IE8 is fine with my examples above.

IE9, IE10 and IE11 all support querySelectorAll with no problems, as do Chrome, Firefox, Safari, Opera and all other major browsers - both desktop and mobile.

In other words, it seems that we can safely start to use querySelectorAll in production.
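If a page still has to run in a browser without querySelectorAll, one option is a feature check with a slower fallback. The following is only a sketch under assumptions: the helper name selectLinksByPrefix and its parameters are made up for illustration, and the fallback compares the href attribute as the browser reports it, which older browsers may normalise differently.

```javascript
// Hypothetical helper (not from the original answer): returns an array of
// <a> elements whose href attribute starts with `prefix`.
// `doc` is a Document, or any object exposing the same methods.
function selectLinksByPrefix(doc, prefix) {
  var result = [];
  var list, i, l;

  // A truthiness check is used on purpose: old IE reports host methods
  // with typeof "object" rather than "function".
  if (doc.querySelectorAll) {
    // Escape backslashes and double quotes so the prefix stays a valid
    // quoted string inside the selector.
    var quoted = prefix.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
    list = doc.querySelectorAll('a[href^="' + quoted + '"]');
    for (i = 0, l = list.length; i < l; i++) {
      result.push(list[i]);
    }
    return result;
  }

  // Fallback: scan every link and compare the attribute value.
  list = doc.getElementsByTagName('a');
  for (i = 0, l = list.length; i < l; i++) {
    var href = list[i].getAttribute('href');
    if (href && href.indexOf(prefix) === 0) {
      result.push(list[i]);
    }
  }
  return result;
}

// In a browser:
// var els = selectLinksByPrefix(document, 'http://domain.example');
```

Because the helper returns a plain array in both branches, the loop from the question can run over the result unchanged.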

More info

For more info, see:

See also this answer for the difference between querySelectorAll, querySelector, queryAll and query and when they were removed from the DOM specification.

rsp
  • @OwenMelbourne this may be a more efficient way of selecting the links, but using `.innerHTML` and `.replace` like that is still the wrong way to replace the contents – Alnitak Jun 15 '16 at 11:40
  • @Alnitak and how would you do it differently to replace the word "link" with "dead link" without using replace? With substring and splice? – rsp Jun 15 '16 at 11:47
  • AFAICS that regexp `/link1/` in the original question was just a place-holder and not the literal content that the OP expected to have replaced. – Alnitak Jun 15 '16 at 11:48
  • @Alnitak I know it was a placeholder but would it make a difference if it was any other regex? My goal was basically to show that changing only `getElementsByTagName` to `querySelectorAll` would make the rest of the code work as intended with no changes. See my [demo](https://jsbin.com/luqeyuh/edit?html,js,console,output). – rsp Jun 15 '16 at 12:00

Reading and writing the innerHTML property on every element is probably quite expensive and hence the likely cause of your slowdown - it forces the browser to "serialize" the element, which you then run through a regexp, and then "deserialize" again. Even worse, you're doing it for every a element, even if it doesn't match.

Instead, try looking directly at the properties of the a element:

var els = document.getElementsByTagName("a");
for (var i = 0, l = els.length; i < l; i++) {
    var el = els[i];
    if (el.href === 'http://www.example.com/') {
        el.innerHTML = "dead link";
        el.href = "#";
    }
}
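If only part of the link text has to change, rather than the whole content, one alternative to the innerHTML round trip is to edit the text nodes directly. This is a rough sketch, not something from the question: replaceInTextNodes is a hypothetical helper name, and whether it is actually faster on a given page would need measuring.

```javascript
// Hypothetical helper: applies `pattern` -> `replacement` to every text node
// under `node`, leaving child elements and their attributes untouched.
function replaceInTextNodes(node, pattern, replacement) {
  var TEXT_NODE = 3; // numeric value of Node.TEXT_NODE
  for (var child = node.firstChild; child; child = child.nextSibling) {
    if (child.nodeType === TEXT_NODE) {
      var updated = child.nodeValue.replace(pattern, replacement);
      if (updated !== child.nodeValue) {
        child.nodeValue = updated; // write back only when something changed
      }
    } else {
      // Elements are walked recursively; other node types have no children.
      replaceInTextNodes(child, pattern, replacement);
    }
  }
}

// In a browser, combined with the href check above:
// if (el.href === 'http://www.example.com/') {
//   replaceInTextNodes(el, /link1/gi, 'dead link');
// }
```

Unlike assigning to innerHTML, this keeps any nested elements (and handlers attached to them) in place.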

On modern browsers with much greater W3C conformance you can now use document.querySelectorAll() to more efficiently obtain just the links you want:

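// The attribute value must be quoted: it contains ':' and '/', so the unquoted form is an invalid selector.
var els = document.querySelectorAll('a[href^="http://www.example.com/"]');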
for (var i = 0, l = els.length; i < l; i++) {
    els[i].textContent = 'dead link';
    els[i].href = '#';
}

This is however not so flexible if there are multiple domain names that you wish to match, or for example if you want to match both http: and https: at the same time.
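One way to cover several prefixes with a single query is a comma-separated selector list. The sketch below builds such a list; buildHrefPrefixSelector is a hypothetical helper name, and the two prefixes are just the example domain from above with both schemes.

```javascript
// Hypothetical helper: builds a selector list that matches <a> elements whose
// href attribute starts with any of the given prefixes.
function buildHrefPrefixSelector(prefixes) {
  var parts = [];
  for (var i = 0, l = prefixes.length; i < l; i++) {
    // Escape backslashes and double quotes for use inside a quoted value.
    var quoted = prefixes[i].replace(/\\/g, '\\\\').replace(/"/g, '\\"');
    parts.push('a[href^="' + quoted + '"]');
  }
  return parts.join(', ');
}

var selector = buildHrefPrefixSelector([
  'http://www.example.com/',
  'https://www.example.com/'
]);
// selector is now:
// a[href^="http://www.example.com/"], a[href^="https://www.example.com/"]

// In a browser:
// var els = document.querySelectorAll(selector);
```

An empty prefix list yields an empty string, which querySelectorAll rejects, so callers would need to guard against that case.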

Alnitak
  • I get even better performance when I use the ".textContent || .innerText" construct; http://blogger.ziesemer.com/2007/10/innerhtml-and-innertext-slow.html – Ja͢ck May 13 '12 at 15:24
  • was definitely faster however after about 1500 loops it slowed! – owenmelbz May 13 '12 at 15:35
  • +1 for the optimized for loop only checking length once at the beginning, I would never have thought of putting it directly into the loop like that! – Georges Oates Larsen May 31 '12 at 05:49
  • `el.href === 'http://www.example.com/myfile.html'` worked under Mac OS on major browsers, but would not on Ubuntu or Windows environment in any major browsers, for some reason. I fixed the issue by using `el.href.indexOf("myfile.html")` instead. Cheers. – HelpNeeder Jun 04 '15 at 05:02
  • Why would you cache the value of `els.length` twice? First you use `els_length = els.length` (that I understand), and then `l = els_length` (that I don't understand). Isn't accessing `els_length` just as fast as `l`? – Peter Nov 04 '15 at 13:01
  • @Sorry-Im-a-N00b yes, you really are a n00b. As already written the evaluation of `els.length` is only done once, not in every iteration. Your edit is wrong, and the people that approved your edit should have known better, too. – Alnitak Nov 05 '15 at 01:44