innerText/textContent vs. retrieving each text node

Question

I've heard that using el.innerText||el.textContent can yield unreliable results, and that's why I've always insisted on using the following function in the past:

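function getText(node) {

    // Text node (nodeType 3): return its character data directly.
    if (node.nodeType === 3) {
        return node.data;
    }

    var txt = '';

    // Otherwise walk the child nodes in document order and
    // concatenate the text found in each subtree.
    for (var child = node.firstChild; child; child = child.nextSibling) {
        txt += getText(child);
    }

    return txt;

}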

This function walks every node inside an element and concatenates the contents of all text nodes, including those nested inside descendant elements:

E.g.

<div id="x">foo <em>foo...</em> foo</div>

Result:

getText(document.getElementById('x')); // => "foo foo... foo"

I'm quite sure there are issues with using innerText and textContent, but I've not been able to find a definitive list anywhere and I am starting to wonder if it's just hearsay.

Can anyone offer any information about the possibly lacking reliability of textContent/innerText?

EDIT: Found this great answer by Kangax -- 'innerText' works in IE, but not in Firefox

So simple and so useful! How about `document.TEXT_NODE` instead of `3`? Is that not supported in older browsers? — stackunderflow, Dec 15 '13 at 16:06

score 33 · Accepted Answer · answered Apr 16 '10 at 15:02

It's all about line endings and whitespace: browsers are very inconsistent in how innerText and textContent handle them, Internet Explorer especially so. Doing the traversal is a sure-fire way to get identical results in all browsers.
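As a rough illustration of that traversal approach, here is a minimal sketch rather than a definitive implementation. The helper names collectText and normalizeWhitespace are hypothetical (they are not from the question), and collapsing whitespace runs into a single space is an assumption about what "identical results" should mean for your use case. The code relies only on nodeType, data, firstChild and nextSibling.

```javascript
// Hypothetical helper: gathers the text of every text node under `node`.
// Numeric node types are used instead of the Node.* constants (an assumption
// made here so the sketch does not depend on those constants being defined).
var TEXT_NODE = 3;
var CDATA_SECTION_NODE = 4;

function collectText(node) {
    // Text and CDATA nodes carry their content in `data`.
    if (node.nodeType === TEXT_NODE || node.nodeType === CDATA_SECTION_NODE) {
        return node.data;
    }

    // Any other node: concatenate the text of its children in document order.
    // Nodes without children (comments, empty elements) contribute ''.
    var txt = '';
    for (var child = node.firstChild; child; child = child.nextSibling) {
        txt += collectText(child);
    }
    return txt;
}

// Hypothetical helper: collapses every run of whitespace (including line
// breaks) into one space and strips a leading/trailing space, so that
// differences in how whitespace is reported do not affect comparisons.
function normalizeWhitespace(str) {
    return str.replace(/\s+/g, ' ').replace(/^ | $/g, '');
}
```

Usage would look like normalizeWhitespace(collectText(document.getElementById('x'))). Whether to normalize at all depends on whether the exact whitespace matters to you; drop that step if it does.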

John Resig


Thanks John. I'm surprised that there's pretty much no mention of this problem elsewhere on the net... even QuirksMode doesn't mention it. – James Apr 16 '10 at 15:29

Traversal doesn't guarantee the same results in all browsers. For example, in IE (unlike other browsers) the content of a ` – Tim Down Sep 14 '10 at 14:24
