I want to collect all the text from a list of elements obtained using

var elements = document.body.getElementsByTagName("*");

What I've done so far:

var text = '';
for (var i = 0; i < elements.length; i++) {
  text = text + ' ' + elements[i].innerText;
}

This returns duplicated text because innerText gives each element's own text plus the text of its descendants. Is there a way to get only an element's own text using pure JavaScript?

innovatism
  • Try using index in elements. Example: elements[i].innerText – JGV Sep 08 '15 at 16:42
  • Would that help if there are nested elements with the same tag? – arcyqwerty Sep 08 '15 at 16:43
  • Sorry I've made the mistake not inserting index while asking, but the code I used has index already – innovatism Sep 08 '15 at 16:44
  • Similar discussion: http://stackoverflow.com/questions/4256339/javascript-how-to-loop-through-all-dom-elements-on-a-page – JGV Sep 08 '15 at 16:47
  • It says textContent will include also ` – innovatism Sep 08 '15 at 16:57
  • Yes, but you're leaving millions of Firefox users without content ... It's easy to check the tag name before using its content. – Teemu Sep 08 '15 at 16:59

1 Answer

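I think the issue is that nested matching elements are being counted twice: the text of a nested element is added once as part of its ancestor's innerText and again for the element itself. The solution is to check whether we've already visited one of its ancestors and to skip the element if that's the case.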

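var text = '';
var visited = []; // elements whose innerText has already been added
for (var i = 0; i < elements.length; i++) {
  var found = false;
  // Walk up from the element through its ancestors.
  for (var e = elements[i]; e != null; e = e.parentNode) {
    if (visited.indexOf(e) > -1) {
      found = true;
      break;
    }
  }
  // Only add text when no ancestor has contributed it already.
  if (!found) {
    text = text + ' ' + elements[i].innerText;
    visited.push(elements[i]);
  }
}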

http://jsfiddle.net/h8k0xx82/
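
A different approach, not the one used in the code above, is to read only the text nodes that are direct children of each element, which is roughly what "own text" means in the question. The following is a minimal sketch under that assumption; the helper names ownText and collectOwnText are hypothetical and chosen only for illustration.

```javascript
// Node type of a text node; equals Node.TEXT_NODE (3) in browsers.
var TEXT_NODE = 3;

// Hypothetical helper: returns the text held directly by `el`,
// ignoring anything inside its child elements.
function ownText(el) {
  var parts = [];
  for (var i = 0; i < el.childNodes.length; i++) {
    var node = el.childNodes[i];
    if (node.nodeType === TEXT_NODE) {
      // Drop surrounding whitespace and skip whitespace-only nodes.
      var value = node.nodeValue.trim();
      if (value) {
        parts.push(value);
      }
    }
  }
  return parts.join(' ');
}

// Hypothetical helper: joins the own text of every element in a list.
function collectOwnText(elements) {
  var parts = [];
  for (var i = 0; i < elements.length; i++) {
    var t = ownText(elements[i]);
    if (t) {
      parts.push(t);
    }
  }
  return parts.join(' ');
}
```

In a page this could be called as collectOwnText(document.body.getElementsByTagName("*")). Because each text node has exactly one parent element, no text should be collected twice. Note that the contents of script and style elements are also stored in text nodes, so checking tagName before calling ownText may still be needed.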

arcyqwerty
  • Note that I updated the solution with a conditional `if (!found)` which should be needed. Glad to be of help! :) – arcyqwerty Sep 08 '15 at 16:57