4

I did this:

var blah = document.getElementById('id').getElementsByClassName('class')[0].innerHTML;

Now I have this in blah:

<a class="title" href="http://www.example.com/" tabindex="1">Some text goes here</a> <span class="domain">(<a href="/domain/foobar.co.uk/">foobar.co.uk</a>)</span>

I want to read the string "Some text goes here" from the HTML using JS (no jQuery). I don't have access to the site's HTML. I'm parsing a webpage to inject JS for a browser extension.

Will I just have to parse it as a string and find my text from between > and < or is there a way to parse innerHTML in JS?

Antrikshy

2 Answers

7

Basic HTML markup that I am assuming you have:

<div id="id">
    <div class="class">
        <a class="title" href="http://www.example.com/" tabindex="1">Some text goes here</a> <span class="domain">(<a href="/domain/foobar.co.uk/">foobar.co.uk</a>)</span>
    </div>
</div>

So select the anchor and read its text:

var theAnchorText = document.getElementById('id').getElementsByClassName('class')[0].getElementsByTagName("a")[0].textContent;

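If you need to support IE8, which has no textContent, fall back to innerText (note that IE8 also lacks getElementsByClassName, so the lookup itself would need querySelector there):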

var theAnchor = document.getElementById('id').getElementsByClassName('class')[0].getElementsByTagName("a")[0];
var theAnchorText = theAnchor.textContent || theAnchor.innerText;

And if you are targeting modern browsers, querySelector makes it a lot cleaner:

var theAnchorText = document.querySelector("#id .class a").textContent;
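
In an extension you usually cannot be sure the element exists on every page, and the one-liner above throws if querySelector returns null. A minimal sketch of a guarded version follows. The helper name readText and the narrower selector '#id .class a.title' are my own assumptions, not part of the answer.

```javascript
// Hypothetical helper (not from the original answer): returns the trimmed
// text of the first element matching `selector` under `root`, or null if
// nothing matches.
function readText(root, selector) {
  var el = root.querySelector(selector);
  if (!el) {
    return null; // element not on this page
  }
  // textContent is the standard property; innerText is the old-IE fallback.
  var text = el.textContent || el.innerText || '';
  // Strip leading and trailing whitespace without relying on String.prototype.trim.
  return text.replace(/^\s+|\s+$/g, '');
}

// In the page you would call it as:
//   var theAnchorText = readText(document, '#id .class a.title');
```

Selecting a.title rather than the first a keeps the lookup from depending on link order inside the container.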
epascarello
0

You could approach this in two ways: strip the tags with a regexp, or read textContent from a temporary DOM element:

var foo = "<b>bar</b>";

function regexpStrip(str) {
  return str.replace(/<[^>]*>/g, '');
}

function parseViaDOM(str) {
  var el = document.createElement('div');
  el.innerHTML = str;
  return el.textContent;
}

console.log(regexpStrip(foo)); // => "bar"
console.log(parseViaDOM(foo)); // => "bar"
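
One thing to watch with the string from the question: both functions above return all of the text, including the domain in the span, not just the title. The sketch below runs the regexp version on that markup to show this. It also adds a browser-only variant that parses into a detached element and then selects only the title link. The name titleViaDOM is hypothetical.

```javascript
// The markup from the question, as returned by innerHTML.
var html = '<a class="title" href="http://www.example.com/" tabindex="1">Some text goes here</a> ' +
  '<span class="domain">(<a href="/domain/foobar.co.uk/">foobar.co.uk</a>)</span>';

// Same tag-stripping regexp as above.
function regexpStrip(str) {
  return str.replace(/<[^>]*>/g, '');
}

// Browser-only (needs `document`): parse the string into a detached div,
// then read the text of the title link alone. Returns null if there is none.
function titleViaDOM(str) {
  var el = document.createElement('div');
  el.innerHTML = str;
  var a = el.querySelector('a.title');
  return a ? a.textContent : null;
}

var stripped = regexpStrip(html);
console.log(stripped); // => "Some text goes here (foobar.co.uk)"
```

In a browser, titleViaDOM(html) should give just "Some text goes here", which is what the question asks for.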
Sukima
  • Reg Exp with HTML NO NO NO! http://stackoverflow.com/questions/590747/using-regular-expressions-to-parse-html-why-not – epascarello Aug 01 '14 at 13:55
  • @epascarello: wrong link. This is the right one to answer this kind of questions: http://stackoverflow.com/a/1732454/648265 – ivan_pozdeev Aug 01 '14 at 14:12
  • Agreed, Regexp is bad in HTML. I was under the impression the sample string he gave above was all there was in which case a regexp strip would work. But for anything more regexp would be evil bad :imp: – Sukima Aug 01 '14 at 14:20
  • @ivan_pozdeev Ah, normally I google the right one, copied the url without looking :) – epascarello Aug 01 '14 at 14:30