
I have the following HTML snippet:

<h1 class="header" itemprop="name">Some text here<span class="nobr">

I would like to get the text between the HTML tags. I've been struggling with this for hours now, please help me! What regex would solve my problem?

Bart
traubisoda
  • Regular expressions should not be used for HTML parsing. Use a parser. – Chris Dargis Jun 22 '12 at 19:23
  • [In general it cannot be done.](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) But you could try `<[^>]*>([^<]*)<.*>` – Beta Jun 22 '12 at 19:25

3 Answers


You should not use regex for that; use an HTML parser instead. Since you didn't specify a language, it is hard to recommend a specific one, but searching for "HTML parser" plus the name of your language will turn one up.


If you need it just for this one case, you can use the regex `/>(.*?)</` and read the text from the first capture group.
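
As a rough illustration of that one-off approach, here is a minimal JavaScript sketch. JavaScript is an assumption, since the question does not name a language, and `textBetweenTags` is a hypothetical helper name, not part of any library. It only handles simple single-line input like the snippet in the question.

```javascript
// Hypothetical helper: returns the text between the first ">" and the next "<",
// or null when the input contains no such pair.
function textBetweenTags(html) {
  // The lazy quantifier .*? stops at the first "<" after the first ">".
  const match = />(.*?)</.exec(html);
  return match ? match[1] : null;
}

// The snippet from the question.
const snippet = '<h1 class="header" itemprop="name">Some text here<span class="nobr">';
console.log(textBetweenTags(snippet)); // "Some text here"
```

Because `.` does not match line breaks by default, this would miss text that spans several lines, which is one more reason to prefer a parser for anything beyond this single case.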

Ωmega

In JavaScript you can access that text via:

document.getElementsByTagName("h1").item(0).textContent

or

document.getElementsByClassName("header").item(0).textContent
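
Both calls above throw if no matching element exists, because `.item(0)` returns `null` in that case. A slightly more defensive sketch might look like the following. `headerText` is a hypothetical helper name, and the `doc` parameter is assumed to be a DOM `Document` such as the browser's `document`.

```javascript
// Hypothetical helper: `doc` is assumed to be a DOM Document (e.g. window.document).
function headerText(doc) {
  // querySelector returns null when nothing matches,
  // so check before reading textContent.
  const el = doc.querySelector("h1.header");
  return el ? el.textContent : null;
}
```

In a browser you would call `headerText(document)`. Note that `textContent` also includes the text of nested elements, such as the `span` in the question's snippet.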
ZnArK

As others have said, you shouldn't be using regular expressions to parse HTML. That aside, the following will grab that text for you:

(?<=\>).+(?=\<)
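
For illustration, here is a sketch of the same lookaround idea in JavaScript. It assumes an engine with lookbehind support (ES2018 or later), and `textViaLookaround` is a hypothetical helper name. It swaps the greedy `.+` for `[^<]+`, since `.+` would run on to the last `<` on the line once closing tags are present.

```javascript
// Hypothetical helper using lookbehind/lookahead (needs an ES2018+ engine).
function textViaLookaround(html) {
  // [^<]+ stops at the first "<"; a greedy .+ would extend to the last one.
  const match = /(?<=>)[^<]+(?=<)/.exec(html);
  return match ? match[0] : null;
}

// The question's snippet, completed here with assumed closing tags.
const sample = '<h1 class="header" itemprop="name">Some text here<span class="nobr">x</span></h1>';
console.log(textViaLookaround(sample)); // "Some text here"
```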

m.edmondson