Is there a trick to getting all HTML elements using regular expressions? Take this snippet of HTML, for instance:

<div>
<p>test
<span>blub</span></p>
</div>

I want every element, including the nested ones, in document order, like this:

array(
0=>'<div>
<p>test
<span>blub</span></p>
</div>',

1=>'<p>test
<span>blub</span></p>',

2=>'<span>blub</span>'
)

I thought of something like:

 (<([A-z]+)[^>]*>.*?</\2>)
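
To see why this pattern cannot return the nested elements, here is a small sketch. Python is used purely for illustration, and the variable names are hypothetical. Regex searches report non-overlapping matches, so once the outer element has been consumed, the elements inside it are never reported. Without a dot-matches-newline flag, `.` also stops at line breaks.

```python
import re

html = "<div>\n<p>test\n<span>blub</span></p>\n</div>"
pattern = r"(<([A-z]+)[^>]*>.*?</\2>)"

# With DOTALL, "." also matches newlines: the first match swallows the whole
# <div>, and the search resumes after it, so <p> and <span> are never reported.
with_dotall = [m.group(1) for m in re.finditer(pattern, html, re.DOTALL)]
print(with_dotall)     # ['<div>\n<p>test\n<span>blub</span></p>\n</div>']

# Without DOTALL, "." stops at line breaks, so only an element that opens and
# closes on a single line can match.
without_dotall = [m.group(1) for m in re.finditer(pattern, html)]
print(without_dotall)  # ['<span>blub</span>']

# Side note: [A-z] also covers the ASCII characters between "Z" and "a"
# (such as "_" and "^"); [A-Za-z] is the usual spelling for letters only.
print(bool(re.fullmatch(r"[A-z]", "_")))  # True
```

Either way, a single pass yields one element, not the three that the desired array contains.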
Michael Petrotta
dazzafact
  • No. [You cannot parse HTML with regex!](http://stackoverflow.com/a/1732454/1048572) – Bergi Dec 19 '12 at 23:35
  • How about using an HTML parser? http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Hugo Dozois Dec 19 '12 at 23:35
  • [The pony, he comes...](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – Andrew Whitaker Dec 19 '12 at 23:35

2 Answers

Take a look at this question: regex match html element with html children

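You can't really parse HTML with regular expressions: elements can nest to any depth, and plain regular expressions are not built to match nested structures. Use an HTML parser from PHP or some other language instead.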
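
As a rough illustration of the parser route, here is a minimal sketch using Python's standard-library `html.parser` (Python is used only as an example of "some other language"). The class name `ElementCollector`, its `outer_html` method, and the handling of unclosed elements are assumptions made for this sketch, not an established recipe. It records where each start tag begins and where its matching end tag finishes, then slices the original source.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ElementCollector(HTMLParser):
    """Collect the source text of every element, ordered by start tag."""

    def __init__(self, source):
        super().__init__()
        self.source = source
        # Offset of the first character of each line, to convert getpos().
        self.line_starts = [0] + [i + 1 for i, ch in enumerate(source) if ch == "\n"]
        self.stack = []   # open elements as (tag, index into self.spans)
        self.spans = []   # [start, end] offsets, in document order
        self.feed(source)
        self.close()
        # Simplification: elements still open at the end run to the end of input.
        for _, index in self.stack:
            self.spans[index][1] = len(source)

    def _offset(self):
        line, column = self.getpos()  # line is 1-based, column is 0-based
        return self.line_starts[line - 1] + column

    def handle_starttag(self, tag, attrs):
        start = self._offset()
        if tag in VOID_TAGS:
            self.spans.append([start, start + len(self.get_starttag_text())])
        else:
            self.spans.append([start, None])
            self.stack.append((tag, len(self.spans) - 1))

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: the element is just the tag itself.
        start = self._offset()
        self.spans.append([start, start + len(self.get_starttag_text())])

    def handle_endtag(self, tag):
        start = self._offset()
        if tag not in [open_tag for open_tag, _ in self.stack]:
            return  # stray closing tag, ignore it
        close = self.source.find(">", start)
        end = len(self.source) if close == -1 else close + 1
        while self.stack:
            open_tag, index = self.stack.pop()
            if open_tag == tag:
                self.spans[index][1] = end
                break
            # Simplification: an unclosed inner element ends where this tag starts.
            self.spans[index][1] = start

    def outer_html(self):
        return [self.source[start:end] for start, end in self.spans]


html = "<div>\n<p>test\n<span>blub</span></p>\n</div>"
for element in ElementCollector(html).outer_html():
    print(repr(element))
```

For the snippet from the question this prints the `<div>`, then the `<p>`, then the `<span>`, which is the order asked for. A dedicated HTML library would handle malformed markup more faithfully than this sketch does.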

Community
AntonNiklasson

Quick and dirty

<[^>]+>

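Don't expect this to work when a tag contains a string with '>' inside it, such as a quoted attribute value. Also note that it matches individual tags, not whole elements.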
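
A short sketch of what this pattern does and where it breaks, written in Python for illustration (the sample strings and variable names are assumptions of this example):

```python
import re

TAG = re.compile(r"<[^>]+>")

html = "<div>\n<p>test\n<span>blub</span></p>\n</div>"
tags = TAG.findall(html)
print(tags)    # ['<div>', '<p>', '<span>', '</span>', '</p>', '</div>']

# A '>' inside a quoted attribute value ends the match too early.
tricky = '<a title="a > b">link</a>'
broken = TAG.findall(tricky)
print(broken)  # ['<a title="a >', '</a>']
```

The first result lists opening and closing tags separately, so pairing them into elements still needs extra logic.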

Derek Schrock