I want to convert HTML of this kind:

<h2>heading1</h2>
something1
<h2>heading2</h2>
something2
...

to

<li>
<div class="a">
heading1
</div>
<div class="b">
something1
</div>
</li>
<li>
<div class="a">
heading2
</div>
<div class="b">
something2
</div>
</li>
...

I think there should be a suitable regular expression (in PHP) for this job. I tried `/<h2>(.*?)<\/h2>(.*?)(<h2>)?/s`, but that doesn't work.

Alex Velickiy
  • You can't use RegEx to process Xml/Html ... http://stackoverflow.com/a/335446/5040941 – 3-14159265358979323846264 Jul 10 '15 at 11:55
  • How did you come to the conclusion that regex is the right tool for this task?! 'Cause [it isn't.](http://stackoverflow.com/a/1732454/418066) – Biffen Jul 10 '15 at 11:55
  • The famous answer: http://stackoverflow.com/a/1732454/993169 – Paul Bain Jul 10 '15 at 11:55
  • Are you sure you need to do this using regex only? If not, this is easy to do in PHP using the `DOMDocument` class. – gvgvgvijayan Jul 10 '15 at 11:56
  • It's a shame someone can't just fix RegEx so it can do this job. (Joke). – 3-14159265358979323846264 Jul 10 '15 at 11:56
  • Use anything to parse it that isn't regex. – Rob Foley Jul 10 '15 at 11:57
  • I would also add that removing heading tags lowers the quality of your website's search engine ranking (see [Google's search engine optimization starter guide](http://static.googleusercontent.com/media/www.google.com/fr//webmasters/docs/search-engine-optimization-starter-guide.pdf), **Use heading tags appropriately**). – Anwar Jul 10 '15 at 12:02
  • @paul-bain haha, this is funny) – Alex Velickiy Jul 10 '15 at 12:05
  • Your regex didn't work because `(<h2>)?`, if present, is already consumed, making it impossible to match the subsequent `<h2>`. Using a [lookahead](http://www.regular-expressions.info/lookaround.html) `(?=`... – Jonny 5 Jul 10 '15 at 12:32
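
The lookahead idea from the last comment could be sketched roughly as follows. This is only an illustration under some assumptions: the function name `headingsToListItems` is hypothetical, the `|\z` alternative (so that the last section, which has no following `<h2>`, is also matched) is an addition not spelled out in the comment, and surrounding whitespace is trimmed to produce the layout shown in the question. It assumes flat, well-behaved input like the sample; for arbitrary HTML, most commenters recommend a parser such as `DOMDocument` instead.

```php
<?php
// Hypothetical helper; the class names "a" and "b" are taken from the question.
function headingsToListItems(string $html): string
{
    // Group 1: the heading text.
    // Group 2: everything up to, but not including, the next <h2> or the end
    // of the input. The lookahead (?=<h2>|\z) only checks for the next heading
    // without consuming it, so that heading can start the next match.
    $pattern = '/<h2>(.*?)<\/h2>(.*?)(?=<h2>|\z)/s';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            return "<li>\n<div class=\"a\">\n" . trim($m[1]) . "\n</div>\n"
                 . "<div class=\"b\">\n" . trim($m[2]) . "\n</div>\n</li>\n";
        },
        $html
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

$input = "<h2>heading1</h2>\nsomething1\n<h2>heading2</h2>\nsomething2\n";
echo headingsToListItems($input);
```

The only change to the pattern from the question is replacing the optional capturing group `(<h2>)?` with the lookahead, so the second lazy group has something to stop at without swallowing the next heading.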

0 Answers