I want to extract a paragraph or div from HTML, but only if it doesn't contain a form. For example:

<p><form>I don't want this text</form>and not this text</p>
<p>I want to take this text</p>

I have a working variant without the form filter:

/(?:<(?:p|div)[^>]*>)(.*)(?:<\/(?:p|div)>)/iu

And a non-working variant with the filter:

/(?:<(?:p|div)[^>]*>)((?:.(?!<form))*)(?:<\/(?:p|div)>)/iu

Can you help me?

Andy Lester
  • What exactly is not working? In which cases does it give the wrong result (and what is the expected result in those cases)? –  Apr 15 '15 at 15:52
  • http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Ignacio Vazquez-Abrams Apr 15 '15 at 16:08

1 Answer

Warning: parsing HTML with regular expressions has always been, and will always be, a bad idea.
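
For comparison, here is a rough sketch of what a parser-based approach could look like. The question does not say which language is in use, so Python and its standard html.parser module are assumptions here, and the names FormFreeBlocks and form_free_text are made up for illustration. It collects the text of each outermost p/div and drops any block in which a form tag was seen:

```python
from html.parser import HTMLParser


class FormFreeBlocks(HTMLParser):
    """Collect the text of outermost <p>/<div> blocks that contain no <form>."""

    def __init__(self):
        super().__init__()
        self.depth = 0         # nesting depth inside the current p/div block
        self.has_form = False  # True once a <form> is seen in the current block
        self.parts = []        # text pieces of the current block
        self.results = []      # text of every accepted block

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in ("p", "div"):
            if self.depth == 0:
                # A new outermost block starts: reset the per-block state.
                self.has_form = False
                self.parts = []
            self.depth += 1
        elif tag == "form" and self.depth > 0:
            self.has_form = True

    def handle_endtag(self, tag):
        if tag in ("p", "div") and self.depth > 0:
            self.depth -= 1
            if self.depth == 0 and not self.has_form:
                self.results.append("".join(self.parts))

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def form_free_text(html):
    """Hypothetical helper: return the text of every form-free p/div block."""
    parser = FormFreeBlocks()
    parser.feed(html)
    parser.close()
    return parser.results
```

Unlike the regex, this also notices form tags that carry attributes or sit several lines into the block, but it is only a sketch: html.parser does not repair malformed markup the way a browser does.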

Here is a slightly modified version of your regex:

/(?:<(?:p|div)[^>]*>)(?!.*\<form\>)(.*)(?:<\/(?:p|div)>)/iu

I adjusted it so that it still matches paragraphs that contain the word "form" (rather than the <form> tag). Try it with this test:

<p><form>I don't want this text</form>and not this text</p>
<p>I want to take this text even if it contains the "form" word!</p>
<p>I want to take this text</p>
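
If you want to check this quickly, here is a small sketch that runs the pattern over the three test lines. Python's re module is an assumption (the question does not name a language), the pattern is transcribed without the /.../iu delimiters and without the backslashes Python does not need, and the names PATTERN, SAMPLE and extract are only for illustration:

```python
import re

# Same pattern as above; re.IGNORECASE stands in for the "i" flag, and
# Python 3 str patterns are already Unicode-aware, so no "u" flag is needed.
PATTERN = re.compile(r'<(?:p|div)[^>]*>(?!.*<form>)(.*)</(?:p|div)>', re.IGNORECASE)

SAMPLE = """<p><form>I don't want this text</form>and not this text</p>
<p>I want to take this text even if it contains the "form" word!</p>
<p>I want to take this text</p>"""


def extract(html):
    """Return the captured inner text of every block the pattern accepts."""
    # With a single capture group, findall returns the group contents.
    return PATTERN.findall(html)


for text in extract(SAMPLE):
    print(text)
```

Two things to keep in mind: "." does not match newlines by default, so the lookahead only inspects the rest of the current line, and the literal <form> will not catch a form tag that has attributes.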
zessx