I have some HTML content, for example:

<p>remove_this_content </p>
<p>something here <p>something here</p></p>
....
....

Basically, I want to remove the entire <p>remove_this_content </p> element. I am not sure what follows remove_this_content; it may be a space or a line break, and I cannot tell in Firebug.

I know I should not use regex to parse XML, but I don't know how else to do it.

Somehow I found a way to get it working, but I don't know why it works.

$pattern = "/<p\b[^>]*>my_home_content_remove/i"; 
$tmp_content = preg_replace($pattern, '', $tmp_content);

This one, however, does not work at all:

$pattern = "/<p\b[^>]*>my_home_content_remove.*?<\/p>/i"; 
$tmp_content = preg_replace($pattern, '', $tmp_content);
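
One possible explanation, assuming the text after the marker really is a line break: by default `.` in PCRE does not match a newline, so `.*?<\/p>` can never reach the closing tag, while the shorter pattern stops before the newline and therefore matches. Adding the `s` modifier lets `.` match newlines too. A minimal sketch, where the value of `$tmp_content` is made up for illustration:

```php
<?php
// Hypothetical input: the marker text is followed by a line break before </p>.
$tmp_content = "<p>my_home_content_remove\n</p>\n<p>something here</p>";

// Without the s modifier, "." stops at the newline, so the pattern does not match
// and preg_replace returns the input unchanged.
$without_s = preg_replace('/<p\b[^>]*>my_home_content_remove.*?<\/p>/i', '', $tmp_content);

// With the s modifier (PCRE_DOTALL), "." also matches newlines,
// so the lazy .*? can extend to the closing </p>.
$with_s = preg_replace('/<p\b[^>]*>my_home_content_remove.*?<\/p>/is', '', $tmp_content);

var_dump($without_s === $tmp_content); // unchanged input
var_dump($with_s);                     // first paragraph removed
```

This only shows why the two patterns can behave differently; it does not make regex a robust way to edit HTML.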

Update

I resolved the issue by using DOMDocument in PHP, based on "How do you parse and process HTML/XML in PHP?"
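
The exact code used is not shown here, so the following is only a rough sketch of what a DOMDocument-based removal could look like. The input string, the `wrapper` id, and the XPath expression are assumptions chosen for illustration:

```php
<?php
// Hypothetical input; the marker text and surrounding markup are assumptions.
$tmp_content = "<p>remove_this_content \n</p><p>something here</p>";

// Collect parser warnings internally instead of emitting them for loose HTML.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
// Wrap the fragment in a container so it can be serialized again without <html>/<body>.
$doc->loadHTML('<div id="wrapper">' . $tmp_content . '</div>');
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Collect matching paragraphs first, then remove them,
// so the node list is not modified while it is being iterated.
$targets = [];
foreach ($xpath->query('//p[contains(., "remove_this_content")]') as $p) {
    $targets[] = $p;
}
foreach ($targets as $p) {
    $p->parentNode->removeChild($p);
}

// Serialize only the children of the wrapper element.
$wrapper = $xpath->query('//div[@id="wrapper"]')->item(0);
$result = '';
foreach ($wrapper->childNodes as $child) {
    $result .= $doc->saveHTML($child);
}

echo $result;
```

Because the match is made on the paragraph's text content, it does not matter whether a space or a line break follows the marker text.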

kenpeter
  • Again, do not use regex. Use simple xml or simple html – Shakil Ahamed Jan 05 '14 at 07:10
  • See also: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Mike Jan 05 '14 at 07:11
  • possible duplicate of [How do you parse and process HTML/XML in PHP?](http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php) – nhahtdh Jan 05 '14 at 07:29