I have some HTML content, e.g.:
<p>remove_this_content </p>
<p>something here <p>something here</p></p>
....
....
Basically, I want to remove the entire <p>remove_this_content </p> element. I am not sure what trails remove_this_content: it may be a space or a line break. I cannot tell in Firebug.
I know I should not use regex to parse XML, but I don't know how else to do it.
Somehow I found a pattern that works, but I don't know why it works:
$pattern = "/<p\b[^>]*>my_home_content_remove/i";
$tmp_content = preg_replace($pattern, '', $tmp_content);
While this one does not work at all:
$pattern = "/<p\b[^>]*>my_home_content_remove.*?<\/p>/i";
$tmp_content = preg_replace($pattern, '', $tmp_content);
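One possible explanation, assuming the character trailing the text really is a line break: in PCRE, `.` does not match a newline unless the `s` modifier is set, so `.*?<\/p>` can never reach the closing tag. The sketch below uses a made-up input string to compare the pattern with and without that modifier; the variable names `$no_dotall` and `$with_dotall` are illustrative only.

```php
<?php
// Hypothetical input: a line break sits between the text and the closing tag.
$tmp_content = "<p>my_home_content_remove\n</p><p>something here</p>";

// Without the s modifier, "." cannot cross the newline, so nothing is replaced.
$no_dotall = preg_replace('/<p\b[^>]*>my_home_content_remove.*?<\/p>/i', '', $tmp_content);

// With the s modifier, "." also matches newlines and the whole element is removed.
$with_dotall = preg_replace('/<p\b[^>]*>my_home_content_remove.*?<\/p>/is', '', $tmp_content);
```

If the trailing character is only a space, this would not explain the failure, so it is worth dumping the raw string (for example with `var_dump`) to see what is actually there.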
Update
I resolved the issue by using DOMDocument in PHP, based on the question "How do you parse and process HTML/XML in PHP?"
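For reference, a minimal sketch of what a DOMDocument-based removal could look like. This is not necessarily the exact code used; the input string, the wrapper `<div>`, and the XPath expression are assumptions made for illustration. `normalize-space(.)` trims surrounding whitespace, so it does not matter whether a space or a line break follows the text.

```php
<?php
// Hypothetical input: unknown whitespace follows the text to remove.
$tmp_content = "<p>my_home_content_remove \n</p><p>something here</p>";

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
// Assumption: wrap the fragment in a div so it can be extracted again afterwards.
$doc->loadHTML('<div>' . $tmp_content . '</div>');
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Match <p> elements whose trimmed text is exactly the marker string.
$matches = $xpath->query('//p[normalize-space(.) = "my_home_content_remove"]');
foreach (iterator_to_array($matches) as $p) {
    $p->parentNode->removeChild($p);
}

// Serialize only the children of the wrapper div, not the html/body added by loadHTML.
$wrapper = $doc->getElementsByTagName('div')->item(0);
$result = '';
foreach ($wrapper->childNodes as $child) {
    $result .= $doc->saveHTML($child);
}
```

Here `$result` holds the remaining markup with the unwanted paragraph gone. Note that the parser may normalize invalid markup such as nested `<p>` elements, so the output is not guaranteed to be byte-identical to the rest of the input.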