I want a regular expression which can strip all comments from the HTML using PHP.

I saw some threads on Stack Overflow, such as "Regexp match strickly html comment string", but the regex provided there doesn't work: my PHP code outputs nothing after I apply it.

I have written:

$regex = array('/<!--((.*)!(\[if))-->/Uis', "/[[:blank:]]+/");
$replaced_comment_in_html = preg_replace($regex, '', $html);

But the comments still appear in the HTML:

<!-- This is my test comment, which I want to be removed in HTML  -->
<!--[if lt IE 9]>
    <script src="something.js"></script>
<![endif]-->

It does not remove the comments that I want removed. If I use the regex below instead, it removes all comments, including the IE conditional comments and the scripts inside them, which the page requires:

$regex = array('/<!--(.*)-->/Uis', "/[[:blank:]]+/");

Can someone help?

Thompson
    You really [should not parse XML/HTML with regex](http://stackoverflow.com/a/1732454/1883647). Instead, you should use [PHP's DOM extension](http://www.php.net/manual/en/intro.dom.php) to parse your markup string, and use that to remove comments, as is asked and answered in [this question](http://stackoverflow.com/q/6305643/1883647). – ajp15243 Mar 16 '14 at 06:43
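The DOM route suggested in the comment above could look roughly like the sketch below. The function name remove_comments_with_dom is made up for illustration, and the rule "keep any comment whose text starts with [if" is an assumption about what counts as an IE conditional comment. Note that DOMDocument::loadHTML() normalizes the markup, so the output can differ from the input in more than the removed comments (for example, a doctype and html/body wrappers may be added).

```php
<?php
// Hypothetical helper, not from the original thread: removes ordinary HTML
// comments with PHP's DOM extension and keeps IE conditional comments.
function remove_comments_with_dom(string $html): string
{
    $doc = new DOMDocument();

    // Real-world HTML often triggers libxml warnings; collect them quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Copy the node list into an array before modifying the tree.
    foreach (iterator_to_array($xpath->query('//comment()')) as $comment) {
        // Assumption: a conditional comment's text begins with "[if".
        if (strpos(ltrim($comment->data), '[if') !== 0) {
            $comment->parentNode->removeChild($comment);
        }
    }

    return $doc->saveHTML();
}
```

Because the parser sees a whole conditional comment as a single comment node, the script tag inside it is left alone.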

1 Answer

Use this regex:

<!--[^\[].*-->

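This removes ordinary comments but leaves IE conditional comments in place, because [^\[] only matches when the character right after <!-- is something other than [.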

Use it like this:

$regex = array('/<!--[^\[].*-->/Uis', "/[[:blank:]]+/");
$replaced_comment_in_html = preg_replace($regex, '', $html);
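
A variant worth considering is a negative lookahead, sketched below. It also matches empty comments such as <!---->, which the character-class version does not handle cleanly. The function name strip_html_comments is hypothetical, and the list of protected openers ([if, <![endif] and the <!--> marker used by downlevel-revealed conditional comments) is an assumption about which forms need to survive. The sketch also leaves out the /[[:blank:]]+/ pattern: paired with an empty replacement, it deletes every space and tab in the document, including the ones between a tag name and its attributes.

```php
<?php
// Hypothetical helper, not part of the original answer.
function strip_html_comments(string $html): string
{
    // <!--            comment opener
    // (?! ... )       refuse to match when the comment starts with:
    //   \[if\b          a conditional opener such as <!--[if lt IE 9]>
    //   <!\[endif\]     the closer form <!--<![endif]-->
    //   >               the <!--> marker of downlevel-revealed comments
    // .*?-->          lazily consume up to the first comment terminator
    // Flags: i = case-insensitive, s = let . match newlines.
    $pattern = '/<!--(?!\[if\b|<!\[endif\]|>).*?-->/is';

    // preg_replace() returns null on failure; fall back to the input.
    return preg_replace($pattern, '', $html) ?? $html;
}
```

Called on the sample from the question, this should drop the test comment and leave the [if lt IE 9] block, including its script tag, untouched.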
Amit Joki