I just want to remove comments and whitespace from an HTML string before saving it to the database. I don't want it to be repaired or to have head tags etc. added.

I've spent hours searching for this but can't find anything. Can someone who has done this tell me what config I need, and which PHP Tidy function will just "minify" an HTML string without trying to turn it into a valid HTML document?

RISC OS

2 Answers

The example below may help. Note that it strips all tags as well as comments, so it returns plain text:

<?php
function html2txt($document) {
    // preg_replace() applies the patterns in order, so whole blocks
    // (script, style, comments) must go before the generic tag pattern.
    $search = array(
        '@<script[^>]*?>.*?</script>@si',  // Strip out javascript
        '@<style[^>]*?>.*?</style>@si',    // Strip style blocks, including their contents
        '@<![\s\S]*?--[ \t\n\r]*>@',       // Strip multi-line comments including CDATA
        '@<[\/\!]*?[^<>]*?>@si',           // Strip out the remaining HTML tags
    );
    $text = preg_replace($search, '', $document);
    return $text;
}
?>

More information is available at http://php.net/manual/en/function.strip-tags.php

Suresh Kamrushi

You can try the function below, which removes HTML comments and the whitespace between tags:

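      function remove_html_comments_white_spaces($content = '') {
          // Remove HTML comments; the "s" flag lets "." match newlines
          $content = preg_replace('/<!--.*?-->/s', '', $content);
          // Remove whitespace that sits between two adjacent tags
          $content = preg_replace('~>\s+<~', '><', $content);

          return $content;
      }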

If you also want to remove the tags, you can use PHP's built-in strip_tags() function.
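
To illustrate the regex approach on a small fragment, here is a minimal, self-contained sketch. The function name minify_html_fragment and the sample string are hypothetical, chosen only for this example. It works purely on the string, so nothing is repaired and no head or body tags are added.

```php
<?php
// Hypothetical helper (not part of PHP or Tidy): strips comments and
// inter-tag whitespace from an HTML fragment using plain string operations.
function minify_html_fragment(string $html): string
{
    // Remove comments first; the "s" flag lets "." match newlines.
    $html = preg_replace('/<!--.*?-->/s', '', $html);
    // Drop whitespace that sits only between two tags.
    $html = preg_replace('/>\s+</', '><', $html);
    // Trim leading and trailing whitespace from the fragment itself.
    return trim($html);
}

$fragment = "<div>\n  <!-- note -->\n  <p>Hello <b>world</b></p>\n</div>\n";
echo minify_html_fragment($fragment), "\n";
// prints: <div><p>Hello <b>world</b></p></div>
```

Be aware that this also removes the space between adjacent inline elements (for example `</b> <i>` becomes `</b><i>`), which can change how the text renders, and that it removes conditional comments along with ordinary ones.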

Krish R