Possible Duplicate:
How to minify php page html output?

I want to strip out all the line breaks and extra spaces in the HTML and put it on one line.

I tried this function:

public static function htmlCompress($html)
{
    preg_match_all('!(<(?:code|pre|script).*>[^<]+</(?:code|pre|script)>)!',$html,$pre);
    $html = preg_replace('!<(?:code|pre).*>[^<]+</(?:code|pre)>!', '#pre#', $html);
    $html = preg_replace('#<!--[^\[].+-->#', '', $html);
    $html = preg_replace('/[\r\n\t]+/', ' ', $html);
    $html = preg_replace('/>[\s]+</', '><', $html);
    $html = preg_replace('/[\s]+/', ' ', $html);
    if (!empty($pre[0])) {
        foreach ($pre[0] as $tag) {
            $html = preg_replace('!#pre#!', $tag, $html,1);
        }
    }
    return $html;
}

but sometimes symbols like "�" appear, because of this line:

$html = preg_replace('/[\s]+/', ' ', $html);

Why does this symbol appear, and how can I compress the HTML?

Ildar
  • Use an existing library [html minifier](http://code.google.com/p/htmlcompressor/) – Ibu Jul 23 '11 at 21:52
  • That usually means there is a mismatch in your character sets, for instance, one is ASCII and the other UTF-8. – Jared Farrish Jul 23 '11 at 21:53
  • You achieve more by just gzipping the content. HTML minification is only useful for sites with massive userbases. – Gordon Jul 23 '11 at 21:53
  • @Ibu that's a Java library. Can be used in conjunction with PHP of course (the question's about PHP), just saying...a PHP solution probably makes more sense here. – Matt Browne Feb 26 '13 at 04:01
  • This function has a bug: line breaks get damaged when a pre block includes HTML tags. – user706420 Feb 03 '15 at 08:41
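
The bug described in the last comment comes from the `[^<]+` part of the pattern, which stops matching as soon as a protected block contains a nested tag. One possible way around it, sketched here under assumptions rather than taken from the post, is to swap each protected element for a placeholder with `preg_replace_callback()` and restore it afterwards. The function name `minifyHtmlKeepingPre` and the `@@BLOCKn@@` placeholder format are hypothetical.

```php
<?php
// Hypothetical helper, not from the original post: collapse whitespace
// everywhere except inside <pre>, <code>, <script> and <textarea>.
function minifyHtmlKeepingPre(string $html): string
{
    $blocks = [];

    // Replace every protected element (nested tags included) with a placeholder.
    $html = preg_replace_callback(
        '~<(pre|code|script|textarea)\b[^>]*>.*?</\1>~is',
        function (array $m) use (&$blocks): string {
            $key = '@@BLOCK' . count($blocks) . '@@';
            $blocks[$key] = $m[0];
            return $key;
        },
        $html
    );

    // The u modifier keeps the matching UTF-8 aware.
    $html = preg_replace('/\s+/u', ' ', $html);
    $html = preg_replace('/>\s+</u', '><', $html);

    // Put the untouched blocks back.
    return $blocks ? strtr($html, $blocks) : $html;
}
```

This sketch assumes valid UTF-8 input (with the `u` modifier, `preg_replace()` returns null otherwise) and that the placeholder text never occurs in the page itself.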

3 Answers

\s should not appear in square brackets, i.e. this is correct:

$html = preg_replace('/\s+/', ' ', $html);
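
If the "�" still shows up on UTF-8 pages, one likely cause (an assumption about the asker's setup, not something confirmed in the post) is that without the `u` modifier the pattern is matched byte by byte. Depending on the locale, `\s` may then match the byte 0xA0, which is also the second byte of characters such as "à". A minimal sketch with the `u` modifier follows; `collapseWhitespace` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: collapse runs of whitespace in UTF-8 text.
function collapseWhitespace(string $html): string
{
    // With the u modifier, pattern and subject are treated as UTF-8,
    // so \s cannot match a single byte inside a multibyte character.
    $result = preg_replace('/\s+/u', ' ', $html);

    // preg_replace() returns null on failure, for example when the
    // input is not valid UTF-8; fall back to the unchanged input.
    return $result ?? $html;
}

echo collapseWhitespace("<p>voilà \n\t tout</p>"), "\n";
```

The fallback to the unchanged input is a design choice for this sketch; logging or converting the encoding first may be preferable in real code.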
Julian
  • Wonderful and simple, thanks. It is the only one that does not break the code. One suggestion to add: $html = str_replace( '> <', '><', $html ); – Codebeat Jan 13 '13 at 01:42

That symbol is the replacement character: it is shown when the text contains a byte sequence that cannot be decoded as a valid character in the expected encoding. You should look into multibyte-safe string functions and UTF-8 encoding and decoding.
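
As a small illustration of the difference between byte-oriented and multibyte-safe functions, the sketch below cuts the same string with `substr()` and `mb_substr()` and checks the result with `mb_check_encoding()`. It assumes the mbstring extension is loaded; the sample word is made up for the example.

```php
<?php
$word = "na\u{EF}ve"; // "naïve" in UTF-8; the ï takes two bytes

// substr() counts bytes, so cutting after 3 bytes splits the ï in half.
$byteCut = substr($word, 0, 3);

// mb_substr() counts characters when told the encoding.
$charCut = mb_substr($word, 0, 3, 'UTF-8');

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false): a browser would show "�"
var_dump(mb_check_encoding($charCut, 'UTF-8')); // bool(true)
```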

AlienWebguy

Look into output buffering: http://www.php.net/manual/en/ref.outcontrol.php and http://php.net/manual/en/function.gzcompress.php (if your Apache server is configured to handle compression). Anything more than that will likely cause more overhead than you would gain.

So, use output buffering (ob_start() and ob_get_clean()) to get your output as a string, compress the string, and send it out.
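
A minimal sketch of that flow, under a few assumptions: the zlib extension is available, the web server is not already compressing responses, and the `Accept-Encoding` check is deliberately simplistic. It uses `gzencode()` rather than `gzcompress()`, because `Content-Encoding: gzip` expects the gzip format that `gzencode()` produces, while `gzcompress()` produces zlib format. `compressForClient` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: gzip the body only if the client advertises gzip.
// Returns [body, extra response headers].
function compressForClient(string $html, string $acceptEncoding): array
{
    if (stripos($acceptEncoding, 'gzip') === false) {
        return [$html, []];
    }
    return [gzencode($html, 6), ['Content-Encoding: gzip', 'Vary: Accept-Encoding']];
}

ob_start();                      // start capturing output
echo "<html>\n  <body>Hello</body>\n</html>";
$html = ob_get_clean();          // captured output as a string

[$body, $headers] = compressForClient($html, $_SERVER['HTTP_ACCEPT_ENCODING'] ?? '');
foreach ($headers as $header) {
    header($header);
}
echo $body;
```

PHP also ships a built-in callback for this, `ob_start('ob_gzhandler')`, which does the header negotiation itself and may be the simpler choice.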

Alan B. Dee