158

I am looking for a PHP script or class that can minify the HTML output of my PHP pages, like Google PageSpeed does.

How can I do this?

BenMorel
m3tsys
  • 15
    One-liner based on @RakeshS answer: `ob_start(function($b){return preg_replace(['/\>[^\S ]+/s','/[^\S ]+\</s','/(\s)+/s'],['>','<','\\1'],$b);});` – Francisco Presencia Jan 24 '14 at 01:46
  • 7
    @FranciscoPresencia That's a really bad thing to do. You're breaking script tags, pre tags, etc. – Brad Jan 13 '15 at 23:05
  • 1
    That's true, as noted in his answer comments it does not work with `
    ` or `` tags since they need the whitespace for proper structure. However, the `
    – Francisco Presencia Jan 14 '15 at 08:17

15 Answers

237

CSS and Javascript

Consider the following link to minify Javascript/CSS files: https://github.com/mrclay/minify

HTML

Tell Apache to deliver HTML with GZip - this generally reduces the response size by about 70%. (If you use Apache, the module configuring gzip depends on your version: Apache 1.3 uses mod_gzip while Apache 2.x uses mod_deflate.)

Accept-Encoding: gzip, deflate

Content-Encoding: gzip

Use the following snippet to remove whitespace from the HTML with the help of ob_start's output buffer:

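<?php

// Output-buffer callback: receives the whole page and returns the minified version.
function sanitize_output($buffer) {

    $search = array(
        '/\>[^\S ]+/s',     // strip whitespace after tags, except spaces
        '/[^\S ]+\</s',     // strip whitespace before tags, except spaces
        '/(\s)+/s',         // shorten multiple whitespace sequences
        '/<!--.*?-->/s'     // remove HTML comments (this includes conditional comments)
    );

    $replace = array(
        '>',
        '<',
        '\\1',
        ''
    );

    $minified = preg_replace($search, $replace, $buffer);

    // preg_replace() returns null on a PCRE error (e.g. on a very large page);
    // fall back to the untouched buffer instead of returning nothing.
    return $minified === null ? $buffer : $minified;
}

ob_start("sanitize_output");

?>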
Community
Rakesh Sankar
  • 62
    This is a good function but be wary of it if you use **PRE** tags, sometimes newlines will be removed there. – fedmich Mar 03 '13 at 02:30
  • 2
    Where should this code be, at the top of your script or the bottom? – jdepypere Sep 04 '13 at 15:47
  • 1
    @arbitter the function should be defined before you call that function and I would suggest you to read [this](http://php.net/manual/en/function.ob-start.php) description & check the example script for more understanding. – Rakesh Sankar Sep 06 '13 at 05:19
  • 8
    You can also use the Minify_HTML class from that Minify library (`$content = \Minify_HTML::minify($content);`, you can even add callbacks to js/css minifiers for inline code). See https://github.com/mrclay/minify/blob/master/min/lib/Minify/HTML.php – Barryvdh Aug 18 '14 at 14:47
  • 29
    This also breaks inline JavaScript (i.e. in ` – Konstantin Pereiaslov Sep 07 '14 at 13:38
  • 8
    this will remove spaces from textarea, pre, input, img also this breaks inline javascripts. if someone is not happy to use bulky class with DOM parsing [this solution](https://gist.github.com/tovic/d7b310dea3b33e4732c0) based on regexp works great – Peter Oct 12 '15 at 09:26
  • 1
    There's a improved version of this code that treats PRE tags with respect: http://stackoverflow.com/a/10423788/1696898 – Osvaldas Sep 11 '16 at 15:48
  • I would stick to GZipping for front end output and only consider the minifier for caching where memory is a concern. If your markup is full of spaces, you're better off writing more compact HTML from the start. – Juniper Jones Jul 17 '18 at 03:30
  • And what about **text in textarea tag**? While we could have some newline to keep! – Meloman Jan 30 '19 at 09:15
  • Not working for a Large number of HTML pages. Returns Null when I pass a Big Page to him ... more likely when I Pass HTML tables with 10000 Rows, He sends me output Null. – Umer Rasheed Jun 02 '21 at 06:13
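The null result reported in the last comment is consistent with how preg_replace() behaves: it returns null when PCRE hits an error (a backtrack or stack limit, for instance), and the buffer callback then has nothing useful to hand back. A minimal sketch of a guard, assuming you would rather serve the unminified page than an empty one; replace_or_keep() is a hypothetical helper name, not part of the answer above:

```php
<?php
// Hypothetical helper: run preg_replace() but never hand back null.
function replace_or_keep(array $patterns, array $replacements, string $buffer): string
{
    $result = preg_replace($patterns, $replacements, $buffer);

    if ($result === null) {
        // preg_last_error() returns the PCRE error code explaining why the call failed.
        error_log('Minification skipped, PCRE error code ' . preg_last_error());
        return $buffer; // serve the page unminified
    }

    return $result;
}

// Usage inside an output-buffer callback:
// ob_start(function ($b) { return replace_or_keep(['/>\s+</'], ['><'], $b); });
```

If the error shows up regularly, raising pcre.backtrack_limit or simplifying the pattern is probably a better fix than relying on the fallback.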
28

Turn on gzip if you want to do it properly. You can also just do something like this:

$this->output = preg_replace(
    array(
        '/ {2,}/',
        '/<!--.*?-->|\t|(?:\r?\n[ \t]*)+/s'
    ),
    array(
        ' ',
        ''
    ),
    $this->output
);

This removes about 30% of the page size by turning your HTML into one line: no tabs, no new lines, no comments. Mileage may vary.

dogmatic69
  • 1
    Doing both would bring down the amount of bytes needed even further. – Wander Nauta Jun 03 '11 at 09:49
  • 1
    actually doing both is the same as doing gzip, on a 700kb page gzip will take it down to about 400kb and the preg_replace() about 450kb (all depending on the content) both will be like 399kb as gzip removes the spaces the same and then compresses it – dogmatic69 Jun 03 '11 at 09:52
  • i already use `if (substr_count($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip')) ob_start("ob_gzhandler"); else ob_start();` Is it necessary to add this? – m3tsys Jun 03 '11 at 09:58
  • gzip needs to be set on your server, if it is working you will have something like http://www.seo.co.uk/seo-news/wp-content/uploads/2010/10/http-gzip-header.png – dogmatic69 Jun 03 '11 at 10:20
  • @dogmatic69 gzip does not remove spaces. – smdrager Apr 14 '12 at 22:22
  • 19
    This could be potentially dangerous, since it also would remove IE conditionals... - you would need to change it to // – Katai Nov 08 '12 at 13:37
  • 4
    Does not work, removing too much, mess up the code. Before it was W3C valid and after this it's not. – Codebeat Jan 13 '13 at 01:30
  • 3
    Unfortunately, it also breaks Javascript code, like for generating more complex implementations of Google Maps – which is exactly I would need such a function for. – richey Feb 03 '13 at 01:04
  • 1
    a specification to my comment (couldn't edit it anymore): it can also break Javascript code if it contains comments only separated by single-line delimiters "// comment_here". Such comments should thus be removed or embedded otherwise before using this function. – richey Feb 03 '13 at 01:28
  • This will remove IE conditional comments, see this page for handling conditional comments correctly: http://stackoverflow.com/questions/1084741/regexp-to-strip-html-comments#answer-5679394 – Andrew May 09 '14 at 00:57
  • 1
    @dogmatic69 that's incorrect. Gzip does not simply *remove the spaces* but it looks for repetition (whatever it may be, even spaces). Gzip and minification are complementary. – fregante Sep 15 '14 at 02:16
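Several comments above note that a blanket comment pattern also removes IE conditional comments. One way to sidestep that is a negative lookahead right after the opening `<!--`. This is only a sketch: strip_plain_comments() is a hypothetical name, and the pattern assumes the usual conditional-comment forms (`<!--[if ...]>`, `<!-->` and `<!--<![endif]-->`); it does not understand `<script>` contents.

```php
<?php
// Hypothetical helper: remove ordinary HTML comments, keep IE conditional comments.
function strip_plain_comments(string $html): string
{
    // "<!--" is matched only when it is NOT followed by "[if", ">" or "<![endif]",
    // so the opening and closing markers of conditional comments are left alone.
    $pattern = '/<!--(?!\[if\b|>|<!\[endif\]).*?-->/s';

    $result = preg_replace($pattern, '', $html);

    // preg_replace() returns null on a PCRE error; keep the input in that case.
    return $result ?? $html;
}

// Usage:
// $clean = strip_plain_comments($this->output);
```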
27

This works for me.

function minify_html($html)
{
    $search = array(
        '/(\n|^)(\x20+|\t)/',
        '/(\n|^)\/\/(.*?)(\n|$)/',
        '/\n/',
        '/\<\!--.*?-->/',
        '/(\x20+|\t)/',      # delete multiple spaces (without \n)
        '/\>\s+\</',         # strip whitespace between tags
        '/(\"|\')\s+\>/',    # strip whitespace between a quotation mark ("') and the tag end
        '/=\s+(\"|\')/'      # strip whitespace between = and a quotation mark
    );

    $replace = array(
        "\n",
        "\n",
        " ",
        "",
        " ",
        "><",
        "$1>",
        "=$1"
    );

    return preg_replace($search, $replace, $html);
}
Mohamad Hamouday
26

I've tried several minifiers and they either remove too little or too much.

This code removes redundant empty spaces and optional HTML (ending) tags. Also it plays it safe and does not remove anything that could potentially break HTML, JS or CSS.

Also the code shows how to do that in Zend Framework:

class Application_Plugin_Minify extends Zend_Controller_Plugin_Abstract {

  public function dispatchLoopShutdown() {
    $response = $this->getResponse();
    $body = $response->getBody(); //actually returns both HEAD and BODY

    //remove redundant (white-space) characters
    $replace = array(
        //remove tabs before and after HTML tags
        '/\>[^\S ]+/s'   => '>',
        '/[^\S ]+\</s'   => '<',
        //shorten multiple whitespace sequences; keep new-line characters because they matter in JS!!!
        '/([\t ])+/s'  => ' ',
        //remove leading and trailing spaces
        '/^([\t ])+/m' => '',
        '/([\t ])+$/m' => '',
        // remove JS line comments (simple only); do NOT remove lines containing URL (e.g. 'src="http://server.com/"')!!!
        '~//[a-zA-Z0-9 ]+$~m' => '',
        //remove empty lines (sequence of line-end and white-space characters)
        '/[\r\n]+([\t ]?[\r\n]+)+/s'  => "\n",
        //remove empty lines (between HTML tags); cannot remove just any line-end characters because in inline JS they can matter!
        '/\>[\r\n\t ]+\</s'    => '><',
        //remove "empty" lines containing only JS's block end character; join with next line (e.g. "}\n}\n</script>" --> "}}</script>"
        '/}[\r\n\t ]+/s'  => '}',
        '/}[\r\n\t ]+,[\r\n\t ]+/s'  => '},',
        //remove new-line after JS's function or condition start; join with next line
        '/\)[\r\n\t ]?{[\r\n\t ]+/s'  => '){',
        '/,[\r\n\t ]?{[\r\n\t ]+/s'  => ',{',
        //remove new-line after JS's line end (only most obvious and safe cases)
        '/\),[\r\n\t ]+/s'  => '),',
        //remove quotes from HTML attributes that does not contain spaces; keep quotes around URLs!
        '~([\r\n\t ])?([a-zA-Z0-9]+)="([a-zA-Z0-9_/\\-]+)"([\r\n\t ])?~s' => '$1$2=$3$4', //$1 and $4 insert first white-space character found before/after attribute
    );
    $body = preg_replace(array_keys($replace), array_values($replace), $body);

    //remove optional ending tags (see http://www.w3.org/TR/html5/syntax.html#syntax-tag-omission )
    $remove = array(
        '</option>', '</li>', '</dt>', '</dd>', '</tr>', '</th>', '</td>'
    );
    $body = str_ireplace($remove, '', $body);

    $response->setBody($body);
  }
}

But note that gzip compresses your output far more than any minification can, so combining minification with gzip is largely pointless: the download time saved is lost again to the minification step, and the extra size reduction is minimal.

Here are my results (download via 3G network):

 Original HTML:        150kB       180ms download
 gZipped HTML:          24kB        40ms
 minified HTML:        120kB       150ms download + 150ms minification
 min+gzip HTML:         22kB        30ms download + 150ms minification
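To check figures like these against your own pages, something along these lines may help. It is a rough sketch that only compares byte counts (not timings) and assumes the zlib extension is loaded for gzencode(); compare_sizes() and the stand-in $collapse minifier are hypothetical names, not taken from the code above.

```php
<?php
// Hypothetical helper: byte counts for the four variants listed in the table above.
function compare_sizes(string $html, callable $minify): array
{
    $minified = $minify($html);

    return [
        'original' => strlen($html),
        'gzipped'  => strlen(gzencode($html, 6)),      // level 6 is a common default
        'minified' => strlen($minified),
        'min+gzip' => strlen(gzencode($minified, 6)),
    ];
}

// Stand-in minifier for the demo only: drop whitespace between tags.
$collapse = fn(string $html): string => preg_replace('/>\s+</', '><', $html) ?? $html;

// Demo input: a repetitive table body, the kind of markup gzip handles well.
$page = str_repeat("<tr>\n    <td>cell</td>\n</tr>\n", 200);

print_r(compare_sizes($page, $collapse));
```

Swap $collapse for whichever minifier you are evaluating and feed it a real page; the ratio depends heavily on the markup.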
Radek Pech
  • 5
    yup, I agree that it is seemingly pointless, but it can score you one or two precious points in pagespeed for google, which is relevant for your google ranking. Your code is perfect for stripping the unneeded spaces. Thanks :-) – Tschallacka Jan 06 '16 at 08:26
  • 1
    this works great, had issues with ="/" so I took the / out of '~([\r\n\t ])?([a-zA-Z0-9]+)="([a-zA-Z0-9_/\\-]+)"([\r\n\t ])?~s' => '$1$2=$3$4', //$1 and $4 insert first white-space character found before/after attribute – Will Bowman Jan 15 '16 at 03:49
  • Well as it happens I'm not looking to remove whitespace just to speed things up, but rather because that's how HTML *should* be in order for things not to totally screw up, like inline block elements, but I'm also looking for one capable of ignoring things that need to have one space before or after (bold elements in a text block for example). – Deji Feb 10 '16 at 09:15
  • I found a problem with certain Jquery/Foundation stuff ... unless I commented out the following lines: //remove "empty" lines containing only JS's block end character; join with next line (e.g. "}\n}\n" --> "}}" // '/}[\r\n\t ]+/s' => '}', // '/}[\r\n\t ]+,[\r\n\t ]+/s' => '},', – Ian Jan 08 '18 at 13:30
  • 1
    If you use server side caching (for me Smarty V3) the min+gzip is a good solution excepted at first call. So, if after the 15th call, it will be intersting for server time. rule = 40x15 = (30x15 + 150) But for the second call it will already be faster for visitor. – Meloman Jan 30 '19 at 09:10
  • There is a problem with the sequence `), `, e.g. `(CEO), 2015`. The space in between gets falsely removed by second-last regex ("remove new-line after JS's line end ..."). – Ti Hausmann Sep 09 '22 at 09:58
22

All of the preg_replace() solutions above have issues with single-line comments, conditional comments and other pitfalls. I'd recommend taking advantage of the well-tested Minify project rather than creating your own regex from scratch.

In my case I place the following code at the top of a PHP page to minify it:

function sanitize_output($buffer) {
    require_once('min/lib/Minify/HTML.php');
    require_once('min/lib/Minify/CSS.php');
    require_once('min/lib/JSMin.php');
    $buffer = Minify_HTML::minify($buffer, array(
        'cssMinifier' => array('Minify_CSS', 'minify'),
        'jsMinifier' => array('JSMin', 'minify')
    ));
    return $buffer;
}
ob_start('sanitize_output');
Andrew
4

Create a PHP file outside your document root. If your document root is

/var/www/html/

create a file named minify.php one level above it

/var/www/minify.php

Copy and paste the following PHP code into it:

<?php
function minify_output($buffer){
    $search = array('/\>[^\S ]+/s','/[^\S ]+\</s','/(\s)+/s');
    $replace = array('>','<','\\1');
    if (preg_match("/\<html/i",$buffer) == 1 && preg_match("/\<\/html\>/i",$buffer) == 1) {
        $buffer = preg_replace($search, $replace, $buffer);
    }
    return $buffer;
}
ob_start("minify_output");?>

Save the minify.php file and open the php.ini file. On a dedicated server or VPS, search for the following option and set it; on shared hosting with a custom php.ini, add it.

auto_prepend_file = /var/www/minify.php

Reference: http://websistent.com/how-to-use-php-to-minify-html-output/

Pang
Avi Tyagi
3

You can check out this set of classes: https://code.google.com/p/minify/source/browse/?name=master#git%2Fmin%2Flib%2FMinify . You'll find HTML/CSS/JS minification classes there.

You can also try this: http://code.google.com/p/htmlcompressor/

Good luck :)

Teodor Sandu
2

First of all, gzip can help you more than an HTML minifier.

  1. With nginx:

    gzip on;
    gzip_disable "msie6";
    
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_buffers 16 8k;
    gzip_http_version 1.1;
    gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    
  2. With Apache you can use mod_deflate (Apache 2.x) or mod_gzip (Apache 1.3)

Second: with gzip + Html Minification you can reduce the file size drastically!!!

I've created this HtmlMinifier for PHP.

You can retrieve it through composer: composer require arjanschouten/htmlminifier dev-master.

There is a Laravel service provider. If you're not using Laravel you can use it from PHP.

// create a minify context which will be used through the minification process
$context = new MinifyContext(new PlaceholderContainer());
// save the html contents in the context
$context->setContents('<html>My html...</html>');
$minify = new Minify();
// start the process and give the context with it as parameter
$context = $minify->run($context);

// $context now contains the minified version
$minifiedContents = $context->getContents();

As you can see you can extend a lot of things in here and you can pass various options. Check the readme to see all the available options.

This HtmlMinifier is complete and safe. It takes 3 steps for the minification process:

  1. Temporarily replace critical content with a placeholder.
  2. Run the minification strategies.
  3. Restore the original content.
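The three steps above can be sketched in plain PHP. This is an illustration of the placeholder idea only, not the library's actual code: minify_with_placeholders() and the `@@KEEP…@@` token format are hypothetical, and the choice of `pre`, `textarea`, `script` and `style` as "critical content" is an assumption.

```php
<?php
// Hypothetical function: protect whitespace-sensitive blocks, minify, then restore.
function minify_with_placeholders(string $html): string
{
    $saved = [];

    // Step 1: swap whitespace-sensitive blocks for unique placeholder tokens.
    $protected = preg_replace_callback(
        '#<(pre|textarea|script|style)\b[^>]*>.*?</\1\s*>#is',
        function (array $match) use (&$saved): string {
            $key = '@@KEEP' . count($saved) . '@@';
            $saved[$key] = $match[0];
            return $key;
        },
        $html
    );
    if ($protected === null) {
        return $html; // PCRE error: leave the page untouched
    }

    // Step 2: minify what is left; here, collapse every whitespace run to one space.
    $minified = preg_replace('/\s+/', ' ', $protected);
    if ($minified === null) {
        return $html;
    }

    // Step 3: put the original blocks back in place of the tokens.
    return $saved ? strtr($minified, $saved) : $minified;
}

// Usage:
// ob_start('minify_with_placeholders');
```

Collapsing to a single space rather than removing whitespace outright is deliberately conservative, since a space between inline elements can be visible in the rendered page.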

I would suggest that you cache the output of your views. The minification process should be a one-time process, or run at intervals, for example.

No clear benchmarks exist yet. However, the minifier can reduce the page size by 5-25%, depending on your markup!

If you want to add your own strategies you can use the addPlaceholder and the addMinifier methods.

ArjanSchouten
  • Thanks for the library. The instructions don't say what PHP files I need to include. I'll eventually figure it out, but that is something you should probably add on your website. – rosewater Oct 07 '15 at 16:04
  • Looks like it still requires Illuminate\Support\Collection. Its not a stand alone PHP solution. – rosewater Oct 07 '15 at 16:17
  • Thanks for the feedback! It is a [composer](https://getcomposer.org/doc/01-basic-usage.md) package. [I've updated the readme](https://github.com/ArjanSchouten/HtmlMinifier/commit/a66785deb00b2e7692ae7ccbdb222ba252972a49) with the following rule: ```require __DIR__ . '/vendor/autoload.php';``` The only thing you've to do is including this file. This is generated by composer! – ArjanSchouten Oct 07 '15 at 16:38
2

I have a GitHub gist containing PHP functions to minify HTML, CSS and JS files → https://gist.github.com/taufik-nurrohman/d7b310dea3b33e4732c0

Here’s how to minify the HTML output on the fly with output buffer:

<?php

include 'path/to/php-html-css-js-minifier.php';

ob_start('minify_html');

?>

<!-- HTML code goes here ... -->

<?php
// Flush the buffer through minify_html(); ob_get_clean() would return the raw, unminified contents.
ob_end_flush();
?>
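One detail worth checking with output-buffer callbacks: according to the PHP manual, ob_get_clean() returns the raw buffer contents and discards what the handler returns, while ob_end_flush() (or simply reaching the end of the script) sends the buffer through the handler. A small sketch to see the difference, using a hypothetical shout() handler in place of a real minifier:

```php
<?php
// Hypothetical stand-in for a minifier, so the handler's effect is easy to spot.
function shout(string $buffer, int $phase): string
{
    return strtoupper($buffer);
}

// Case 1: flush the handled buffer into an outer buffer so it can be inspected.
ob_start();                  // outer buffer, no handler
ob_start('shout');           // inner buffer with the handler
echo 'hello';
ob_end_flush();              // inner contents pass through shout() into the outer buffer
$flushed = ob_get_clean();   // what a browser would have received

// Case 2: read the handled buffer with ob_get_clean().
ob_start('shout');
echo 'hello';
$cleaned = ob_get_clean();   // raw contents; the handler's return value is discarded

var_dump($flushed, $cleaned);
```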
Taufik Nurrohman
2

If you want to remove all new lines in the page, use this fast code:

ob_start(function($b){
    if (strpos($b, "<html") !== false) {
        return str_replace(PHP_EOL, "", $b);
    }
    return $b;
});
Sos.
  • `PHP_EOL` is much better than what I used `array("\r\n","\n", "\r")` as it would remove new lines from contents of textarea, whereas `PHP_EOL` doesn't. Thanks! – Moseleyi Mar 18 '23 at 22:06
2

You can look into HTML TIDY - http://uk.php.net/tidy

It can be installed as a PHP module and will (correctly, safely) strip whitespace and all other nastiness, whilst still outputting perfectly valid HTML / XHTML markup. It will also clean your code, which can be a great thing or a terrible thing, depending on how good you are at writing valid code in the first place ;-)

Additionally, you can gzip the output using the following code at the start of your file:

ob_start('ob_gzhandler');
Rudi Visser
  • the problem is that the site will be hosted on shared and i will not have access to install such modules. – m3tsys Jun 03 '11 at 10:00
  • Chances are, it will already be installed. Check `phpinfo()`... At the very least `zlib` should be installed allowing you to use the `ob_gzhandler`. – Rudi Visser Jun 03 '11 at 10:01
  • i already use `if (substr_count($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip')) ob_start("ob_gzhandler"); else ob_start();` isn't it the same thing? – m3tsys Jun 03 '11 at 10:03
  • 2
    Yes it is, you really don't need the `else ob_start()` part, nor the gzip check... `ob_gzhandler` detects whether the browser supports any compression method internally. Simply having `ob_start('ob_gzhandler');` will suffice. – Rudi Visser Jun 03 '11 at 10:10
  • Any possibility of TIDY being slower than the other answers here because of the extra parsing overhead? Might be good for development - then you can correct those HTML errors in the actual source code - but I question if this is the best choice for production. – Matt Browne Feb 26 '13 at 03:57
  • @MattB. In an ideal production implementation, you would be caching the results of the minification from TIDY anyway, so it's not something that should be high up on the list of concerns :) – Rudi Visser Feb 26 '13 at 08:47
  • @RudiVisser Ah, good point. Of course not every page's HTML can be cached, depending on how up-to-date it needs to be, but most pages can, so you're right that it doesn't matter. Btw the Derby framework for node.js has an interesting approach where the HTML is stripped of any tags (like ) and attribute quotes that aren't strictly required by the spec, but I'm sure there are very few apps where you'd really need to go that far with HTML minification. – Matt Browne Feb 26 '13 at 13:27
0

Thanks to Andrew. Here's what I did to use this in CakePHP:

  1. Download minify-2.1.7
  2. Unpack the file and copy min subfolder to cake's Vendor folder
  3. Create MinifyCodeHelper.php in cake's View/Helper like this:

    App::import('Vendor/min/lib/Minify/', 'HTML');
    App::import('Vendor/min/lib/Minify/', 'CommentPreserver');
    App::import('Vendor/min/lib/Minify/CSS/', 'Compressor');
    App::import('Vendor/min/lib/Minify/', 'CSS');
    App::import('Vendor/min/lib/', 'JSMin');
    class MinifyCodeHelper extends Helper {
        public function afterRenderFile($file, $data) {
            if( Configure::read('debug') < 1 ) //works only in production mode
                $data = Minify_HTML::minify($data, array(
                    'cssMinifier' => array('Minify_CSS', 'minify'),
                    'jsMinifier' => array('JSMin', 'minify')
                ));
            return $data;
        }
    }
    
  4. Enable the helper in AppController

    public $helpers = array ('Html','...','MinifyCode');

5... Voila!

My conclusion: if Apache's deflate and headers modules are disabled on your server, you gain a 21% smaller response at the cost of 0.35s extra per request to compress (these numbers are from my case).

But if you have the Apache modules enabled, the compressed response shows no significant difference (1.3% for me) and the time to compress is the same (0.3s for me).

So... why did I do that? Because my project's documentation is all in comments (PHP, CSS and JS) and my end users don't need to see it ;)

Community
Bocapio
0

You can use a well-tested Java minifier like HTMLCompressor by invoking it using passthru (exec).
Remember to redirect console output using 2>&1.

This, however, may not be useful if speed is a concern. I use it for static PHP output.

Community
Ujjwal Singh
0

The easiest possible way would be using strtr and removing the whitespace. That being said, don't use it on pages with inline JavaScript, as it might break your code.

$html_minify = fn($html) => strtr($html, [PHP_EOL => '', "\t" => '', '  ' => '', '< ' => '<', '> ' => '>']);

echo $html_minify(<<<HTML
<li class="flex--item">
    <a href="#"
        class="-marketing-link js-gps-track js-products-menu"
        aria-controls="products-popover"
        data-controller="s-popover"
        data-action="s-popover#toggle"
        data-s-popover-placement="bottom"
        data-s-popover-toggle-class="is-selected"
        data-gps-track="top_nav.products.click({location:2, destination:1})"
        data-ga="[&quot;top navigation&quot;,&quot;products menu click&quot;,null,null,null]">
        Products
    </a>
</li>
HTML);

// Response (echo): <li class="flex--item"><a href="#"class="-marketing-link js-gps-track js-products-menu"aria-controls="products-popover"data-controller="s-popover"data-action="s-popover#toggle"data-s-popover-placement="bottom"data-s-popover-toggle-class="is-selected"data-gps-track="top_nav.products.click({location:2, destination:1})"data-ga="[&quot;top navigation&quot;,&quot;products menu click&quot;,null,null,null]">Products</a></li>
Tesla
0

This works for me on PHP, Apache and WordPress with W3 Total Cache and AMP. Put it at the top of the PHP page:

<?php
// Start the minifying buffer only when the client advertises gzip support.
$acceptEncoding = isset($_SERVER['HTTP_ACCEPT_ENCODING']) ? $_SERVER['HTTP_ACCEPT_ENCODING'] : '';
if (substr_count($acceptEncoding, 'gzip')) {
    ob_start('ob_html_compress');
} else {
    ob_start();
}

// Output-buffer callback: strips whitespace around tags and HTML comments.
function ob_html_compress($buf) {
    $search = array('/\>[^\S ]+/s', '/[^\S ]+\</s', '/(\s)+/s', '/<!--(.|\s)*?-->/');
    $replace = array('>', '<', '\\1', '');
    return preg_replace($search, $replace, $buf);
}
?>

Then the rest of your PHP or HTML follows.

Muna