4

Even Stack Overflow doesn't compress its HTML. Is it recommended to compress HTML? As far as I've seen, Google seems to be the only one that does (view the source). Why isn't this standard practice?

Lance
  • 1
    See http://stackoverflow.com/questions/1306792/why-minify-assets-and-not-the-markup and http://stackoverflow.com/questions/2359484/why-do-many-sites-minify-css-and-javascript-but-not-html-closed – Dominic Rodger Mar 11 '10 at 12:49
  • how do you know where similar questions are so quickly?! – Lance Mar 11 '10 at 12:50
  • Mainly good choice of search keywords I think. Searching for "Minify HTML" will find all the links he referenced. – Pekka Mar 11 '10 at 12:55
  • it helps when you've answered one of them yourself... (http://stackoverflow.com/questions/2359484/why-do-many-sites-minify-css-and-javascript-but-not-html/2359508#2359508) – Dominic Rodger Mar 11 '10 at 13:05

5 Answers

12

I think you are confusing source-code minification of HTML with GZIP compression. The latter is quite common (for example, using mod_gzip on Apache) and should be enough in most cases. It happens entirely between the server and the browser, so you can't see it in the source code.

Actual minification of the HTML is not really worth doing except for sites where a saved byte can mean tens of thousands of dollars in traffic savings (as for Google).
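
To make the distinction concrete, here is a minimal Java sketch of why gzip never shows up in "view source": the compressed bytes decode back to exactly the markup the server started with, whitespace included. The class name `GzipRoundTrip` and the sample markup are made up for illustration; a real server and browser negotiate this through HTTP headers rather than direct method calls.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of any server or browser API.
class GzipRoundTrip {

    // Roughly what the server does before sending: compress the page bytes.
    static byte[] gzip(String html) throws IOException {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(output)) {
            gzip.write(html.getBytes(StandardCharsets.UTF_8));
        }
        return output.toByteArray();
    }

    // Roughly what the browser does on arrival: decompress back to text.
    static String gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        String html = "<html>\n  <body>\n    <p>Hello</p>\n  </body>\n</html>\n";
        byte[] onTheWire = gzip(html);
        String inTheBrowser = gunzip(onTheWire);
        // Indentation and line breaks survive untouched, so this prints true.
        System.out.println(inTheBrowser.equals(html));
    }
}
```

Minification, by contrast, changes the markup itself, which is why it is visible in the source and gzip is not.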

Pekka
  • Minification also makes it harder to steal or reverse engineer. – Konrad Garus Mar 11 '10 at 13:47
  • 2
    @Konrad Don't confuse minification with obfuscation. Any decent IDE that can format source code will turn minified code back to cleanly formatted with no effort. – Gordon Apr 08 '10 at 10:42
  • 1
    @Konrad Your point is valid but not for HTML IMO. It's not possible to minify (or obfuscate) HTML in a way that can't be restored using a simple source code formatter as far as I can see. It's different with CSS and JS, though. – Pekka Apr 08 '10 at 10:43
  • There isn't really anything to gain by reverse engineering HTML code. – Lotus Notes Jul 02 '10 at 19:28
2

HTML minification apparently doesn't matter that much for Stack Overflow. I did a little test based on the HTML source of the front page.

Raw content length: 207454 bytes
Gzipped content length: 30915 bytes
Trimmed content length: 176354 bytes
Trimmed and gzipped content length: 29658 bytes

SO already uses GZIP compression, so trimming whitespace (actually, HTML minification, or "HTML compression" as you call it) would save "only" about 1.2KB of bandwidth per response (30915 - 29658 = 1257 bytes). For giants with over 1 million pageviews per day, HTML minification would already save over 1GB of bandwidth per day (actually, SO would save that much as well). Google serves billions of pageviews per day, so every byte of difference would save gigabytes per day.

FWIW, I used this simple quick'n'dirty Java application to test it:

package com.stackoverflow.q2424952;

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class Test {

    public static void main(String... args) throws IOException {
        InputStream input = new URL("http://stackoverflow.com").openStream();
        byte[] raw = raw(input);
        System.out.println("Raw content length: " + raw.length + " bytes");
        byte[] gzipped = gzip(new ByteArrayInputStream(raw));
        System.out.println("Gzipped content length: " + gzipped.length + " bytes");
        byte[] trimmed = trim(new ByteArrayInputStream(raw));
        System.out.println("Trimmed content length: " + trimmed.length + " bytes");
        byte[] trimmedAndGzipped = gzip(new ByteArrayInputStream(trimmed));
        System.out.println("Trimmed and gzipped content length: " + trimmedAndGzipped.length + " bytes");
    }

    // Reads the whole stream into a byte array, unchanged.
    public static byte[] raw(InputStream input) throws IOException {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        for (int data; (data = input.read()) != -1; output.write(data));
        input.close();
        output.close();
        return output.toByteArray();
    }

    // Gzips the stream; closing the GZIPOutputStream flushes the remaining data and writes the trailer.
    public static byte[] gzip(InputStream input) throws IOException {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(output);
        for (int data; (data = input.read()) != -1; gzip.write(data));
        input.close();
        gzip.close();
        return output.toByteArray();
    }

    // Strips leading and trailing whitespace from every line and drops the line breaks.
    // UTF-8 is assumed here so the byte count does not depend on the platform default charset.
    public static byte[] trim(InputStream input) throws IOException {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        BufferedReader reader = new BufferedReader(new InputStreamReader(input, StandardCharsets.UTF_8));
        for (String line; (line = reader.readLine()) != null;) output.write(line.trim().getBytes(StandardCharsets.UTF_8));
        reader.close();
        output.close();
        return output.toByteArray();
    }

}
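
The bandwidth estimate in this answer can be double-checked with a few lines of arithmetic. This is only a sketch: the class `SavingsEstimate` and its method names are made up for illustration, the byte counts are the four measured values listed above, and the figure of 1 million pageviews per day is the answer's own round number, not a measured one.

```java
// Hypothetical helper for checking the estimate; not part of the test application.
class SavingsEstimate {

    // Bytes saved per response by trimming before gzipping.
    static long savedPerResponse(long gzippedBytes, long trimmedAndGzippedBytes) {
        return gzippedBytes - trimmedAndGzippedBytes;
    }

    // Bytes saved per day for a given number of pageviews.
    static long savedPerDay(long savedPerResponse, long pageviewsPerDay) {
        return savedPerResponse * pageviewsPerDay;
    }

    public static void main(String[] args) {
        long perResponse = savedPerResponse(30915, 29658);   // 1257 bytes
        long perDay = savedPerDay(perResponse, 1_000_000);   // 1,257,000,000 bytes
        System.out.println(perResponse + " bytes saved per gzipped response");
        System.out.println(perDay + " bytes saved per day at 1 million pageviews");
        // For comparison, the saving from trimming when no gzip is applied at all.
        System.out.println((207454 - 176354) + " bytes saved per uncompressed response");
    }
}
```

The last line shows why the gain looks small: trimming removes about 31KB of raw whitespace, but gzip had already squeezed nearly all of that out.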
BalusC
1

Another good reason not to minify your code is for learning. I love the ability to go and look through people's source code to see how they solve problems, and likewise I keep my source in full form so others can look at mine. I still have my code compressed through gzip before it's sent to the browser, but when it arrives it will be uncompressed into full form and fully readable.

Timothy Armstrong
0

I think few people do. Too much work, too little gain, especially as the HTTP payload can be gzip-compressed these days.

TomTom
  • Minification and gzip compression aren't mutually exclusive. – Mike D. Mar 11 '10 at 13:23
  • Sure, but it means I don't really get too much benefit from minification, byte-wise. For many websites it simply will not be worth the effort. – TomTom Mar 11 '10 at 13:45
0

Gzip compression, which every modern web server and web browser supports, makes HTML compression (minification) useless or almost insignificant.

So very few use this.

Artyom
  • 1
    -1: [Citation Required]... Even IE6 supports gzipped JavaScript... Name me one browser which doesn't support gzipped JavaScript... (other than the browsers which don't support gzip at all: those don't send `Accept-Encoding: gzip` anyway, so they are easy to identify and deal with) – Andrew Moore Mar 11 '10 at 13:22
  • @Andrew, I removed this for you. I indeed do not have such a citation, even though JS minification is much more popular than HTML minification – Artyom Mar 11 '10 at 13:48
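
For readers wondering how a server can "identify and deal with" clients that do not accept gzip, as the comment above puts it, a simplified check of the `Accept-Encoding` request header might look like the sketch below. The class and method names are hypothetical, and the parsing is deliberately minimal: it only decides whether gzip is allowed at all and does not rank codings by preference.

```java
import java.util.Locale;

// Hypothetical helper; real servers and frameworks ship their own negotiation logic.
class AcceptEncoding {

    // Returns true if the header value allows a gzip-encoded response.
    // Handles "gzip", "x-gzip" and "*" with optional q-values; q=0 means "not acceptable".
    static boolean acceptsGzip(String headerValue) {
        if (headerValue == null) {
            return false;
        }
        boolean wildcard = false;
        for (String part : headerValue.split(",")) {
            String[] pieces = part.trim().split(";");
            String coding = pieces[0].trim().toLowerCase(Locale.ROOT);
            double q = 1.0; // default weight when no q parameter is given
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim().toLowerCase(Locale.ROOT);
                if (param.startsWith("q=")) {
                    try {
                        q = Double.parseDouble(param.substring(2).trim());
                    } catch (NumberFormatException e) {
                        q = 0.0; // treat a malformed weight as "not acceptable"
                    }
                }
            }
            if (coding.equals("gzip") || coding.equals("x-gzip")) {
                return q > 0.0; // an explicit gzip entry decides the outcome
            }
            if (coding.equals("*")) {
                wildcard = q > 0.0;
            }
        }
        return wildcard;
    }

    public static void main(String[] args) {
        System.out.println(acceptsGzip("gzip, deflate")); // true
        System.out.println(acceptsGzip("identity"));      // false
    }
}
```

A server using a check like this would fall back to sending the uncompressed response when it returns false.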