5

The question is pretty self-explanatory. Why shouldn't I strip whitespace from my HTML? It seems to me that most of the whitespace is used purely for formatting in the text editor and has no impact on the final page.

What's more, when these random whitespace nodes do have an impact on the final page, it is usually an impact I do not want, such as a mysterious one-character gap (after whitespace collapse) between inline-blocks.

I can strip all these whitespace text nodes pretty easily. Is there any reason I shouldn't?

edit:

It's mainly about the strange behaviour that whitespace causes, rather than about performance. One example is me wanting to put images side by side using inline-block instead of float, while preventing wrapping to the next line and allowing them to spill out of the parent.

The whitespace causes these mysterious gaps, which can be removed by basically minifying the HTML source code to remove the whitespace between inline-blocks manually (and completely messing up your source code formatting in the process).
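As a rough sketch of what "removing the whitespace between inline-blocks" means, the snippet below drops whitespace-only runs between one tag and the next while leaving the editable source formatted. The function name and the gallery markup are made up for illustration, and the regular expression is deliberately naive: it would also touch whitespace inside `<pre>` blocks or inline scripts, so treat it as a demonstration rather than a safe minifier.

```python
import re


def strip_inter_tag_whitespace(html):
    """Remove whitespace-only runs that sit between a '>' and the next '<'."""
    return re.sub(r">\s+<", "><", html)


# Hypothetical gallery markup, written one <img> per line for readability.
gallery = """<div class="row">
    <img src="a.png">
    <img src="b.png">
    <img src="c.png">
</div>"""

# The images end up directly adjacent, so no collapsed space separates them.
print(strip_inter_tag_whitespace(gallery))
```

Running something like this as a build step keeps the readable source for editing and only changes what is served.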

Li Haoyi

4 Answers

7

There's no reason not to, really. It can be done very easily with something like htmlcompressor.

However, assuming you're delivering all your html, css, and js files via gzip, then the amount of real-world bandwidth savings you'll see from stripping whitespace will be very small. The question then becomes, is it worth the trouble?

UPDATE:

Perhaps this will affect your decision. I performed a simple minification on a page of my website just to see what kind of difference it would make. Here are the results:

BEFORE minification

  • 22232 bytes (uncompressed)
  • 5276 bytes (gzip)

AFTER minification

  • 19207 bytes (uncompressed)
  • 5146 bytes (gzip) - 130 bytes saved

The uncompressed file is about 3 KB smaller after minification. But that's not really what matters. The gzip compressed file is what is sent over the wire. And you can clearly see that gzip does a pretty good job even with the non-minified HTML.
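A comparison along these lines can be reproduced with a short script. This is only a sketch: the page below is synthetic and heavily indented (not the page measured above), and the "minification" is a naive regular expression, so the numbers it prints will differ from the ones reported here.

```python
import gzip
import re


def sizes(html):
    """Return (uncompressed bytes, gzip-compressed bytes) for a page."""
    raw = html.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Synthetic, deeply indented page, invented purely for illustration.
rows = "\n".join(
    f'            <li>\n                <a href="/item/{i}">Item {i}</a>\n            </li>'
    for i in range(200)
)
page = f"<html>\n    <body>\n        <ul>\n{rows}\n        </ul>\n    </body>\n</html>\n"

# Naive minification: drop whitespace-only runs between tags.
minified = re.sub(r">\s+<", "><", page)

raw_before, gz_before = sizes(page)
raw_after, gz_after = sizes(minified)

print(f"before: {raw_before} bytes uncompressed, {gz_before} bytes gzip")
print(f"after:  {raw_after} bytes uncompressed, {gz_after} bytes gzip")
print(f"saved:  {raw_before - raw_after} bytes uncompressed, {gz_before - gz_after} bytes gzip")
```

Repeated indentation is exactly the kind of redundancy gzip handles well, which is why the compressed saving tends to be much smaller than the uncompressed one.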

I see the benefit of minifying js libraries, or things that aren't changing constantly. But I don't think it's worth the trouble doing this to your HTML for a measly 130 bytes.

Steve Wortham
  • As well as the bandwidth saving (which, as you point out, is minuscule), in my experience you sometimes get better parsing performance from minified JS compared with the unminified version. – Chris Fewtrell Oct 01 '12 at 14:59
4

Let me give one reason why you shouldn't minify html:

How HTML eventually gets rendered is strongly tied to the CSS applied to it, but minifiers usually work without taking the CSS into account. All the minifiers available at the time of writing remove spaces in HTML based on certain assumptions about your coding and CSS styling. If you don't code the way they expect, the minified page will render differently in the browser from the unminified one.

For example, some minifiers assume the space between "block elements" (such as <div/>, <p/>) can be removed. This is usually true, because spaces between them have no effect on the rendered result. But what if the CSS sets "display: inline" or "inline-block" on elements whose default display property is block?

Will the HTML snippet below still render as it should if you remove the spaces between the <div/>s?

<div style="display: inline">will</div> <div style="display: inline">this</div> <div style="display: inline">still</div> <div style="display: inline">work?</div>
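To make the problem concrete, here is a small sketch that approximates what a reader would see by collecting the text content of the snippet before and after a naive "remove whitespace between tags" pass. The `TextCollector` class and `visible_text` helper are hypothetical names, and text extraction is only a stand-in for real browser rendering, but with `display: inline` the spaces between the elements do act as word separators.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the text nodes of a document in order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(html):
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


snippet = (
    '<div style="display: inline">will</div> '
    '<div style="display: inline">this</div> '
    '<div style="display: inline">still</div> '
    '<div style="display: inline">work?</div>'
)

# What a minifier does if it assumes <div> is always a block element.
stripped = re.sub(r">\s+<", "><", snippet)

print(repr(visible_text(snippet)))   # 'will this still work?'
print(repr(visible_text(stripped)))  # 'willthisstillwork?'
```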

You may argue that we can keep at least one space and remove the remaining consecutive spaces, and that this still saves a lot of bytes. Then how about the <pre> tag and white-space: pre?
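The "keep one space" idea can be sketched as below, with `<pre>` blocks skipped. The function name is made up and the approach is simplistic (it does not handle nested or malformed markup, and it also collapses whitespace inside attribute values), but it shows the remaining blind spot: the code only sees markup, so it cannot know that a stylesheet or inline style has set `white-space: pre` on some other element.

```python
import re

# Matches a whole <pre>...</pre> block; the capturing group makes re.split keep it.
PRE_BLOCK = re.compile(r"(<pre\b.*?</pre>)", re.DOTALL | re.IGNORECASE)


def collapse_outside_pre(html):
    """Collapse each whitespace run to one space, leaving <pre> blocks untouched."""
    parts = PRE_BLOCK.split(html)
    # After splitting, odd indexes hold the captured <pre> blocks.
    return "".join(
        part if i % 2 else re.sub(r"\s+", " ", part)
        for i, part in enumerate(parts)
    )


sample = "<p>a   b</p>\n\n<pre>x\n   y</pre>\n<p>c</p>"
print(repr(collapse_outside_pre(sample)))

# Blind spot: this element is styled to preserve whitespace, but the markup-only
# pass cannot tell, so the line break and indentation are lost.
styled = '<div style="white-space: pre">x\n   y</div>'
print(repr(collapse_outside_pre(styled)))
```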

Try copying the HTML snippet from the URL below and pasting it into your minifier, then see whether the result renders the same as before minification:

https://jsfiddle.net/normanzb/58rpazL2/

Norman Xu
2

The only downside of stripping whitespace from production pages is readability and maintainability for the person who follows you in editing those pages. But if you maintain a properly whitespaced, readable version for editing, and then minify it after editing to produce the production pages, it doesn't really cause significant problems.

I'm not sure how effective, or useful, the technique will be, but there's nothing to stop you trying it.

David Thomas
  • +1. But: `I'm not sure how effective, or useful, the technique will be`? If he can maintain 2 copies like that he should definitely do it. Those whitespace characters are just increasing the total download size of the page! – Joseph Silber Aug 18 '11 at 23:13
  • They are, but not significantly; and that's not the point of the question (I think...). He touches on the problems presented by whitespace ('mysterious one character gap[s]'), which would be reduced by whitespace removal, but not necessarily prevented by gzipping/compressed pages (depending on the settings of the compressor tool). – David Thomas Aug 18 '11 at 23:15
  • On the readability front, I was considering that, the way I generate the HTML, the raw HTML indentation is already completely screwed up, but Inspect Element/Firebug/IE9 Dev Tools will nicely format and indent it anyway. I spend basically no time looking at the raw HTML; it's all spent viewing it through one of the above tools. – Li Haoyi Aug 19 '11 at 00:05
1

Short answer: no reason whatsoever

The only real purpose white space serves is to make the code more human-readable. You can, over time, save a lot of bandwidth by stripping all the unnecessary white space out of your documents, and it should be considered good practice for production code. If you're compressing your content the saving will be less, but even 1% of 1GB is 10MB... If you're doing 100GB in a month on a busy web site, cutting out 1% of the data might be the difference between two pricing tiers of hosting...
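The arithmetic behind those figures, as a tiny sketch. The 1% saving is the illustrative number used above rather than a measurement, decimal units (1 GB = 1000 MB) are assumed, and the function name is hypothetical.

```python
MB_PER_GB = 1000  # decimal units assumed


def monthly_saving_mb(traffic_gb, saving_percent):
    """Bandwidth saved per month, in MB, for a given percentage reduction."""
    return traffic_gb * MB_PER_GB * saving_percent / 100


print(monthly_saving_mb(1, 1))    # 10.0 MB saved on 1 GB of traffic
print(monthly_saving_mb(100, 1))  # 1000.0 MB, i.e. 1 GB saved on 100 GB of traffic
```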

As you say, some browsers (usually IE, grrrr....) will occasionally interpret the white space when they render the page, but usually when this happens it's in a way you'd rather it hadn't...

DaveRandom
  • IE9 was exactly what made me think of this! I'm not entirely familiar with the spec; are these whitespace nodes meant to be rendered or not? In Chrome/Safari/Firefox they weren't, but IE9 gave me 1 character gaps GRARGH – Li Haoyi Aug 18 '11 at 23:59
  • White space should never be rendered. 1 white space character = 1 literal space, but any additional characters should be truncated. IE6 had an annoying habit of behaving as if white space after a closing `` element was a `
    ` - that caught me out a few times!
    – DaveRandom Aug 19 '11 at 00:10