Remove whitespace from an entire HTML page, except inside pre tags, with regular expressions in PHP

Question

I'm using the following regular expression, found in this post: (?<=\s)\s+(?![^<>]*</pre>)

When I do this:

echo gzencode( trim( preg_replace('/(?<=\s)\s+(?![^<>]*<\/pre>)/', '', $html) ), 9);

The whitespace is removed throughout the HTML, even inside pre tags. I need this to compress the entire page.

If you do it that way, [you're asking for a load of trouble](http://stackoverflow.com/a/1732454/1128047)! Use an HTML parser to handle this instead. But, while we're on the topic, why remove white space to begin with? HTML ignores it in the vast majority of cases. — Jonah Bishop, Mar 14 '13 at 20:12
**Don't use regular expressions to parse HTML**. You cannot reliably parse HTML with regular expressions, and you will face sorrow and frustration down the road. As soon as the HTML changes from your expectations, your code will be broken. See http://htmlparsing.com/php for examples of how to properly parse HTML with PHP modules that have already been written, tested and debugged. — Andy Lester, Mar 14 '13 at 20:14
The above will not replace spaces between `<pre>` and `</pre>` unless there are `<` or `>` present between those tags. — MikeM, Mar 14 '13 at 20:17
Amen to Andy's comment: just use a tool and not a regex. The tools are designed not to mess up your HTML. See this link: http://nadeausoftware.com/articles/2007/03/don_t_use_html_white_space_removal_speed_web_site — user1914292, Mar 14 '13 at 22:20
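
The parser-based approach suggested in the comments above might look roughly like the sketch below. It assumes PHP's bundled DOM extension is available. The function name `collapse_whitespace_outside_pre` is hypothetical, and skipping `textarea`, `script` and `style` as well as `pre` is an added assumption, not something from the question.

```php
<?php
// Hypothetical helper: collapse whitespace runs in text nodes that are
// not inside <pre> (or <textarea>, <script>, <style>).
function collapse_whitespace_outside_pre(string $html): string
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them;
    // HTML5 or slightly malformed markup would otherwise be noisy.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Text nodes that have none of the whitespace-sensitive ancestors.
    $query = '//text()[not(ancestor::pre or ancestor::textarea'
           . ' or ancestor::script or ancestor::style)]';

    foreach ($xpath->query($query) as $textNode) {
        // Replace each run of whitespace with one space.
        $textNode->data = preg_replace('/\s+/', ' ', $textNode->data);
    }

    // Note: loadHTML() wraps fragments in doctype/html/body elements.
    return $doc->saveHTML();
}

echo collapse_whitespace_outside_pre("<p>a   b\n\n c</p><pre>x   y\n  z</pre>");
```

Because `loadHTML()` normalizes the markup and adds wrapper elements, the output is not byte-for-byte the input minus whitespace, and the encoding of non-ASCII input may need extra care. Whether the saving is worth it after gzip is a separate question, as the comments point out.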

Arthur Ronconi · Accepted Answer · 2013-05-07T15:39:19.770

1

Try this:

echo gzencode( trim( preg_replace('/\s+(?![^<>]*<\/pre>)/', '', $html) ), 9);

=)
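
As a rough illustration of how the pattern above works, here is an annotated sketch. The wrapper function `strip_whitespace_outside_pre` and its `$replacement` parameter are hypothetical names added for the example. Two caveats follow from the pattern itself: an empty replacement also deletes the spaces between words and attributes, so a single space may be the safer replacement, and whitespace inside `pre` is only protected when the next tag after it is `</pre>`, so whitespace followed by a nested tag is still stripped.

```php
<?php
// Hypothetical wrapper around the pattern from the answer above.
function strip_whitespace_outside_pre(string $html, string $replacement = ''): string
{
    // \s+                one or more whitespace characters
    // (?![^<>]*<\/pre>)  negative lookahead: fail if the whitespace is followed
    //                    by any run of non-bracket characters and then </pre>,
    //                    i.e. if the next tag after it is a closing pre tag
    $result = preg_replace('/\s+(?![^<>]*<\/pre>)/', $replacement, $html);

    // preg_replace() returns null on error; fall back to the input then.
    return $result ?? $html;
}

$html = "<p>one   two</p>\n<pre>a   b\n  c</pre>";

// Empty replacement, as in the answer: words outside pre get glued together.
echo strip_whitespace_outside_pre($html), "\n";

// Single-space replacement: runs outside pre collapse to one space.
echo strip_whitespace_outside_pre($html, ' '), "\n";

// Nested tag inside pre: the whitespace before <b> is not protected.
echo strip_whitespace_outside_pre("<pre>a  <b>x</b> y</pre>"), "\n";
```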

edited May 07 '13 at 15:39

answered Apr 12 '13 at 01:20


Maybe you could add some details as to how this works? This would make your answer more useful for others who may come across it in the future. – slm Apr 14 '13 at 05:13
