I'm using the following regular expression, found in this post: (?<=\s)\s+(?![^<>]*<\/pre>)

When I do this:

echo gzencode( trim( preg_replace('/(?<=\s)\s+(?![^<>]*<\/pre>)/', '', $html) ), 9);

The whitespace is removed throughout the HTML, even inside pre tags. I need this to compress the entire page while leaving the contents of pre tags untouched.

Sergio Flores
  • 3
    If you do it that way, [you're asking for a load of trouble](http://stackoverflow.com/a/1732454/1128047)! Use an HTML parser to handle this instead. But, while we're on the topic, why remove white space to begin with? HTML ignores it in the vast majority of cases. – Jonah Bishop Mar 14 '13 at 20:12
  • 7
    **Don't use regular expressions to parse HTML**. You cannot reliably parse HTML with regular expressions, and you will face sorrow and frustration down the road. As soon as the HTML changes from your expectations, your code will be broken. See http://htmlparsing.com/php for examples of how to properly parse HTML with PHP modules that have already been written, tested and debugged. – Andy Lester Mar 14 '13 at 20:14
  • 2
    The above will not replace spaces between `<pre>` and `</pre>` unless there are `<` or `>` present between those tags.
    – MikeM Mar 14 '13 at 20:17
  • 1
    Amen to Andy's comment: just use a tool and not a regex. The tools are designed to not mess up your HTML. See this link: http://nadeausoftware.com/articles/2007/03/don_t_use_html_white_space_removal_speed_web_site – user1914292 Mar 14 '13 at 22:20
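
MikeM's point about `<` and `>` can be checked with a short script. This is only a minimal sketch: the two sample strings (`$plain` and `$nested`) are made up for illustration and are not from the question. It suggests the pattern protects a `<pre>` block that contains only text, but not one that has another tag inside it.

```php
<?php
// Pattern from the question: trim each whitespace run down to its first
// character, unless only non-bracket characters separate it from </pre>.
$pattern = '/(?<=\s)\s+(?![^<>]*<\/pre>)/';

// Hypothetical samples, invented for this sketch.
$plain  = "<pre>a    b</pre>";         // <pre> containing text only
$nested = "<pre><b>a    b</b></pre>";  // <pre> containing another tag

$plainOut  = preg_replace($pattern, '', $plain);
$nestedOut = preg_replace($pattern, '', $nested);

var_dump($plainOut === $plain);   // bool(true): the run of spaces survives
var_dump($nestedOut === $nested); // bool(false): the </b> stops the lookahead
```

In the second case the lookahead reaches `</b>` before it can reach `</pre>`, so the run is treated as if it were outside the `<pre>` block.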

1 Answer

Try this:

echo gzencode( trim( preg_replace('/\s+(?![^<>]*<\/pre>)/', '', $html) ), 9);

=)
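
As far as I can tell, the pattern above works like this: `\s+` matches a whole run of whitespace, and the negative lookahead `(?![^<>]*<\/pre>)` rejects the match when only non-bracket characters lie between the run and a closing `</pre>`. Because the `(?<=\s)` lookbehind is gone, the entire run is deleted, including single spaces between words. The sketch below compares it with one possible alternative that splits the page on `<pre>` blocks first. The helper name `collapseOutsidePre` and the sample `$html` are hypothetical, and the split approach is a swapped-in technique, not something proposed in this thread.

```php
<?php
// Hypothetical helper: collapse whitespace runs to a single space
// everywhere except inside <pre>...</pre> blocks.
function collapseOutsidePre(string $html): string
{
    // The capture group makes preg_split() return the <pre> blocks as well;
    // they land at the odd indexes of the resulting array.
    $parts = preg_split('#(<pre\b[^>]*>.*?</pre>)#is', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Even indexes hold the markup outside <pre>.
            $parts[$i] = preg_replace('/\s+/', ' ', $part);
        }
    }

    return implode('', $parts);
}

// Made-up sample with a tag nested inside <pre>.
$html = "<p>Hello   world</p>\n\n<pre><b>a    b</b>\n  c</pre>\n<p>end</p>";

// Pattern from this answer: deletes whole runs, so "Hello   world"
// becomes "Helloworld" and the spaces inside <b> are lost too.
$answerOut = preg_replace('/\s+(?![^<>]*<\/pre>)/', '', $html);

// Split-based sketch: the <pre> block is copied through unchanged.
$splitOut = collapseOutsidePre($html);

echo $answerOut, "\n---\n", $splitOut, "\n";
```

The split version still relies on a regex to find the `<pre>` blocks, so it does not handle `<textarea>`, `<script>`, or malformed markup; an HTML parser, as the commenters suggest, would be the more robust choice.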

Arthur Ronconi
  • Maybe you could add some details as to how this works? This would make your answer more useful for others who may come across it in the future. – slm Apr 14 '13 at 05:13