
I'm retrieving a gzipped web page via curl, but when I output the retrieved content to the browser I just get the raw gzipped data. How can I decode the data in PHP?

One method I found was to write the content to a tmp file and then ...

$f = gzopen($filename,"r");
$content = gzread($f,250000);
gzclose($f);

.... but man, there's got to be a better way.

Edit: This isn't a file, but a gzipped html page returned by a web server.

Ian
  • Is the file itself gzipped, or is the server gzipping it for transfer? – Artelius Nov 22 '08 at 01:05
  • Instead of decoding the gzip data, could you just send the correct headers so that the browser recognizes it properly? Or, if you don't want it gzipped in the first place, tell cURL not to ask for gzipped data by setting CURLOPT_ENCODING to "identity". – nobody Nov 22 '08 at 01:19
  • There is a PHP function called **gzdecode** that applies to strings, not files. – pgr Oct 21 '22 at 15:59

2 Answers


The following command enables cURL's "auto encoding" mode, where it will announce to the server which encoding methods it supports (via the Accept-Encoding header), and then automatically decompress the response for you:

// Allow cURL to use gzip compression, or any other supported encoding
// A blank string activates 'auto' mode
curl_setopt($ch, CURLOPT_ENCODING , '');

If you specifically want to force the header to be Accept-Encoding: gzip you can use this command instead:

// Force the request header to be Accept-Encoding: gzip; cURL decompresses the response if it can
curl_setopt($ch, CURLOPT_ENCODING , 'gzip');

Read more in the PHP documentation: curl_setopt.

Thanks to commenters for helping improve this answer.

Simon East
Jonas Lejon
  • Just to note that this option sets the `Accept-Encoding: gzip` header on the request *and* uncompresses the response if it is compressed (it may not be), so it is indeed all you need to do. – Synchro Jun 04 '13 at 08:07
  • Perfect solution for CURL. – The Onin Oct 20 '16 at 19:00
  • Setting it to `'gzip'` will *always* send `Accept-Encoding: gzip`, even when your PHP version doesn't support decoding gzip (you'll get the compressed data then). If you set it to `''` (empty string), curl will automatically announce and decode all encodings that it supports. – AndreKR Feb 01 '17 at 06:46
  • @AndreKR - that's brilliant. Just read this in the docs *"If an empty string, "", is set, a header containing all supported encoding types is sent."* – But those new buttons though.. Oct 05 '21 at 19:05
  • I updated your great answer to be a bit more thorough, based on the comments above. Hope that's OK! – Simon East Jan 15 '23 at 06:28

Versatile GUNZIP function:

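   function gunzip($zipped) {
      $offset = 0;
      // Skip the two gzip magic bytes (0x1f 0x8b) if present
      if (substr($zipped,0,2) == "\x1f\x8b")
         $offset = 2;
      // 0x08 is the compression method byte for DEFLATE
      if (substr($zipped,$offset,1) == "\x08")  {
         # file_put_contents("tmp.gz", substr($zipped, $offset - 2));
         // Skip the rest of the fixed 10-byte header; this assumes no optional
         // header fields (FLG byte is 0), such as an embedded file name
         return gzinflate(substr($zipped, $offset + 8));
      }
      return "Unknown Format";
   }  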
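
For comparison, here is a minimal sketch of the same idea using PHP's built-in gzdecode (PHP 5.4+, zlib extension), which a commenter on the question mentions and which parses the gzip header itself rather than assuming a fixed length. The helper name decodeGzipBody is hypothetical, and the compressed body is simulated with gzencode instead of coming from a real cURL request:

```php
<?php
// Hypothetical helper: decode a response body only if it looks like gzip.
function decodeGzipBody(string $body): string
{
    // Gzip streams start with the magic bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) !== 0) {
        return $body; // Not gzip: return the body unchanged.
    }

    // gzdecode() returns false (and emits a warning) on corrupt input.
    $decoded = @gzdecode($body);
    if ($decoded === false) {
        throw new RuntimeException('Body looked like gzip but could not be decoded');
    }
    return $decoded;
}

// Simulate a compressed response body instead of making a real request.
$html = '<html><body>Hello</body></html>';
$compressed = gzencode($html);

echo decodeGzipBody($compressed), "\n";   // decoded HTML
echo decodeGzipBody('plain text'), "\n";  // passed through untouched
```

Unlike gunzip above, this passes non-gzip input through unchanged instead of returning an error string, which is a design choice of the sketch rather than something the original answer does.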

Example of integrating function with CURL:

      $headers_enabled = 1;
      curl_setopt($c, CURLOPT_HEADER,  $headers_enabled);
      $ret = curl_exec($c);

      if ($headers_enabled) {
         # file_put_contents("preungzip.html", $ret);

         $sections = explode("\x0d\x0a\x0d\x0a", $ret, 2);
         while (!strncmp($sections[1], 'HTTP/', 5)) {
            $sections = explode("\x0d\x0a\x0d\x0a", $sections[1], 2);
         }
         $headers = $sections[0];
         $data = $sections[1];

         if (preg_match('/^Content-Encoding: gzip/mi', $headers)) {
            printf("gzip header found\n");
            return gunzip($data);
         }
      }

      return $ret;