
I am using the script below to get data from a website. Data is returned, but it is gzip-compressed or in some other encoded format. I tried gzdecode on it, but that does not work. Is there any way to get clean data from this request?

I tried each of the following:

                curl_setopt($ch, CURLOPT_ENCODING , 'deflate');
                curl_setopt($ch, CURLOPT_ENCODING , 'gzip');
                curl_setopt($ch, CURLOPT_ENCODING , 'br');

but none of them works. Below is the curl request:

            $ch = curl_init();
            curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
            curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
            curl_setopt($ch, CURLOPT_URL, 'https://www.example.com');
            curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 20);
            curl_setopt($ch, CURLOPT_TIMEOUT, 20);
            curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_ENCODING , 'deflate');
            $response = curl_exec($ch);

            $d = curl_getinfo( $ch );

curl_getinfo shows the following:

I can see that the site is using "br" encoding, i.e. Content-Encoding: br

Bilal Rabbi

3 Answers

br encoding is Brotli encoding. You can pass it in the Accept-Encoding header with curl_setopt($ch, CURLOPT_ENCODING , 'br'), but curl only decodes it if your libcurl was built with Brotli support (available since libcurl 7.57.0). Otherwise you will have to decode the output explicitly.

You can probably use this PHP extension: https://github.com/kjdev/php-ext-brotli
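If you do end up decoding the body yourself, a minimal sketch could look like the following. It assumes the zlib extension is loaded and, for br, that php-ext-brotli is installed and provides brotli_uncompress(). The helper name decodeBody() is hypothetical, not part of curl or PHP.

```php
<?php
// Sketch: decode a raw response body according to its Content-Encoding value.
// decodeBody() is a hypothetical helper name chosen for this example.
function decodeBody(string $body, string $contentEncoding): string
{
    switch (strtolower(trim($contentEncoding))) {
        case 'br':
            // brotli_uncompress() is provided by the php-ext-brotli extension.
            if (!function_exists('brotli_uncompress')) {
                throw new RuntimeException('Brotli body received, but the brotli extension is not loaded.');
            }
            $decoded = brotli_uncompress($body);
            break;
        case 'gzip':
        case 'deflate':
            // zlib_decode() accepts gzip, zlib and raw deflate framing.
            $decoded = zlib_decode($body);
            break;
        default:
            // Empty or 'identity': the body is not compressed.
            return $body;
    }
    if ($decoded === false) {
        throw new RuntimeException('Could not decode body with encoding: ' . $contentEncoding);
    }
    return $decoded;
}
```

The Content-Encoding value itself has to be read from the response headers, for example by collecting them with CURLOPT_HEADERFUNCTION.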

You can also try to use curl_setopt($ch, CURLOPT_ENCODING , 'identity'), and, if the server you are calling behaves properly, get the data uncompressed.

I guess you've already tried leaving out the Accept-Encoding header completely. Unfortunately, according to the specs, this does not prevent the server from encoding the output.

Walter Tross

In the header I allowed only gzip and deflate and removed br, and it worked for me. So instead of $header[] = 'Accept-Encoding: gzip, deflate, br'; I used $header[] = 'Accept-Encoding: gzip, deflate';

Thanks for the help, everyone.
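A related option is to remove the hand-written Accept-Encoding line from the header array altogether and let CURLOPT_ENCODING decide what is requested. A rough sketch, where withoutAcceptEncoding() is a hypothetical helper name and the sample header values are assumptions:

```php
<?php
// Sketch: drop any manually added Accept-Encoding line so that
// CURLOPT_ENCODING alone controls which encodings are requested.
// withoutAcceptEncoding() is a hypothetical helper name.
function withoutAcceptEncoding(array $headers): array
{
    return array_values(array_filter($headers, function (string $line): bool {
        // Header names are case-insensitive, so compare with stripos().
        return stripos(ltrim($line), 'accept-encoding:') !== 0;
    }));
}

// Sample header list (assumed values, for illustration only).
$header = [
    'Accept: text/html',
    'Accept-Encoding: gzip, deflate, br',
];
$header = withoutAcceptEncoding($header);
```

The filtered array can then be passed to CURLOPT_HTTPHEADER, with curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate') requesting the encodings curl should decode.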

Bilal Rabbi
  • you should never post an incomplete question. If we knew that you had filled in `$header` like that (which makes no sense, since you normally use the `CURLOPT_ENCODING` for that - which you used too) we would not have lost tens of minutes of our time trying to figure out how you could fix your problem. – Walter Tross Aug 27 '18 at 06:45
  • Sorry sir. will take care next time @WalterTross – Bilal Rabbi Aug 27 '18 at 13:18
            curl_setopt($ch, CURLOPT_ENCODING , 'deflate');
            curl_setopt($ch, CURLOPT_ENCODING , 'gzip');
            curl_setopt($ch, CURLOPT_ENCODING , 'br');

Subsequent calls overwrite the previous value; they do not add to it. If you want to support deflate, gzip, and br, separate them with commas, e.g.

            curl_setopt($ch, CURLOPT_ENCODING , 'gzip,deflate,br');

However, br is a recent addition to curl: Brotli support was first added in version 7.57.0, released on November 29, 2017, so you might want to add

    // Older PHP builds may not expose this constant, so define it if missing.
    // https://github.com/curl/curl/blob/f762fec323f36fd7da7ad6eddfbbae940ec3229e/include/curl/curl.h#L2720
    if (!defined("CURL_VERSION_BROTLI")) {
        define("CURL_VERSION_BROTLI", (1 << 23));
    }
    // Abort early if the linked libcurl was built without Brotli support.
    if (!(curl_version()["features"] & CURL_VERSION_BROTLI)) {
        throw new \RuntimeException("this script requires brotli support added to libcurl (added in libcurl version 7.57.0, released November 29 2017), please update your libcurl installation.");
    }

to ensure that br is actually supported by your PHP's libcurl, if you require it.
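If br is optional rather than required, the same feature bit can be used to build the encoding list instead of throwing. A minimal sketch, assuming libcurl was built with zlib so that gzip and deflate are available. The function name acceptableEncodings() is hypothetical.

```php
<?php
// Sketch: request br only when the linked libcurl reports Brotli support.
// acceptableEncodings() is a hypothetical helper name.
function acceptableEncodings(int $features): string
{
    // Bit 23 is CURL_VERSION_BROTLI in curl.h; older PHP builds may not define the constant.
    $brotliBit = defined('CURL_VERSION_BROTLI') ? CURL_VERSION_BROTLI : (1 << 23);

    // Assumption: gzip and deflate are available (libcurl built with zlib).
    $encodings = ['gzip', 'deflate'];
    if ($features & $brotliBit) {
        $encodings[] = 'br';
    }
    return implode(',', $encodings);
}

// Intended use (needs the curl extension):
// curl_setopt($ch, CURLOPT_ENCODING, acceptableEncodings(curl_version()['features']));
```

Alternatively, passing an empty string to CURLOPT_ENCODING makes curl send an Accept-Encoding header listing every encoding it supports.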

hanshenrik