I am requesting an API at the link here.

As you can see, the response is downloaded as a .gz file. In fact, it is an XML file. My question is: how do I parse that GZip-compressed response in a PHP script and echo it as XML? Thank you!

header('content-type:text/html;charset=utf-8');
// Guard against clients that send no Accept-Encoding header at all.
if (isset($_SERVER["HTTP_ACCEPT_ENCODING"]) && substr_count($_SERVER["HTTP_ACCEPT_ENCODING"], "gzip")) {
    ob_start("ob_gzhandler");
} else {
    ob_start();
}
$appkey = 'kwaninmacau';
$url    = 'http://services.eoddsmaker.net/demo/feeds/V1.0/markets.ashx';
$params = 'l=1&bid=43&sid=50&cid=58&lid=10&u=' . $appkey . '&p=' . $appkey;
echo datafeedurl($url, $params, 0);
function datafeedurl($url, $params = false, $ispost = 0) {
    // ..
    curl_setopt($ch, CURLOPT_ENCODING, "gzip");
}
Jacky Kwan
  • Use curl and check this link: http://stackoverflow.com/questions/310650/decode-gzipped-web-page-retrieved-via-curl-in-php – Satender K Nov 27 '15 at 05:34
  • No response. I had already added that to the curl function options. – Jacky Kwan Nov 27 '15 at 05:43
  • Warning, this API is broken. It says Content-Type: application/x-gzip; charset=UTF-8, but in fact it gives you an XML file with the filename markets_20151127T065210.gz – hanshenrik Nov 27 '15 at 06:53
  • Yes, you are right. In fact it is an .xml file whose suffix has been changed to .gz. That is the key issue. How do I handle that? – Jacky Kwan Nov 27 '15 at 06:57

2 Answers

Quote: "As you see, the response is .gz format". No, it is not. The server SAYS Content-Type: application/x-gzip, but this is wrong! It is an XML file with the name "markets_20151127T065210.gz".
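
If you would rather not trust the header at all, one option is to look at the first bytes of the body itself. The sketch below is only an illustration: maybe_gunzip() is a made-up helper name, and it assumes the zlib extension is loaded (gzdecode() needs PHP 5.4 or later). It decompresses the payload only when it really starts with the gzip magic bytes, and otherwise returns it untouched.

```php
<?php
// Hypothetical helper: decide from the payload itself whether it is gzip,
// ignoring the (unreliable) Content-Type header and file name.
function maybe_gunzip($body)
{
    // Every gzip stream starts with the two magic bytes 0x1f 0x8b (RFC 1952).
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = @gzdecode($body); // @ hides the warning; failure is checked below
        if ($decoded === false) {
            throw new RuntimeException('payload has a gzip header but could not be decoded');
        }
        return $decoded;
    }
    // No gzip header: assume the body is already plain text (for example XML).
    return $body;
}
```

With this, `$xml = maybe_gunzip($responseBody);` should give the same XML string whether the server really compresses the feed or only names it .gz.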

Quote: "how to parse that response compressed using GZip in php script and echo xml format?" Your XML is in $xml by line 4 here:

<?php
$ch=hhb_curl_init();
$xml=hhb_curl_exec2($ch,'http://services.eoddsmaker.net/demo/feeds/V1.0/markets.ashx?l=1&bid=43&sid=50&cid=58&lid=10&u=kwaninmacau&p=kwaninmacau',$headers,$cookies,$requeststring);
var_dump('headers:',$headers,'cookies:',$cookies,'requeststring:',$requeststring,'xml:',$xml);

function hhb_curl_init($custom_options_array = array())
{
    if (empty($custom_options_array)) {
        $custom_options_array = array();
        //i feel kinda bad about this.. argv[1] of curl_init wants a string(url), or NULL
        //at least i want to allow NULL aswell :/
    }
    if (!is_array($custom_options_array)) {
        throw new InvalidArgumentException('$custom_options_array must be an array!');
    }
    ;
    $options_array = array(
        CURLOPT_AUTOREFERER => true,
        CURLOPT_BINARYTRANSFER => true,
        CURLOPT_COOKIESESSION => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_FORBID_REUSE => false,
        CURLOPT_HTTPGET => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_SSL_VERIFYPEER => false,
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT => 11,
        CURLOPT_ENCODING => ""
        //CURLOPT_REFERER=>'example.org',
        //CURLOPT_USERAGENT=>'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:36.0) Gecko/20100101 Firefox/36.0'
    );
    if (!array_key_exists(CURLOPT_COOKIEFILE, $custom_options_array)) {
        //do this only conditionally because tmpfile() call..
        static $curl_cookiefiles_arr = array(); //workaround for https://bugs.php.net/bug.php?id=66014
        $curl_cookiefiles_arr[]            = $options_array[CURLOPT_COOKIEFILE] = tmpfile();
        $options_array[CURLOPT_COOKIEFILE] = stream_get_meta_data($options_array[CURLOPT_COOKIEFILE]);
        $options_array[CURLOPT_COOKIEFILE] = $options_array[CURLOPT_COOKIEFILE]['uri'];

    }
    //we can't use array_merge() because of how it handles integer-keys, it would/could cause corruption
    foreach ($custom_options_array as $key => $val) {
        $options_array[$key] = $val;
    }
    unset($key, $val, $custom_options_array);
    $curl = curl_init();
    if($curl===false){
        throw new RuntimeException('could not create a curl handle! curl_init() returned false');
    }
    if(false===curl_setopt_array($curl, $options_array)){
        $errno=curl_errno($curl);
        $error=curl_error($curl);
        throw new RuntimeException('could not set options on curl! curl_setopt_array returned false. curl_errno :'.$errno.'. curl_error: '.$error);
    }
    return $curl;
}
function hhb_curl_exec($ch, $url)
{
    static $hhb_curl_domainCache = "";//warning, this will not work properly with 2 different curl's visiting 2 different sites. 
    //should probably use SplObjectStorage here, so each curl can have its own cache..
    //$hhb_curl_domainCache=&$this->hhb_curl_domainCache;
    //$ch=&$this->curlh;
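    //PHP >= 8.0 returns a CurlHandle object from curl_init(); older versions return a resource
    if (!($ch instanceof CurlHandle) && !(is_resource($ch) && get_resource_type($ch) === 'curl')) {
        throw new InvalidArgumentException('$ch must be a curl handle!');
    }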
    if (!is_string($url)) {
        throw new InvalidArgumentException('$url must be a string!');
    }

    $tmpvar = "";
    if (parse_url($url, PHP_URL_HOST) === null) {
        if (substr($url, 0, 1) !== '/') {
            $url = $hhb_curl_domainCache . '/' . $url;
        } else {
            $url = $hhb_curl_domainCache . $url;
        }
    }
    ;

    if(false===curl_setopt($ch, CURLOPT_URL, $url)){
        $errno=curl_errno($ch);
        $error=curl_error($ch);
        throw new RuntimeException('could not set CURLOPT_URL on curl! curl_setopt returned false. curl_errno :'.$errno.'. curl_error: '.$error.'. url: '.var_export($url,true));
    }
    $html = curl_exec($ch);
    if (curl_errno($ch)) {
        throw new Exception('Curl error (curl_errno=' . curl_errno($ch) . ') on url ' . var_export($url, true) . ': ' . curl_error($ch));
        // echo 'Curl error: ' . curl_error($ch);
    }
    if ($html === '' && 204 != ($tmpvar = curl_getinfo($ch, CURLINFO_HTTP_CODE)) /*204 is "No Content": success, but no output*/ ) {
        throw new Exception('Curl returned nothing for ' . var_export($url, true) . ' but HTTP_RESPONSE_CODE was ' . var_export($tmpvar, true));
    }
    ;
    //remember that curl (usually) auto-follows the "Location: " http redirects..
    $hhb_curl_domainCache = parse_url(curl_getinfo($ch, CURLINFO_EFFECTIVE_URL), PHP_URL_HOST);
    return $html;
}
function hhb_curl_exec2($ch, $url, &$returnHeaders = array(), &$returnCookies = array(), &$verboseDebugInfo = "")
{
    $returnHeaders    = array();
    $returnCookies    = array();
    $verboseDebugInfo = "";
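    //PHP >= 8.0 returns a CurlHandle object from curl_init(); older versions return a resource
    if (!($ch instanceof CurlHandle) && !(is_resource($ch) && get_resource_type($ch) === 'curl')) {
        throw new InvalidArgumentException('$ch must be a curl handle!');
    }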
    if (!is_string($url)) {
        throw new InvalidArgumentException('$url must be a string!');
    }
    $verbosefileh = tmpfile();
    if($verbosefileh===false){
        throw new RuntimeException('can not create a tmpfile for curl\'s stderr. tmpfile returned false');
    }
    $verbosefile  = stream_get_meta_data($verbosefileh);
    $verbosefile  = $verbosefile['uri'];
    curl_setopt($ch, CURLOPT_VERBOSE, 1);
    curl_setopt($ch, CURLOPT_STDERR, $verbosefileh);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    $html             = hhb_curl_exec($ch, $url);
    $verboseDebugInfo = file_get_contents($verbosefile);
    curl_setopt($ch, CURLOPT_STDERR, NULL);
    fclose($verbosefileh);
    unset($verbosefile, $verbosefileh);
    $headers       = array();
    $crlf          = "\x0d\x0a";
    $thepos        = strpos($html, $crlf . $crlf, 0);
    $headersString = substr($html, 0, $thepos);
    $headerArr     = explode($crlf, $headersString);
    $returnHeaders = $headerArr;
    unset($headersString, $headerArr);
    $htmlBody = substr($html, $thepos + 4); //should work on utf8/ascii headers... utf32? not so sure..
    unset($html);
    //I REALLY HOPE THERE EXIST A BETTER WAY TO GET COOKIES.. good grief this looks ugly..
    //at least it's tested and seems to work perfectly...
    $grabCookieName = function($str,&$len)
    {
        $len=0;
        $ret = "";
        $i   = 0;
        for ($i = 0; $i < strlen($str); ++$i) {
            ++$len;
            if ($str[$i] === ' ') {
                continue;
            }
            if ($str[$i] === '=') {
                --$len;
                break;
            }
            $ret .= $str[$i];
        }
        return urldecode($ret);
    };
    foreach ($returnHeaders as $header) {
        //Set-Cookie: crlfcoookielol=crlf+is%0D%0A+and+newline+is+%0D%0A+and+semicolon+is%3B+and+not+sure+what+else
        /*Set-Cookie:ci_spill=a%3A4%3A%7Bs%3A10%3A%22session_id%22%3Bs%3A32%3A%22305d3d67b8016ca9661c3b032d4319df%22%3Bs%3A10%3A%22ip_address%22%3Bs%3A14%3A%2285.164.158.128%22%3Bs%3A10%3A%22user_agent%22%3Bs%3A109%3A%22Mozilla%2F5.0+%28Windows+NT+6.1%3B+WOW64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F43.0.2357.132+Safari%2F537.36%22%3Bs%3A13%3A%22last_activity%22%3Bi%3A1436874639%3B%7Dcab1dd09f4eca466660e8a767856d013; expires=Tue, 14-Jul-2015 13:50:39 GMT; path=/
        Set-Cookie: sessionToken=abc123; Expires=Wed, 09 Jun 2021 10:18:14 GMT;
        //Cookie names cannot contain any of the following '=,; \t\r\n\013\014'
        //
        */
        if (stripos($header, "Set-Cookie:") !== 0) {
            continue;
            /**/
        }
        $header = trim(substr($header, strlen("Set-Cookie:")));
        $len=0;
        while (strlen($header) > 0) {
            $cookiename                 = $grabCookieName($header,$len);
            $returnCookies[$cookiename] = '';
            $header                     = substr($header, $len + 1); //also remove the = 
            if (strlen($header) < 1) {
                break;
            }
            ;
            $thepos = strpos($header, ';');
            if ($thepos === false) { //last cookie in this Set-Cookie.
                $returnCookies[$cookiename] = urldecode($header);
                break;
            }
            $returnCookies[$cookiename] = urldecode(substr($header, 0, $thepos));
            $header                     = trim(substr($header, $thepos + 1)); //also remove the ;
        }
    }
    unset($header, $cookiename, $thepos);
    return $htmlBody;
}
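
Once the XML is in a string, parsing it is a separate step. Here is a minimal sketch using SimpleXML. parse_feed_xml() is a hypothetical helper, and the sample document is made up for illustration, because the real feed's element names are not shown in this thread.

```php
<?php
// Hypothetical helper: parse an XML string and fail loudly if it is malformed.
function parse_feed_xml($xml)
{
    // Collect libxml errors instead of letting them surface as PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc      = simplexml_load_string($xml);
    $errors   = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc === false) {
        $first = $errors ? trim($errors[0]->message) : 'unknown error';
        throw new RuntimeException('response is not well-formed XML: ' . $first);
    }
    return $doc;
}

// Made-up sample document, NOT the real feed structure.
$doc = parse_feed_xml('<feed><market id="1">Example</market></feed>');
echo $doc->getName(), "\n";              // prints "feed"
echo (string) $doc->market['id'], "\n";  // prints "1"
```

In the script above you would pass the returned $xml to the helper instead of the sample string.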
hanshenrik
  • The response I get when I run the script you posted is at this [link](http://361goal.com/new2.php). Is the XML an error code? – Jacky Kwan Nov 27 '15 at 07:11
  • In fact, if you download the xxx.gz file and then change its suffix to .xml, you get the full XML content. No problem at all! – Jacky Kwan Nov 27 '15 at 07:13
  • @JackyKwan that worked! Your XML file says ....... I don't know if that's what you want or if that's an error. Anyway, replace the var_dump with echo $xml; and you'll get your XML :p – hanshenrik Nov 27 '15 at 07:14
  • Hi cool man, I believe that you got the right result. However, why do I get a blank page at this [link](http://361goal.com/new2.php)? I really do `echo $xml`. – Jacky Kwan Nov 27 '15 at 07:25
  • `Warning: curl_setopt(): supplied argument is not a valid File-Handle resource in /home/andy15703166/public_html/new2.php on line 123` – Jacky Kwan Nov 27 '15 at 07:28
  • @JackyKwan that is far from a blank page. It just doesn't show anything when the browser parses it as HTML. Go to that page and look at "page source" in your browser, and you get stuff like: – hanshenrik Nov 27 '15 at 10:03
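
The "blank page" effect described in the comments above can be avoided by escaping the XML before echoing it into an HTML page. This is a rough sketch, not the code from the answer: xml_as_html() is a made-up helper name and the sample string is invented.

```php
<?php
// Hypothetical helper: make raw XML visible inside a text/html page.
function xml_as_html($xml)
{
    // Escape <, >, & and quotes so the browser prints the tags
    // instead of treating them as unknown HTML elements.
    return '<pre>' . htmlspecialchars($xml, ENT_QUOTES, 'UTF-8') . '</pre>';
}

// Made-up sample, only for illustration.
echo xml_as_html('<feed><market id="1"/></feed>');
// prints: <pre>&lt;feed&gt;&lt;market id=&quot;1&quot;/&gt;&lt;/feed&gt;</pre>
```

The other route is to send header('Content-Type: text/xml; charset=utf-8'); instead of text/html before any output and echo the XML unescaped, so the browser treats the response as an XML document.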

You can use the code below for gzip:

curl_setopt($ch,CURLOPT_ENCODING , "gzip");
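
For context, here is a fuller sketch of where that option sits in a request. The helper names feed_curl_options() and fetch_feed() are made up for illustration. As far as I know, CURLOPT_ENCODING sets the Accept-Encoding request header and makes cURL decode the body when the server declares a Content-Encoding, and an empty string offers every encoding the local libcurl build supports. It does not react to a Content-Type: application/x-gzip header like the one described in the comments on the question, so it may change nothing for this particular feed.

```php
<?php
// Hypothetical helper: the options used for the request, kept separate
// so they can be inspected without touching the network.
function feed_curl_options($url)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_ENCODING       => '',   // '' = offer all encodings libcurl supports
        CURLOPT_CONNECTTIMEOUT => 10,
    );
}

// Hypothetical helper: perform the request and return the (decoded) body.
function fetch_feed($url)
{
    $ch = curl_init();
    curl_setopt_array($ch, feed_curl_options($url));
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}
```

Usage would be something like `echo fetch_feed($url . '?' . $params);` with the $url and $params from the question.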
Jalpa