
I recently wanted to fetch and decode an API response from a web service. I thought that calling file_get_contents and then json_decode on the resulting string would be enough.

It turns out I have to deal with a gzipped response and malformed JSON before I can decode the string. How can I handle these?

Flexo
Murdani Eko

2 Answers


Recently I wanted to fetch and decode an API response from a web service, and found that it took a lot more than calling file_get_contents and passing the string to json_decode. I had to deal with a gzipped response and malformed JSON before I could finally decode the string.

After hours of searching, the two functions below saved my day.

// http://stackoverflow.com/questions/8895852/uncompress-gzip-compressed-http-response
if ( ! function_exists('gzdecode')) {
/**
 * Decode gz coded data
 * 
 * http://php.net/manual/en/function.gzdecode.php
 * 
 * Alternative: http://digitalpbk.com/php/file_get_contents-garbled-gzip-encoding-website-scraping
 * 
 * @param string $data gzencoded data
 * @return string inflated data
 */
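function gzdecode($data) {
    // strip the 10-byte header and 8-byte footer, then inflate;
    // this assumes the header has no optional fields (file name, comment, extra data)
    return gzinflate(substr($data, 10, -8));
}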
}


/**
 * Fetch the requested URL and return it as a decoded JSON value
 * 
 * @author Murdani Eko
 * @param  string  $url
 * @return mixed   decoded JSON, or null if the response cannot be decoded
 */
function get_json_decode( $url ) {

  $response = file_get_contents( $url );
  if( $response === false ) {
    return null; // the request failed
  }

  // is it already a valid json string?
  $jsondecoded = json_decode( trim( $response ) );
  if( json_last_error() == JSON_ERROR_NONE ) {
    return $jsondecoded;
  }

  // a gzencoded body starts with the magic bytes 1f 8b; it is binary data,
  // so it must not be trimmed before it is inflated
  if( strncmp( $response, "\x1f\x8b", 2 ) === 0 ) {
    $response = gzdecode( $response );
    if( $response === false ) {
      return null; // not valid gzip data
    }

    /* After gzdecoding, there is a chance that the response 
     * will have extra characters after the curly brackets, e.g. }}gi or }} ee
     * This makes the JSON malformed, so decoding it would fail
     */

    // search backwards for the position of the last closing curly bracket
    $last_curly_pos = strrpos( $response, '}' );
    if( $last_curly_pos === false ) {
      return null; // no JSON object in the response
    }

    // keep everything up to and including the last curly bracket
    $good_response = substr( $response, 0, $last_curly_pos + 1 );

    return json_decode( $good_response );
  }

  // neither valid JSON nor gzip
  return null;
}
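
A gzipped body can be recognized by its first two bytes (0x1f 0x8b), which gives a more direct test than looking at a JSON error code. The following is a minimal sketch of that check in isolation. The name decode_maybe_gzipped_json is hypothetical and not part of the answer, and the sketch assumes the zlib extension is loaded so that gzdecode and gzencode are available.

```php
<?php
/**
 * Decode a response body that may or may not be gzip-compressed.
 * Hypothetical helper for illustration; not part of the original answer.
 *
 * @param  string $body raw response body
 * @return mixed  decoded JSON value, or null if decoding fails
 */
function decode_maybe_gzipped_json($body) {
    // gzip streams always start with the magic bytes 0x1f 0x8b
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $inflated = @gzdecode($body);
        if ($inflated === false) {
            return null; // corrupt or truncated gzip data
        }
        $body = $inflated;
    }

    // trim only after inflating: trim() on binary data can strip
    // trailing NUL bytes that belong to the gzip footer
    return json_decode(trim($body));
}

// Usage with locally generated data, so no network access is needed
$plain   = json_encode(array('status' => 'ok'));
$gzipped = gzencode($plain);

var_dump(decode_maybe_gzipped_json($plain)->status);
var_dump(decode_maybe_gzipped_json($gzipped)->status);
```

Both calls should yield the same decoded object, whether the input was compressed or not.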
Flexo
Murdani Eko
  • It's fine to ask and answer your own question, in fact we quite like it - although we ask that you split them up as complete, individual questions and answers. I've taken the "answer" part of your question and moved it here for you. – Flexo Dec 14 '13 at 15:24
  • Sorry for my previous self-QA format. I'll do it better next time. Thanks for your time editing my post. I really appreciate it – Murdani Eko Dec 16 '13 at 21:40

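You can use cURL instead of file_get_contents. With CURLOPT_ENCODING set to an empty string, cURL accepts every encoding it supports and decodes the response for you, so you get the page content uncompressed.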

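   function get_url($link) {
      $ch = curl_init();
      curl_setopt($ch, CURLOPT_HEADER, 0);            // leave the response headers out of the output
      curl_setopt($ch, CURLOPT_VERBOSE, 0);
      curl_setopt($ch, CURLOPT_ENCODING, '');         // accept all supported encodings, decode automatically
      curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
      curl_setopt($ch, CURLOPT_URL, $link);
      $response = curl_exec($ch);                     // false on failure
      curl_close($ch);
      return $response;
   }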
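
To tie this back to the original goal of decoding JSON, here is a minimal sketch that fetches a URL with cURL and decodes the body, with basic error handling. The name fetch_json is hypothetical, and throwing a RuntimeException is an illustrative choice rather than anything the answer prescribes.

```php
<?php
/**
 * Fetch a URL with cURL and decode the body as JSON.
 * Hypothetical helper for illustration; not part of the original answer.
 *
 * @param  string $url
 * @return mixed  decoded JSON value
 * @throws RuntimeException if the transfer or the decoding fails
 */
function fetch_json($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // empty string: accept all encodings cURL supports and decode automatically
    curl_setopt($ch, CURLOPT_ENCODING, '');
    $body  = curl_exec($ch);
    $error = curl_error($ch);
    // the handle is released when $ch goes out of scope

    if ($body === false) {
        throw new RuntimeException('Request failed: ' . $error);
    }

    $decoded = json_decode($body);
    if (json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('Invalid JSON in response');
    }
    return $decoded;
}
```

A call would look like `$data = fetch_json('https://api.example.com/items');`, where the URL is only a placeholder.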
max
  • Well, thanks, Max. Your cURL version really works. I Googled and Binged my problem for hours before finally writing my function above. I read dozens of Stack Overflow answers, but none of them worked. I had tried cURL previously, but it didn't work at all, since the response still came back gzipped. It seems the curl_setopt($ch,CURLOPT_ENCODING, ''); option solved everything in one line. I hadn't used that before. – Murdani Eko Dec 16 '13 at 22:48
  • @MurdaniEko Exactly. You can request any encoding you want with CURLOPT_ENCODING, or you can leave it empty as in the code and get the page back decoded. By the way, you can accept my answer by clicking on the tick. – max Dec 17 '13 at 00:07