9

After making a gzip deflate request in PHP, I receive the deflated string in offset chunks, which look like the following:

Example shortened greatly to show format:

00001B4E
¾”kŒj…Øæ’ìÑ«F1ìÊ`+ƒQì¹UÜjùJƒZ\µy¡ÓUžGr‡J&=KLËÙÍ~=ÍkR
0000102F
ñÞœÞôΑüo[¾”+’Ñ8#à»0±R-4VÕ’n›êˆÍ.MCŽ…ÏÖr¿3M—èßñ°r¡\+
00000000

I'm unable to inflate that, presumably because of the chunked format. I can confirm the data is not corrupt: after manually removing the offsets with a hex editor, I can read the gzip archive. Is there a proper method to parse this chunked gzip deflated response into a readable string?

I might be able to split these offsets and join the data together in one string to call gzinflate, but it seems there must be an easier way.

user1309276

3 Answers

11

The proper method to decode a chunked response is roughly as follows:

initialise string to hold result
for each chunk {
  check that the stated chunk length equals the string length of the chunk
  append the chunk data to the result variable
}

Here's a handy PHP function to do that for you (FIXED):

function unchunk_string ($str) {

  // A string to hold the result
  $result = '';

  // Split input by CRLF
  $parts = explode("\r\n", $str);

  // These vars track the current chunk
  $chunkLen = 0;
  $thisChunk = '';

  // Loop the data
  while (($part = array_shift($parts)) !== NULL) {
    if ($chunkLen) {
      // Add the data to the string
      // Don't forget, the data might contain a literal CRLF
      $thisChunk .= $part."\r\n";
      if (strlen($thisChunk) == $chunkLen) {
        // Chunk is complete
        $result .= $thisChunk;
        $chunkLen = 0;
        $thisChunk = '';
      } else if (strlen($thisChunk) == $chunkLen + 2) {
        // Chunk is complete, remove trailing CRLF
        $result .= substr($thisChunk, 0, -2);
        $chunkLen = 0;
        $thisChunk = '';
      } else if (strlen($thisChunk) > $chunkLen) {
        // Data is malformed
        return FALSE;
      }
    } else {
      // If we are not in a chunk, get length of the new one
      if ($part === '') continue;
      if (!$chunkLen = hexdec($part)) break;
    }
  }

  // Return the decoded data, or FALSE if it is incomplete
  return ($chunkLen) ? FALSE : $result;

}
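
As a separate sketch (not the function above), the same chunk-by-chunk idea can also be driven by the stated lengths instead of splitting on CRLF: read the hex size line, take exactly that many bytes, and check that a CRLF follows. The function name `decode_chunked_body` is hypothetical, and the sketch assumes the whole body is already in memory and that any trailer headers after the final zero-length chunk can be ignored.

```php
<?php
// Hypothetical helper (name is illustrative): decode a chunked body by
// reading each hex size line, then taking exactly that many bytes.
// Returns the joined data, or false if the input is malformed or truncated.
function decode_chunked_body($body)
{
    $result = '';
    $pos = 0;
    $len = strlen($body);

    while ($pos < $len) {
        // The size line ends at the next CRLF
        $lineEnd = strpos($body, "\r\n", $pos);
        if ($lineEnd === false) {
            return false;
        }

        // Drop any chunk extension after ';' and validate the hex size
        $line = substr($body, $pos, $lineEnd - $pos);
        $parts = explode(';', $line, 2);
        $sizeField = trim($parts[0]);
        if (!preg_match('/^[0-9a-fA-F]+$/', $sizeField)) {
            return false;
        }
        $size = (int) hexdec($sizeField);
        $pos = $lineEnd + 2;

        // A zero-length chunk marks the end of the body
        if ($size === 0) {
            return $result;
        }

        // The chunk data must be followed by its own CRLF
        if ($pos + $size + 2 > $len || substr($body, $pos + $size, 2) !== "\r\n") {
            return false;
        }

        $result .= substr($body, $pos, $size);
        $pos += $size + 2;
    }

    // Ran out of input without seeing the zero-length chunk
    return false;
}
```

Because the data is sliced by length, a literal CRLF inside a chunk needs no special handling.
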
DaveRandom
  • Excellent, works just as expected. That is a handy PHP function indeed; I've been seeking this for a while now. Many thanks! – user1309276 Apr 03 '12 at 13:22
  • @user1309276 I have updated the above function; it had an error in its behaviour when the string contains a literal CRLF. This has now been fixed, which also gives better detection of malformed strings. – DaveRandom Apr 03 '12 at 13:42
  • 1
    Thanks again! For anyone still having problems, after calling unchunk_string all I need to do is remove the first 10 bytes using: $data = gzinflate(substr($data,10)); – user1309276 Apr 03 '12 at 13:59
3

To decode a string, use gzinflate. The Zend_Http_Client library helps with this kind of common task and is easy to use. Refer to the Zend_Http_Response code if you need to do it on your own.

Sandeep Manne
  • Unfortunately I already tried the method that lib uses, but it does contain some code I might need in the future, thanks! – user1309276 Apr 03 '12 at 13:26
1

The solution from user @user1309276 really helped me! I received a gzip-compressed JSON response from the server with a transfer-encoding: chunked header. None of the other solutions helped, but this one works like magic for me! It just removes the first 10 bytes.

$data = json_decode(gzinflate(substr($response->getContent(), 10)), true);
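
Skipping 10 bytes works because gzinflate expects a raw DEFLATE stream, while a gzip body starts with a header whose fixed part is 10 bytes long. If the header carries optional fields (for example an original file name), it is longer and the fixed offset will fail. A minimal sketch of the alternative, assuming PHP 5.4+ with the zlib extension, is to let gzdecode parse the header and trailer itself. The sample JSON and variable names are made up for illustration, and gzencode stands in for an already unchunked response body.

```php
<?php
// Stand-in for an unchunked response body: gzencode() emits a gzip stream
// with the minimal 10-byte header and an 8-byte trailer (CRC32 + size).
$json = '{"status":"ok","items":[1,2,3]}';
$body = gzencode($json);

// gzdecode() reads the gzip header and trailer itself
$viaGzdecode = gzdecode($body);

// Manual variant: strip the 10-byte header and 8-byte trailer,
// then inflate the raw DEFLATE data in between
$viaGzinflate = gzinflate(substr($body, 10, -8));

$decoded = json_decode($viaGzdecode, true);
```
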