I am having trouble ungzipping chunks of bytes that I read from S3 using the iter_chunks() method from boto3. The strategy of ungzipping the file chunk by chunk originates from this issue.
The code is as follows:
import zlib

# 32 + zlib.MAX_WBITS tells zlib to auto-detect a gzip or zlib header
dec = zlib.decompressobj(32 + zlib.MAX_WBITS)
for chunk in app.s3_client.get_object(Bucket=bucket, Key=key)["Body"].iter_chunks(2 ** 19):
    data = dec.decompress(chunk)
    print(len(chunk), len(data))

# 524288 65505
# 524288 0
# 524288 0
# ...
This code prints 65505 on the first iteration and 0 on every subsequent iteration. My understanding is that it should ungzip each compressed chunk and then print the length of the uncompressed data.
Is there something I'm missing?
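One cause worth ruling out, assuming (not confirmed for this object) that the file is made of several concatenated gzip members: a single decompressobj stops at the end of the first member, sets dec.eof, and puts everything after it into dec.unused_data, so later decompress() calls return empty bytes. The sketch below reproduces that with in-memory data instead of S3. The helper name decompress_members and the test payload are hypothetical and only for illustration.

```python
import gzip
import zlib


def decompress_members(chunks):
    """Yield decompressed bytes from an iterable of compressed chunks,
    starting a fresh decompressor whenever one gzip member ends."""
    dec = zlib.decompressobj(32 + zlib.MAX_WBITS)
    for chunk in chunks:
        buf = chunk
        while buf:
            data = dec.decompress(buf)
            if data:
                yield data
            if dec.eof:
                # End of the current member: leftover bytes belong to the next one.
                buf = dec.unused_data
                dec = zlib.decompressobj(32 + zlib.MAX_WBITS)
            else:
                # Without max_length, decompress() consumes the whole input.
                buf = b""


# Hypothetical stand-in for the S3 body: two gzip members back to back,
# split into small chunks the way iter_chunks() would split a stream.
payload = gzip.compress(b"a" * 1000) + gzip.compress(b"b" * 500)
chunks = [payload[i:i + 16] for i in range(0, len(payload), 16)]

# A single decompressor only returns data from the first member.
single = zlib.decompressobj(32 + zlib.MAX_WBITS)
naive = b"".join(single.decompress(c) for c in chunks)

# Restarting at each member boundary recovers everything.
full = b"".join(decompress_members(chunks))

print(len(naive), len(full))
```

If dec.eof is True after the first chunk in the original loop, the same restart logic could be applied to the chunks coming from iter_chunks(). If it stays False, the cause is something else.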