I have binary data in a bytearray that I would like to gzip and then POST via requests. I found out how to gzip a file, but not how to do the same for a bytearray. How can I gzip a bytearray in Python?
The ugly way: save it in a file :P. But maybe this will help: http://stackoverflow.com/questions/8506897/how-do-i-gzip-compress-a-string-in-python – Vincent Beltman Nov 05 '14 at 09:14
4 Answers
17
Have a look at Python's zlib module.
Python 3: zlib module
A short example:
import zlib
compressed_data = zlib.compress(my_bytearray)
You can decompress the data again by:
decompressed_byte_data = zlib.decompress(compressed_data)
Python 2: zlib module
A short example:
import zlib
compressed_data = zlib.compress(my_string)
You can decompress the data again by:
decompressed_string = zlib.decompress(compressed_data)
As you can see, Python 3 works with bytes-like objects (bytes or bytearray), while Python 2 works with (byte) strings.

mozzbozz
For me it wanted a string instead of a bytearray: `compressedData = zlib.compress(mystring)` – mgear May 27 '16 at 03:24
-
@mgear That's because you are using Python 2, which expects a string as input. In Python 3, the function expects a bytes-like object such as a bytearray. I've added this to my answer. – mozzbozz Jun 05 '16 at 13:53
-
From https://stackoverflow.com/a/8507012/8046487: "Note that this method is incompatible with the gzip command-line utility in that gzip includes a header and checksum, while this mechanism simply compresses the content." ; "If you want to produce a complete gzip-compatible binary string, with the header etc, you could use gzip.GzipFile together with StringIO" – Mathieu Rey Apr 18 '20 at 08:02
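The gzip-compatible variant mentioned in the comment above can be sketched as follows. This is only a minimal sketch, assuming Python 3.2 or later, where the standard library provides gzip.compress. The helper name gzip_bytearray, the sample payload, and the upload URL are hypothetical and used for illustration only.

```python
import gzip

def gzip_bytearray(data, level=9):
    """Return data wrapped in a complete gzip container (header plus checksum)."""
    # bytes() takes an explicit immutable copy of the bytearray before compressing
    return gzip.compress(bytes(data), compresslevel=level)

payload = bytearray(b"example payload " * 50)  # hypothetical sample data
body = gzip_bytearray(payload)

# A gzip stream starts with the magic bytes 0x1f 0x8b, unlike a bare zlib stream
assert body[:2] == b"\x1f\x8b"
assert gzip.decompress(body) == bytes(payload)

# Header that tells the receiving server the body is gzip-compressed
headers = {"Content-Encoding": "gzip"}

# With the third-party requests library, the upload could then look like this
# (the URL is a placeholder, not a real endpoint):
# requests.post("https://example.invalid/upload", data=body, headers=headers)
```

Whether the server actually accepts a gzip-encoded request body depends on the server, so the Content-Encoding header here is an assumption about the receiving side.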
1
If the bytearray (called b here) is small enough to be held in memory more than once, you can simply do the following (Python 2 only):
b_gz = str(b).encode('zlib')
If you need to do decoding first, have a look at the decode() method of the bytearray.

Klaus D.
I'm not sure this is correct. If you call `str()` on a bytearray, you get something like `"bytearray(b'test')"`, but he wants to compress the bytearray itself and not a string describing it (I think this could also result in a loss of data in some special circumstances?). – mozzbozz Nov 05 '14 at 09:29
-
I just saw that the code works in Python 2 only. Python 3 changed some things in that area. – Klaus D. Nov 05 '14 at 09:43
-
OK, that's what I thought (I'm mainly working with Python 3). You might put a note in your post for people who don't read the comments on the "first try". – mozzbozz Nov 05 '14 at 12:34
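As the comments note, the str(b).encode('zlib') snippet works in Python 2 only. A rough Python 3 counterpart, assuming Python 3.4 or later (where the "zlib" codec alias is available through the codecs module), might look like the sketch below. The sample input is hypothetical, and the result is a zlib stream rather than a gzip file.

```python
import codecs

b = bytearray(b"some binary data " * 20)  # hypothetical sample input

# "zlib" is a bytes-to-bytes codec in Python 3, so it is reached through
# codecs.encode / codecs.decode instead of str.encode
b_gz = codecs.encode(bytes(b), "zlib")
restored = codecs.decode(b_gz, "zlib")

print(len(b), len(b_gz), restored == bytes(b))
```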
1
The zlib module of the Python standard library should meet your requirements:
>>> import zlib
>>> a = b'abcdefghijklmn' * 10
>>> ca = zlib.compress(a)
>>> len(a)
140
>>> len(ca)
25
>>> b = zlib.decompress(ca)
>>> b == a
True
>>> b
b'abcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmn'
This is the output under Python 3.4, but it works the same under Python 2.7.

Serge Ballesta
1
# Python 2 code (uses buffer() and the print statement)
import zlib
import binascii

def compress_packet(packet):
    # buffer() exposes the bytearray's content to zlib; level 1 favours speed
    return zlib.compress(buffer(packet), 1)

def decompress_packet(compressed_packet):
    return zlib.decompress(compressed_packet)

def demo_zlib():
    # Build a four-byte packet containing "ABCD"
    packet1 = bytearray()
    packet1.append(0x41)
    packet1.append(0x42)
    packet1.append(0x43)
    packet1.append(0x44)
    print "before compression: packet:{0}".format(binascii.hexlify(packet1))
    cpacket1 = compress_packet(packet1)
    print "after compression: packet:{0}".format(binascii.hexlify(cpacket1))
    print "before decompression: packet:{0}".format(binascii.hexlify(cpacket1))
    dpacket1 = decompress_packet(buffer(cpacket1))
    print "after decompression: packet:{0}".format(binascii.hexlify(dpacket1))

def main():
    demo_zlib()

if __name__ == '__main__':
    main()
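This should do it. In Python 2, zlib requires access to the bytearray's content; use buffer() for that. Note that buffer() was removed in Python 3, where zlib.compress accepts a bytearray directly.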
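For Python 3, where buffer() is not available, a rough equivalent of the code above might look like the following sketch. It keeps the function names from the answer, assumes the same four-byte test packet, and makes the compression level a parameter, which is an addition for illustration.

```python
import binascii
import zlib

def compress_packet(packet, level=1):
    # In Python 3, zlib accepts bytes-like objects such as bytearray directly
    return zlib.compress(packet, level)

def decompress_packet(compressed_packet):
    return zlib.decompress(compressed_packet)

# Same four-byte packet as in the answer ("ABCD")
packet1 = bytearray([0x41, 0x42, 0x43, 0x44])
cpacket1 = compress_packet(packet1)
dpacket1 = decompress_packet(cpacket1)

# hexlify returns bytes in Python 3, so decode it for printing
print("before compression: packet:{0}".format(binascii.hexlify(packet1).decode()))
print("after compression: packet:{0}".format(binascii.hexlify(cpacket1).decode()))
print("after decompression: packet:{0}".format(binascii.hexlify(dpacket1).decode()))
```

For an input this small, the compressed packet is larger than the original because of the zlib header and checksum; compression only pays off on larger or more repetitive data.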