12

I have binary data in a bytearray that I would like to gzip first and then post via requests. I found out how to gzip a file, but couldn't work out how to do it for a bytearray. So, how can I gzip a bytearray in Python?

Canol Gökel
  • The ugly way: save it in a file :P. But maybe this will help: http://stackoverflow.com/questions/8506897/how-do-i-gzip-compress-a-string-in-python – Vincent Beltman Nov 05 '14 at 09:14

4 Answers

17

Have a look at the zlib-module of Python.

Python 3: zlib-module

A short example:

import zlib
compressed_data = zlib.compress(my_bytearray)

You can decompress the data again by:

decompressed_byte_data = zlib.decompress(compressed_data)

Python 2: zlib-module

A short example:

import zlib
compressed_data = zlib.compress(my_string)

You can decompress the data again by:

decompressed_string = zlib.decompress(compressed_data)

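As you can see, the call is the same in both versions; only the input type differs. In Python 3, zlib.compress() accepts bytes-like objects such as bytes or bytearray, while in Python 2 it expects a str (a byte string).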
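If the receiving side expects real gzip data (with the gzip header and checksum) rather than a bare zlib stream, the standard-library gzip module can also compress in memory. A minimal sketch, assuming Python 3.2 or later; `my_bytearray`, the `headers` dict, and the URL in the comment are illustrative names, not part of the original question:

```python
import gzip

# Example payload; stands in for the binary data from the question.
my_bytearray = bytearray(b"some binary payload " * 50)

# gzip.compress() accepts bytes-like objects and returns bytes in gzip
# format (gzip header, deflate data, CRC32/length trailer).
gz_data = gzip.compress(my_bytearray)

# Every gzip stream starts with the magic bytes 0x1f 0x8b.
assert gz_data[:2] == b"\x1f\x8b"

# Round trip back to the original content.
restored = gzip.decompress(gz_data)
assert restored == bytes(my_bytearray)

# A server usually needs this header to recognise a gzipped body.
# With requests, the upload might look like this (hypothetical URL):
#   requests.post("https://example.invalid/upload", data=gz_data, headers=headers)
headers = {"Content-Encoding": "gzip"}
```

Whether the server actually accepts a gzipped request body depends on the server, so the header is an assumption to check against the API being called.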

mozzbozz
  • 2
    For me it wanted a string, instead of bytearray.. `compressedData = zlib.compress(mystring)` – mgear May 27 '16 at 03:24
  • 3
    @mgear That's because you are using Python 2, which expects an input string - in Python 3, the function expects a bytearray... I've added this to my answer. – mozzbozz Jun 05 '16 at 13:53
  • From https://stackoverflow.com/a/8507012/8046487: "Note that this method is incompatible with the gzip command-line utility in that gzip includes a header and checksum, while this mechanism simply compresses the content." ; "If you want to produce a complete gzip-compatible binary string, with the header etc, you could use gzip.GzipFile together with StringIO" – Mathieu Rey Apr 18 '20 at 08:02
1

If the bytearray is called b and is small enough to be held in memory more than once, you can simply do this (Python 2 only):

b_gz = str(b).encode('zlib')

If you need to decode it first, have a look at the decode() method of bytearray.
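For Python 3, a rough equivalent of the one-liner above goes through codecs.encode(), since str.encode() no longer handles bytes-to-bytes codecs. A sketch assuming Python 3.4 or later (where the "zlib" alias is available); the sample data is made up for illustration:

```python
import codecs

# Illustrative data; "b" matches the variable name used in the answer.
b = bytearray(b"test data " * 20)

# "zlib" is an alias of the "zlib_codec" bytes-to-bytes codec.
# No str() call is needed: the bytearray is passed in directly.
b_gz = codecs.encode(b, "zlib")

# Decoding restores the original bytes.
b_back = codecs.decode(b_gz, "zlib")
assert b_back == bytes(b)
```

This produces the same zlib-format output as calling zlib.compress(b) directly, so it is mostly a matter of taste.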

Klaus D.
  • 1
    I'm not sure if this is correct? If you call `str()` on a bytearray you get something like `"bytearray(b'test')"` - but he wants to compress the bytearray and not some string describing the bytearray (I think this could also result in a loss of data in some special circumstances?). – mozzbozz Nov 05 '14 at 09:29
  • 1
    I just see that the code works in Python 2 only. In Python 3 there have been some changes in that area. – Klaus D. Nov 05 '14 at 09:43
  • 2
    Ok, that's what I've thought (I'm mainly working with Python 3). You might put a note in your post for people not reading the comments on the "first try". – mozzbozz Nov 05 '14 at 12:34
1

The zlib module of the Python Standard Library should meet your requirements:

>>> import zlib
>>> a = b'abcdefghijklmn' * 10
>>> ca = zlib.compress(a)
>>> len(a)
140
>>> len(ca)
25
>>> b = zlib.decompress(ca)
>>> b == a
True
>>> b
b'abcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmnabcdefghijklmn'

This is the output under Python 3.4, but it works the same way under Python 2.7.

Serge Ballesta
1
import zlib
import binascii


def compress_packet(packet):
    # Python 2: zlib.compress() rejects a bytearray, so wrap it in buffer().
    # The second argument (1) selects the fastest compression level.
    return zlib.compress(buffer(packet), 1)


def decompress_packet(compressed_packet):
    return zlib.decompress(compressed_packet)


def demo_zlib():
    # Build a 4-byte packet containing "ABCD".
    packet1 = bytearray()
    packet1.append(0x41)
    packet1.append(0x42)
    packet1.append(0x43)
    packet1.append(0x44)

    print "before compression: packet:{0}".format(binascii.hexlify(packet1))
    cpacket1 = compress_packet(packet1)
    print "after compression: packet:{0}".format(binascii.hexlify(cpacket1))

    print "before decompression: packet:{0}".format(binascii.hexlify(cpacket1))
    dpacket1 = decompress_packet(buffer(cpacket1))
    print "after decompression: packet:{0}".format(binascii.hexlify(dpacket1))


def main():
    demo_zlib()


if __name__ == '__main__':
    main()

This should do. zlib needs access to the bytearray's contents; in Python 2, use buffer() for that.
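buffer() does not exist in Python 3, where zlib.compress() takes a bytearray directly. A possible Python 3 version of the same demo might look like this; the `level` parameter name is an addition for illustration:

```python
import binascii
import zlib


def compress_packet(packet, level=1):
    # Python 3: a bytearray can be passed as-is, no buffer() wrapper needed.
    return zlib.compress(packet, level)


def decompress_packet(compressed_packet):
    return zlib.decompress(compressed_packet)


# Same 4-byte packet as above: "ABCD".
packet1 = bytearray([0x41, 0x42, 0x43, 0x44])
cpacket1 = compress_packet(packet1)
dpacket1 = decompress_packet(cpacket1)

# hexlify() returns bytes in Python 3, so decode it for printing.
print("before compression:  packet:{0}".format(binascii.hexlify(packet1).decode()))
print("after compression:   packet:{0}".format(binascii.hexlify(cpacket1).decode()))
print("after decompression: packet:{0}".format(binascii.hexlify(dpacket1).decode()))
```

For an input this small the compressed packet is longer than the original, because the zlib header and checksum outweigh any savings.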

AlGiorgio
rjha94