I am trying to use zlib for text compression.

For example, I have a string T = 'blah blah blah blah' that I need to compress, and I am using S = zlib.compress(T) to do it. What I want now is a non-binary (printable) form of S, so that I can decompress it back to T in a different program.

Thanks!

EDIT: I think I have found a way to do what I wanted:

import zlib, base64
text = 'STACK OVERFLOW STACK OVERFLOW STACK OVERFLOW STACK OVERFLOW STACK OVERFLOW STACK OVERFLOW STACK OVERFLOW STACK OVERFLOW STACK OVERFLOW STACK OVERFLOW '
code = base64.b64encode(zlib.compress(text, 9))
print code

Which gives:

eNoLDnF09lbwD3MNcvPxD1cIHhxcAE9UKaU=

Now I can copy this code to a different program to get the original variable back:

import zlib, base64
s='eNoLDnF09lbwD3MNcvPxD1cIHhxcAE9UKaU='
data = zlib.decompress(base64.b64decode(s))
print data

If you know of another compression method that gives better results and can be used the same way as the code above, please suggest it.
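On Python 3, zlib.compress takes bytes rather than str, so the same round trip needs an explicit encode and decode step. A minimal sketch, assuming UTF-8 text; the helper names pack_text and unpack_text are made up for illustration:

```python
import base64
import zlib


def pack_text(text, level=9):
    """Compress text and return it as a printable ASCII string."""
    raw = text.encode("utf-8")               # str -> bytes (UTF-8 is an assumption)
    compressed = zlib.compress(raw, level)   # bytes -> compressed bytes
    return base64.b64encode(compressed).decode("ascii")


def unpack_text(code):
    """Reverse pack_text: Base64-decode, decompress, decode to str."""
    compressed = base64.b64decode(code)
    return zlib.decompress(compressed).decode("utf-8")


text = "STACK OVERFLOW " * 10
code = pack_text(text)
print(code)                       # printable string that can be pasted elsewhere
print(unpack_text(code) == text)  # checks the round trip
```

The string printed first can be copied into the second program and passed to unpack_text there.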

beroe
Quixotic
  • What prevents you from using [zlib.decompress()](http://docs.python.org/library/zlib.html#zlib.decompress) in that other program? – Frédéric Hamidi Jan 30 '11 at 20:36
  • Are you going to accept my answer to your previous question? That might encourage me to help you with this new question. I now understand what you are getting at. – David Heffernan Jan 30 '11 at 20:36
  • How can I print S so that I can use it in another program? – Quixotic Jan 30 '11 at 20:39
  • Note that when compressing really small strings, the compressed data, with its overhead, is likely to be longer than the original string... – Matt Billenstein Jan 30 '11 at 21:12
  • `brotli.decompress(base64.b64decode(base64.b64encode(brotli.compress("payloadpayload..".encode())).decode())).decode()` gives slightly better compression ratios and returns a string in Python 3. I don't understand the base64 encoding/decoding though; I'd be very grateful for an explanation or an approach that makes more sense. (I am using the compressed string as a cache key, so I need a string.) – Chris Feb 17 '22 at 12:17
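
Two points raised in these comments, the fixed overhead on short inputs and the role of Base64, can be checked by measuring sizes. A rough sketch using only the standard library; the sample strings are arbitrary, and exact compressed sizes may vary with the zlib version:

```python
import base64
import zlib

# Arbitrary sample inputs: one very short, one long and repetitive.
samples = {
    "short": b"blah",
    "repetitive": b"blah " * 200,
}

for name, raw in samples.items():
    compressed = zlib.compress(raw, 9)
    # Base64 maps every 3 bytes to 4 ASCII characters, so the printable
    # form is about a third larger than the compressed bytes.
    printable = base64.b64encode(compressed)
    print(name, len(raw), len(compressed), len(printable))
```

A zlib stream carries a 2-byte header and a 4-byte Adler-32 checksum, so a very short input usually comes out longer than it went in. Base64 does not compress anything; it only turns arbitrary bytes into ASCII text that can be pasted into source code or used as a string key.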

2 Answers

Program 1:

import zlib

T = 'blah blah blah blah'
S = zlib.compress(T)
with open("temp.zlib", "wb") as myfile:
    myfile.write(S)

This saves the compressed string in a file called temp.zlib so that program 2 can later retrieve and decompress it.

Program 2:

import zlib

with open("temp.zlib", "rb") as myfile:
    S = myfile.read()
T = zlib.decompress(S)
Tim Pietzcker
  • Yes, this is very close to what I want, but what I need is to do it with only two files, without using a third file. – Quixotic Jan 30 '11 at 20:52
  • What third file? Here there's only one file. If you're counting your applications as files, then sure, this is a 3rd thing, but if you don't ever want to serialize the data to disk you're going to have to provide us with a lot more input into how your system works. – Nick Bastin Jan 30 '11 at 20:56
  • @Nick Bastin: Check out Lennart Regebro's answer at http://stackoverflow.com/questions/4844907/text-compression-in-python/4844924#4844924. What I don't get is how he got the compressed value in that form. – Quixotic Jan 30 '11 at 20:59
  • I still don't understand the problem. You want to exchange data between two different programs but without generating an actual representation of that data (like a file)? – Tim Pietzcker Jan 30 '11 at 21:07
  • @Tretwick that's easy, he did: `base64.b64encode(zlib.compress(text))`, just the operations inverted, and in reverse order – David Heffernan Jan 30 '11 at 21:10
  • @David Heffernan: Thanks, that helps a lot. Now I understand my earlier mistake :) – Quixotic Jan 30 '11 at 21:38
  • @Tretwick Marian: I'm confused by what you mean by two files only. Are you perhaps looking for something where you have two processes, one sending the compressed text and the other decompressing it? Then this may make sense with your earlier question. Please just explain the overall goal you are trying to achieve. – eat Jan 30 '11 at 21:38
  • Text to ASCII, then that ASCII back to text. – Quixotic Jan 30 '11 at 21:51
  • @Tretwick: You're not making any sense. So now it's an encoding problem? In that case, `base64`, as previously suggested, is what you're looking for. – Tim Pietzcker Jan 30 '11 at 22:08
  • I have got my answer :) If you are aware of a better method please suggest. – Quixotic Jan 30 '11 at 22:15
  • This was probably true back in the day. With Python 3.6.5: `TypeError: a bytes-like object is required, not 'str'` – matanster May 04 '19 at 17:22

Following up on the comments on the accepted answer: for Python 3 users, the zlib documentation says:

def compress(data, /, level=-1)
    Returns a bytes object containing compressed data.

    data
      Binary data to be compressed.
    level
      Compression level, in 0-9 or -1.

This means the first parameter must be bytes, whereas T is a str, not bytes. Use str.encode() to get a copy of the string encoded as bytes, e.g.:

T = 'blah blah blah blah'
S = zlib.compress(T.encode())

This explains the error TypeError: a bytes-like object is required, not 'str' and fixes it.
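
The return trip needs the mirror-image step, because zlib.decompress also returns bytes. A short sketch, assuming the default UTF-8 encoding on both sides (the variable name restored is only for illustration):

```python
import zlib

T = 'blah blah blah blah'
S = zlib.compress(T.encode())            # str -> bytes -> compressed bytes
restored = zlib.decompress(S).decode()   # compressed bytes -> bytes -> str
print(restored == T)                     # checks the round trip
```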

Jonathan Simon Prates
  • 1,122
  • 2
  • 12
  • 28