7

I don't understand. In Python 2.x this worked:

import zlib
zlib.compress('Hello, world')

Now I get:

zlib.compress("Hello world!")
TypeError: must be bytes or buffer, not str

How can I compress my string? Regards, Bussiere

Bussiere
  • Compression always works on sequences of bytes, so you need to convert the characters to bytes first (i.e., pick an encoding). – Donal Fellows Oct 01 '10 at 13:01
  • possible duplicate of [TypeError: 'str' does not support the buffer interface](http://stackoverflow.com/questions/5471158/typeerror-str-does-not-support-the-buffer-interface) – NoDataDumpNoContribution Oct 06 '14 at 19:59

2 Answers

20

Python 3 raises this error to make you choose an explicit encoding for the string.

>>> zlib.compress("Hello, world".encode("utf-8"))
b'x\x9c\xf3H\xcd\xc9\xc9\xd7Q(\xcf/\xcaI\x01\x00\x1b\xd4\x04i'
>>> zlib.compress("Hello, world".encode("ascii"))
b'x\x9c\xf3H\xcd\xc9\xc9\xd7Q(\xcf/\xcaI\x01\x00\x1b\xd4\x04i'

Without an explicit encoding, the same string could correspond to different byte sequences, and it is the byte sequence, not the string, that zlib compresses.

>>> zlib.compress("Hello, wørld".encode("utf-16"))
b'x\x9c\xfb\xff\xcf\x83!\x95!\x07\x08\xf3\x19t\x18\x14\x18\xca\x19~0\x14\x01y)\x0c\x00n\xa6\x06\xef'
>>> zlib.compress("Hello, wørld".encode("utf-8"))
b"x\x9c\xf3H\xcd\xc9\xc9\xd7Q(?\xbc\xa3('\x05\x00#\x7f\x05u"
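To get the text back, the decompressed bytes have to be decoded with the same encoding that was used before compressing. A minimal round-trip sketch, assuming UTF-8 as the chosen encoding (the helper names `compress_text` and `decompress_text` are made up for illustration and are not part of zlib):

```python
import zlib

def compress_text(text, encoding="utf-8"):
    """Encode a str to bytes, then compress those bytes."""
    return zlib.compress(text.encode(encoding))

def decompress_text(data, encoding="utf-8"):
    """Decompress to bytes, then decode back to a str with the same encoding."""
    return zlib.decompress(data).decode(encoding)

original = "Hello, wørld"
packed = compress_text(original)     # bytes object
restored = decompress_text(packed)   # str again, equal to the original
```

Decoding with a different encoding than the one used for encoding would either raise an error or return the wrong text, so both sides need to agree on it.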
relet
19

In Python 2.x, strings are byte strings by default. In Python 3.x, they are Unicode strings.

Compression needs a byte string.
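As a rough illustration of the difference in Python 3 (variable names are only for this sketch): a `str` is rejected with a `TypeError`, while a bytes literal with the `b` prefix, or a `str` that has been encoded, is accepted.

```python
import zlib

text = "Hello, world"    # str: Unicode text in Python 3
raw = b"Hello, world"    # bytes literal: already a byte string

# Passing a str raises TypeError in Python 3.
try:
    zlib.compress(text)
    str_accepted = True
except TypeError:
    str_accepted = False

# A byte string compresses without any conversion.
compressed = zlib.compress(raw)

# Encoding the str yields the same bytes as the literal (pure ASCII text here).
same_bytes = text.encode("ascii") == raw
```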

Douglas Leeder