
I'm using Python to experiment with the Stack Overflow API. I run the following commands:

import urllib.request
f = urllib.request.urlopen('http://api.stackoverflow.com/1.0/stats')
d = f.read()

The type of d is <class 'bytes'>, and when I print it, it looks like:

b'\x1f\x8b\x08\x00\x00\x00 .... etc

I tried d = f.read().decode('utf-8'), since that is the charset indicated in the header, but I get a "'utf8' codec can't decode byte 0x8b in position 1" error message.

How do I convert the bytes object I received from my urllib.request call to a string?

amccormack

1 Answer


Check whether your response body is gzipped. I believe this is signaled by the Content-Encoding response header (or something similar). I'm highly confident that you're dealing with compressed data and not a character set encoding issue.

Update: I realize I have a bad habit of not providing enough detail. Gzip-compressed byte strings always start with the bytes 1f 8b, which match the \x1f\x8b at the start of your output. Someone explains it better here: https://stackoverflow.com/a/3703300/9908
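As a rough illustration of the check described above, here is a minimal sketch that looks for the gzip magic bytes before decoding. The helper name body_to_text and the sample payload are made up for this example, and it assumes the decompressed body really is UTF-8, as the question's header suggested.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def body_to_text(raw: bytes, charset: str = "utf-8") -> str:
    """Decompress raw if it starts with the gzip magic bytes, then decode it.

    Hypothetical helper for illustration; charset defaults to UTF-8,
    which is an assumption based on the header mentioned in the question.
    """
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode(charset)


# Simulated response body standing in for f.read(); no network call is made.
sample = gzip.compress('{"statistics": []}'.encode("utf-8"))
text = body_to_text(sample)
print(text)
```

With a real response you would pass f.read() instead of the sample, and f.headers.get('Content-Encoding') should report gzip when the server compressed the body.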

Community
David
  • 1
    You got it. I solved it with the following: import zlib; decompressed_data = zlib.decompress(f.read(), 16 + zlib.MAX_WBITS) – amccormack Sep 19 '10 at 19:07