Can the encoding used to decompress the data automatically in Requests be altered?

Question

I used the Requests module in Python to send a request and got a response.

The response appears to be gzip-compressed, given that r.headers contains the following:

{'Date': 'Tue, 06 Dec 2016 17:35:44 GMT', 'Content-Type': 'application/json', 'Content-Length': '21632', 'Cache-Control': 'public,max-age=3600', 'Content-Encoding': 'gzip', 'Vary': 'Accept-Encoding'}

Requests picks an encoding and returns the decompressed data, but when I tried to inspect r.text, I got UnicodeEncodeError: 'ascii' codec can't encode character XXX.

Digging further, I found what looks like an encoding error in the response (r.content, which returns the data as bytes):

Currencies":[{"Code":"JPY","Symbol":"\xc2\xa5","ThousandsSeparator":",...

However, I'm not sure how to handle this encoding error. Even if I set the encoding to utf-8 (r.encoding = "utf-8"), I still get the UnicodeEncodeError. I suspect the problem comes from the gzip decompression that Requests performs automatically, but how can I correct this incorrect automatic decoding?

I sent the request to a SkyScanner API with the following code:

headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "Accept": "application/json",
    "charset": "utf-8"
}
r = requests.get(url, headers=headers)

What am I missing here?

@pradyunsg Sorry, the title of the question was misleading; I just edited it. — Blaszard, Dec 06 '16 at 18:34

score 3 · Accepted Answer · answered Dec 06 '16 at 18:25

The requests.Response object has a raw attribute, which gives you the raw socket response from the server.

To use it, pass stream=True on the request and read from r.raw:

r = requests.get(url, headers=headers, stream=True)
r.raw.read()

Source: http://www.python-requests.org/en/latest/user/quickstart/#raw-response-content
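Once you have the raw bytes, you can decompress and decode them yourself. Below is a minimal sketch of that step, assuming Python 3, a gzip-compressed body, and UTF-8 text. The helper name decode_gzip_body and the sample payload are hypothetical stand-ins; the payload is built locally so the example runs without a network call, in place of the bytes that r.raw.read() would return.

```python
import gzip
import json


def decode_gzip_body(raw_body: bytes, charset: str = "utf-8") -> str:
    """Decompress a gzip-encoded body and decode it to text.

    Hypothetical helper: assumes the body really is gzip and that
    `charset` matches what the server used.
    """
    decompressed = gzip.decompress(raw_body)
    return decompressed.decode(charset)


# Stand-in for r.raw.read(): a gzip-compressed JSON document containing
# the yen sign, which UTF-8 encodes as the two bytes \xc2\xa5.
sample_json = '{"Currencies": [{"Code": "JPY", "Symbol": "\u00a5"}]}'
compressed = gzip.compress(sample_json.encode("utf-8"))

text = decode_gzip_body(compressed)
data = json.loads(text)
symbol = data["Currencies"][0]["Symbol"]

# ascii() escapes non-ASCII characters, so this print is safe even on a
# terminal whose output encoding is ASCII.
print(ascii(symbol))
```

Note that the bytes \xc2\xa5 from the question decode cleanly as UTF-8 in this sketch, so if the same holds for the real response, the error may come from writing the decoded text to an ASCII-only output rather than from the decompression itself.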
