
I'm trying to read in a response from a REST API, parse it as JSON and write the properties to a CSV file.

It appears some of the characters are in an unknown encoding and can't be converted to strings when they're written out to the CSV file:

'ascii' codec can't encode character u'\xf6' in position 15: ordinal not in range(128)

So I tried to follow the answer by "agf" to this question: UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)

I added a call to unicode(content).encode("utf-8") when my script reads the contents of the response:

obj = json.loads(unicode(content).encode("utf-8"))

Now I see an exceptions.UnicodeDecodeError on this line.

Is Python attempting to decode "content" before encoding it as utf-8? I don't quite understand what's going on. There is no way to determine the encoding of the response since the API I'm calling doesn't set a Content-Type header.
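
A minimal sketch of what I think is happening (Python 2; the sample value is made up, and I'm assuming the response bytes are actually UTF-8):

    # -*- coding: utf-8 -*-
    # Calling unicode() on a byte string makes Python 2 decode it with the
    # default ASCII codec first, so any byte >= 0x80 blows up.
    content = 'K\xc3\xb6nig'             # UTF-8 bytes for u'K\xf6nig' ("König")

    try:
        unicode(content)                 # implicit ASCII decode fails here
    except UnicodeDecodeError as e:
        print e

    # Decoding with an explicit codec works; json.loads also accepts either
    # the resulting unicode object or the raw UTF-8 bytes directly.
    print repr(content.decode('utf-8'))  # u'K\xf6nig'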

Not sure how to handle this. Please advise.

Nate Reed
  • Hang on, if the problem is in the output, why are you encoding the input? Simplest option is probably just to write the csv file as binary. But at some point you probably want to figure out the actual encoding... – james.haggerty Dec 16 '14 at 03:25
  • Of course it's trying to decode `content`. What did you think passing it to the `unicode` constructor would do? – Ignacio Vazquez-Abrams Dec 16 '14 at 03:29
  • encode before writerow – Binux Dec 16 '14 at 03:40
  • james.haggerty: Because the general advice is to avoid these errors, always work with unicode. "a good rule of thumb I was taught is to use the 'unicode sandwich' idea. Your script accepts bytes from the outside world, but all processing should be done in unicode. Only when you are ready to output your data should it be mushed back into bytes!" – Nate Reed Dec 16 '14 at 13:33
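
Putting the suggestions from the comments together, a rough sketch of the "unicode sandwich" with the encode happening right before writerow (Python 2; the field names and the UTF-8 assumption are mine, since the API sets no Content-Type):

    # -*- coding: utf-8 -*-
    import csv
    import json

    # Stand-in for the raw response body (assumed to be UTF-8 bytes).
    content = '{"name": "K\xc3\xb6nig", "city": "G\xc3\xb6teborg"}'

    # json.loads decodes UTF-8 byte strings itself and returns unicode values.
    obj = json.loads(content)

    # Keep everything unicode while processing; encode only when writing.
    with open('out.csv', 'wb') as f:    # the Python 2 csv module wants a binary file
        writer = csv.writer(f)
        writer.writerow([obj['name'].encode('utf-8'), obj['city'].encode('utf-8')])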

0 Answers