I am trying to dump some data to JSON like this:

foo = simplejson.dumps(data)

But I am seeing the following error:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xd6 in position 33: invalid continuation byte

How should I encode or decode it properly?

frazman
  • I've given you a generic answer, because you neglected to give us sample `data` that reproduces your problem. Please provide us with a [mcve] if you want to get specific help. – Martijn Pieters Sep 27 '15 at 00:39
  • Check this answer for a clue: http://stackoverflow.com/questions/5552555/unicodedecodeerror-invalid-continuation-byte – vakio Sep 27 '15 at 00:40

1 Answer

Your data contains str objects holding bytes that are not valid UTF-8. All text in JSON is Unicode, so dumps() decodes str values to Unicode, assuming UTF-8 by default.
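To see where the error comes from, here is a minimal sketch that reproduces the failure outside of simplejson. The sample bytes and the Latin-1 codec are assumptions chosen for illustration (the question does not show the actual data), and the sketch uses Python 3 `bytes` to stand in for a Python 2 `str`:

```python
# Assumed sample: "Österreich" encoded as Latin-1, where "Ö" is the single byte 0xd6.
raw = b"\xd6sterreich"

try:
    # 0xd6 starts a two-byte sequence in UTF-8, but the next byte ("s") is not
    # a continuation byte, so decoding as UTF-8 fails.
    raw.decode("utf-8")
    error_reason = None
except UnicodeDecodeError as exc:
    error_reason = exc.reason

# Decoding with the codec the bytes were actually written in succeeds.
text = raw.decode("latin-1")
```

The same bytes decode cleanly once the right codec is named, which is why the fix is about identifying the source encoding rather than changing the JSON call itself.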

If not all text in your data is UTF-8 encoded, you need to either decode it to Unicode yourself before dumping to JSON, or tell the dumps() function which codec to use when decoding bytestrings:

foo = simplejson.dumps(data, encoding='<codec for bytestrings in data>')
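The other option, decoding everything to Unicode before dumping, could look roughly like the sketch below. It uses the standard-library `json` module and Python 3 `bytes`; the helper name `decode_bytes`, the sample data, and the Latin-1 default are all hypothetical, so substitute whatever codec your bytestrings really use:

```python
import json


def decode_bytes(value, codec="latin-1"):
    """Recursively decode byte strings in nested dicts and lists to text.

    Hypothetical helper; the codec default is an assumption.
    """
    if isinstance(value, bytes):
        return value.decode(codec)
    if isinstance(value, dict):
        return {
            decode_bytes(key, codec): decode_bytes(item, codec)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [decode_bytes(item, codec) for item in value]
    return value  # numbers, booleans, None, and text pass through unchanged


# Assumed sample data mixing byte strings and already-decoded text.
data = {"name": b"\xd6sterreich", "tags": [b"caf\xe9", "plain"]}

# ensure_ascii=False keeps non-ASCII characters readable in the output.
foo = json.dumps(decode_bytes(data), ensure_ascii=False)
```

Decoding up front keeps the encoding decision in one place, which helps when different parts of the data come from sources with different encodings.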
Martijn Pieters