Below is the test program, including a Chinese character:

# -*- coding: utf-8 -*-
import json

j = {"d":"中", "e":"a"}
json = json.dumps(j, encoding="utf-8")

print json

Below is the result. Note that json.dumps converts the UTF-8 character into its numeric escape sequence!

{"e": "a", "d": "\u4e2d"}

Why is this broken? Or am I doing something wrong?

Bin Chen
  • First of all, __don't name your variable json__: you will no longer be able to access the json library after that. Second, nothing is broken. Besides, the default json encoding is UTF-8, so you don't have to pass it in the dumps() arguments. – mouad Nov 15 '10 at 12:09
  • If I accept that it turns binary UTF-8 data into "\u4e2d", how can I convert it back to binary UTF-8 in JavaScript, which is the client receiving this data? – Bin Chen Nov 15 '10 at 12:12
  • Have you tried it? I mean, send it like that to the browser. I think if you're using a sophisticated JavaScript library, it will know what to do with it. – mouad Nov 15 '10 at 12:24
  • See [my answer to How do I write JSON data to a file in Python?](http://stackoverflow.com/a/37795053/562769) – Martin Thoma Feb 10 '17 at 11:54

3 Answers

Looks like valid JSON to me. If you want json to output a string that has non-ASCII characters in it then you need to pass ensure_ascii=False and then encode manually afterward.
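
As a rough illustration of the two steps mentioned above, here is a minimal sketch. It assumes Python 3 (the question uses Python 2, where `print` is a statement and `dumps` also accepts an `encoding` argument); the variable names are only illustrative.

```python
import json

data = {"d": "中", "e": "a"}

# Default behaviour (ensure_ascii=True): non-ASCII characters are escaped.
escaped = json.dumps(data)

# With ensure_ascii=False the character is kept as-is in the returned str.
raw = json.dumps(data, ensure_ascii=False)

# "Encode manually afterward": turn the str into UTF-8 bytes for the wire.
utf8_bytes = raw.encode("utf-8")

print(escaped)
print(raw)
print(utf8_bytes)
```

Both `escaped` and `raw` are valid JSON for the same data; only the representation of the character differs.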

Ignacio Vazquez-Abrams

You should read json.org. The complete JSON specification is in the white box on the right.

There is nothing wrong with the generated JSON. Generators are allowed to generate either UTF-8 strings or plain ASCII strings, where non-ASCII characters are escaped with the \uXXXX notation. In your case, the Python json module chose escaping, and the escaped notation for 中 is \u4e2d.

By the way: Any conforming JSON interpreter will correctly unescape this sequence again and give you back the actual character.
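
A small sketch of that round trip, assuming Python 3 and using the standard `json` module as the conforming parser (a JavaScript client would typically do the same with `JSON.parse`):

```python
import json

# The escaped form produced in the question, and the unescaped equivalent.
escaped_text = '{"e": "a", "d": "\\u4e2d"}'
raw_text = '{"e": "a", "d": "中"}'

# Parsing either form yields the same data with the real character restored.
from_escaped = json.loads(escaped_text)
from_raw = json.loads(raw_text)

print(from_escaped["d"])
print(from_escaped == from_raw)
```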

Boldewyn

Use simplejson with the mentioned options:

# -*- coding: utf-8 -*-
# Python 2 example using simplejson
import simplejson as json

j = {"d":"中", "e":"a"}
# Use a separate name so the json module is not shadowed
output = json.dumps(j, ensure_ascii=False, encoding="utf-8")

print output

Output:

{"e": "a", "d": "中"}
alemol
  • There is no need for this; the standard library `json` module offers the same options. – Karl Knechtel Jul 03 '22 at 22:52
  • If you are on Python 2, use `import simplejson as json`; on Python 3, use `import json`. – alemol Feb 24 '23 at 20:45
  • No, that is wrong; `json` [has been available in the standard library since 2.6, and has had these options the entire time](https://docs.python.org/2.6/library/json.html#basic-usage). – Karl Knechtel Feb 27 '23 at 06:13