
I have a web scraper application that scrapes a Japanese site. The site serves UTF-8 encoded Japanese text. For example,

2017-03-02 17:14:17,862 - __main__ - DEBUG - 出演者: 青山茉利奈
2017-03-02 17:14:17,862 - __main__ - DEBUG - 作者: ひつき
2017-03-02 17:14:17,862 - __main__ - DEBUG - 収録時間: 123分

As you can see, when I call logger.debug() in the code, the characters are printed to the screen correctly. But when I use json.dump() to write this data to a JSON text file, the strings are escaped to something like

"\u53ce\u9332\u6642\u9593": "123\u5206",

This is not what I want. What I want is exactly what I see in the debug log. How can I solve this problem?
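For reference, here is a minimal reproduction of the behavior described above (the dict contents are taken from the log lines; the variable names are illustrative):

```python
import json

data = {"収録時間": "123分"}

# By default json.dumps (and json.dump) escapes all non-ASCII
# characters, producing \uXXXX sequences in the output.
print(json.dumps(data))
# → {"\u53ce\u9332\u6642\u9593": "123\u5206"}
```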

fhcat
  • This has been answered on the site before, here: http://stackoverflow.com/a/18337754/1759987 – Aaron Mar 02 '17 at 22:51
  • The solution in this link works. I dump the object to a JSON string and then save it to a file. – fhcat Mar 02 '17 at 23:09
  • Possible duplicate of [Saving utf-8 texts in json.dumps as UTF8, not as \u escape sequence](http://stackoverflow.com/questions/18337407/saving-utf-8-texts-in-json-dumps-as-utf8-not-as-u-escape-sequence) – Zero Piraeus Mar 02 '17 at 23:43

1 Answer

json.dumps(whatever, ensure_ascii=False)

Specify ensure_ascii=False to disable \u escaping. Note that if the presence of this escaping is actually causing you problems, whatever code needs to receive this JSON is broken.
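With that flag, the serialized string keeps the Japanese characters as-is (same illustrative data as in the question):

```python
import json

data = {"収録時間": "123分"}

# ensure_ascii=False leaves non-ASCII characters unescaped.
print(json.dumps(data, ensure_ascii=False))
# → {"収録時間": "123分"}
```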

user2357112
  • Thanks, now I get `UnicodeEncodeError: 'ascii' codec can't encode characters in position 4-26: ordinal not in range(128)` – fhcat Mar 02 '17 at 23:02
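The UnicodeEncodeError in the comment above occurs when the output file object cannot encode non-ASCII text (common on Python 2, where a plain open() writes bytes with an ASCII default). One way to avoid it, sketched here with a placeholder path, is to open the file with an explicit UTF-8 encoding (io.open works on both Python 2 and 3):

```python
# -*- coding: utf-8 -*-
import io
import json

data = {"収録時間": "123分"}

# io.open with an explicit encoding handles the non-ASCII output;
# "output.json" is a placeholder path.
with io.open("output.json", "w", encoding="utf-8") as f:
    f.write(json.dumps(data, ensure_ascii=False))
```

Reading the file back with the same encoding recovers the original characters.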