Getting \uXXXX Unicode escapes instead of text when encoding UTF-8 in Python

Question

I use the WeChat Public Platform to send some content. The code is:

# coding:utf-8
import os
import urllib2
import json
import httplib2

content = "一些中文"

body = {
    "touser":"abcdefg",
    "msgtype":"text",
    "text":
    {
            "content": content
    }
}

access_token = '1234567890'
req_url = 'https://api.weixin.qq.com/cgi-bin/message/custom/send?access_token=' + access_token
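method = 'POST'
# The original snippet never defines headers or http_obj; the two definitions below are assumed.
headers = {}
headers['Content-Type'] = 'application/json'
headers['Accept-Charset'] = 'encoding=utf-8'
http_obj = httplib2.Http()
resp, token = http_obj.request(req_url, method, headers=headers, body=json.dumps(body))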

When I run the program, I receive \u4e00\u4e9b\u4e2d\u6587, not 一些中文. What should I do if I want to receive 一些中文? Thanks a lot!

What is the purpose of `.encode(type)`? `type` is a reserved Python word (and I also don't see you define it anywhere) – Luigi Apr 14 '14 at 03:10
Also, yuck. Python doesn't like your "some Chinese". It gives me an unsupported type error :( – Luigi Apr 14 '14 at 03:12
Thanks for your comment. I edited the code just now and removed some of it, but it still doesn't work. – changzhi Apr 14 '14 at 03:20
Try using `gbk` if you're on Windows for your Chinese encoding. See this answer for more: http://stackoverflow.com/questions/2688020/how-to-print-chinese-word-in-my-code-using-python – Luigi Apr 14 '14 at 04:33
Using decode on Python strings, on the receiving end, might be relevant. See this answer (it is about Hebrew characters, but the need to print out Unicode is about the same): http://stackoverflow.com/questions/18079690/using-hebrew-on-python/18080334#18080334 – Eran Apr 14 '14 at 04:46
What's the problem? In JSON, `"\u4e00\u4e9b\u4e2d\u6587"` is absolutely identical to `"一些中文"`. Note also that the `Accept-Charset` header, if passed, should not contain `encoding=`; it should be superfluous in any case as UTF-8 is the default encoding for JSON (and the header itself is somewhat archaic). – bobince Apr 14 '14 at 12:22
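
bobince's point can be checked with a short sketch: parsing the escaped form and the literal form should give the same string. The variable names are made up for illustration, and the snippet is written for Python 3 (the `__future__` import is only there in the hope that it behaves the same on Python 2.7, which the question uses).

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals
import json

# JSON text as the default json.dumps() would produce it (backslash-u escapes).
escaped_json = '"\\u4e00\\u4e9b\\u4e2d\\u6587"'
# The same JSON string written with the characters themselves.
literal_json = '"一些中文"'

decoded_from_escaped = json.loads(escaped_json)
decoded_from_literal = json.loads(literal_json)

# Both parse to the same four-character string.
same_text = decoded_from_escaped == decoded_from_literal == "一些中文"
print(same_text)  # prints True
```

So a receiver that parses the body as JSON sees 一些中文 either way; the escapes only matter if something reads the raw body without parsing it.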

score 0 · Accepted Answer · answered Apr 14 '14 at 04:26

You can dump the body as follows:

json.dumps(body, ensure_ascii=False)

From the Python docs (https://docs.python.org/2/library/json.html):

If ensure_ascii is True (the default), all non-ASCII characters in the output are escaped with \uXXXX sequences, and the result is a str instance consisting of ASCII characters only. If ensure_ascii is False, some chunks written to fp may be unicode instances. This usually happens because the input contains unicode strings or the encoding parameter is used. Unless fp.write() explicitly understands unicode (as in codecs.getwriter()) this is likely to cause an error.
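
A minimal sketch of the difference, reusing the `body` from the question. It is written for Python 3, where `str` is already Unicode; the names `escaped`, `readable` and `payload` are only illustrative, and encoding the result to UTF-8 bytes before sending is an assumption about what the HTTP client should receive, not something the question states.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import json

body = {
    "touser": "abcdefg",
    "msgtype": "text",
    "text": {"content": "一些中文"},
}

# Default (ensure_ascii=True): every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(body)

# ensure_ascii=False leaves the characters in the output as they are.
readable = json.dumps(body, ensure_ascii=False)

# Assumed step: encode explicitly so the request body is UTF-8 bytes, not text.
payload = readable.encode("utf-8")
```

Here `payload` would be passed as the `body=` argument of the request in place of `json.dumps(body)`.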

answered Apr 14 '14 at 04:26

garry


json.dumps(body, encoding='utf-8', ensure_ascii=False) – changzhi Apr 14 '14 at 06:20
The default encoding is 'utf-8'. `json.dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, encoding="utf-8", default=None, sort_keys=False, **kw)` – garry Apr 14 '14 at 09:22
