
I'm desperately trying to send a UTF-8-encoded JSON string over TCP (Python 2.7). Here are a few attempts and their results. The variable reponse contains the JSON string I'm trying to send:

reponse = {"candidats":{"P":[{"mentionname":"Beyoncé","guess":[{"name":"BEYONCÉ","score":"1.00","eid":"72437"}]}],"E":[]}}

The command 1:

self.request.sendall(json.dumps(reponse+"\n",ensure_ascii=False))

results in error:

    'ascii' codec can't encode character u'\xe9' in position 49: ordinal not in range(128)
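A likely reading of this error, offered as an assumption rather than a confirmed diagnosis: with ensure_ascii=False, json.dumps in Python 2 can return a unicode object, and handing a unicode object to sendall triggers an implicit ASCII encode. The sketch below reproduces only that encoding step; the names text, ascii_ok and utf8_bytes are illustrative.

```python
# -*- coding: utf-8 -*-
# Illustrative only: a text string containing a non-ASCII character.
text = u"Beyonc\xe9"

# Encoding to ASCII fails, which is what an implicit conversion would attempt.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly to UTF-8 succeeds and yields bytes suitable for a socket.
utf8_bytes = text.encode("utf-8")
```

If this is indeed the cause, encoding the text explicitly before passing it to sendall avoids the implicit conversion.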

The command 2:

self.request.sendall(json.dumps(reponse+"\n",encoding='utf8'))

produces output at the other end (the TCP client), but the last character of Beyoncé is wrong:

    "{\"candidats\":{\"P\":[{\"mentionname\":\"Beyonc\u00e9\",\"guess\":[{\"name\":\"BEYONC\u00c9\",\"score\":\"1.00\",\"eid\":\"72437\"}]}],\"E\":[]}}\n"

(The client decodes the received message with message.decode('UTF-8').)
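For context, \u00e9 is a standard JSON escape for é, so output of this kind is still valid JSON. Decoding the bytes as UTF-8 only turns bytes into text; it is parsing the text as JSON that resolves the escape. A minimal sketch, assuming the client can call json.loads on what it receives (wire and decoded are illustrative names):

```python
import json

# Bytes as they might arrive on the wire, with the ASCII-only JSON escape.
wire = b'{"mentionname": "Beyonc\\u00e9"}'

# Step 1: bytes -> text. Step 2: JSON text -> Python objects.
decoded = json.loads(wire.decode("utf-8"))
```

After parsing, decoded["mentionname"] holds the accented character rather than the escape sequence.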

The command 3:

self.request.sendall(json.dumps(reponse+"\n",ensure_ascii=False,encoding='utf8'))

results in error:

    'ascii' codec can't encode character u'\xe9' in position 49: ordinal not in range(128)

The command 4:

self.request.sendall(json.dumps(reponse+"\n").encode('utf8'))

produces output at the other end (the TCP client), but the last character of Beyoncé is wrong:

    "{\"candidats\":{\"P\":[{\"mentionname\":\"Beyonc\u00e9\",\"guess\":[{\"name\":\"BEYONC\u00c9\",\"score\":\"1.00\",\"eid\":\"72437\"}]}],\"E\":[]}}\n"

The command 5:

self.request.sendall(json.dumps(reponse+"\n",ensure_ascii=False).encode('utf8'))

produces output at the other end in which the last character of Beyoncé is correct, but the double quotes are escaped:

    "{\"candidats\":{\"P\":[{\"mentionname\":\"Beyoncé\",\"guess\":[{\"name\":\"BEYONCÉ\",\"score\":\"1.00\",\"eid\":\"72437\"}]}],\"E\":[]}}\n"

The last attempt is almost right, except for those annoying escaped double quotes. I know this happens because the string is encoded twice, but for now I see no alternative to using this approach and stripping the backslashes in my TCP client code.
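The double encoding can be reproduced in isolation. The sketch below assumes the data is available as a Python dict before serialization (payload, once, twice and wire are illustrative names, not from the original code): calling json.dumps on the dict serializes it once, while calling json.dumps on an already serialized string wraps it in quotes and escapes the inner ones.

```python
import json

# Assumption: the data exists as a dict, not as an already serialized string.
payload = {"mentionname": u"Beyonc\xe9"}

# Serialized once: plain JSON text with the accented character kept as is.
once = json.dumps(payload, ensure_ascii=False)

# Serialized twice: the JSON text itself becomes a quoted, escaped JSON string.
twice = json.dumps(once, ensure_ascii=False)

# Bytes for the socket: encode the singly serialized text, then add the newline.
wire = once.encode("utf-8") + b"\n"
```

Under that assumption, sending wire and parsing it on the client with json.loads would remove the need to strip backslashes by hand.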

Does anybody have a better solution? Any hint is greatly appreciated! Regards, Patrick


1 Answer


It seems you are putting the text into the response variable directly in the source. Try setting the source file encoding so that the first two lines of the source file read:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

More info can be found in PEP 0263.
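As a small illustration of what the declaration affects (a sketch with illustrative names, assuming the file is saved as UTF-8): it tells Python 2 how to decode non-ASCII literals in the source file. It does not change how json.dumps or the socket treat the resulting string.

```python
# -*- coding: utf-8 -*-
# With the declaration above, Python 2 accepts the non-ASCII literal below.
name = u"Beyoncé"

# The text has 7 characters; its UTF-8 encoding has 8 bytes (é takes two).
encoded = name.encode("utf-8")
print(len(name))
print(len(encoded))
```

Without the declaration, Python 2 rejects a source file containing non-ASCII bytes, as PEP 263 describes; Python 3 assumes UTF-8 by default.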

Donovan Solms
  • Thanks Donovan, but could you explain your answer in more detail? I don't understand it. – Patrick Feb 04 '15 at 16:55
  • If you meant removing the first 2 lines (the source file encoding), it does not change the results at all. But maybe that's not what you meant. – Patrick Feb 04 '15 at 17:28
  • I meant adding the `coding: utf8` piece, since the string is in the source file. Did the setting not work? – Donovan Solms Feb 04 '15 at 18:17
  • The coding:utf8 piece was already there. I thought you were saying to remove it. – Patrick Feb 04 '15 at 18:52