
I'm trying to send a message containing non-ASCII characters over a socket, using Python 2.7 (embedded in QGIS, a C++ program), to a Windows machine. The following code works well from a Linux client machine, but fails from a Windows client machine. Needless to say, I must make it work on both systems.

# -*- coding: utf-8 -*-

import socket

# message with non ASCII characters
message = u'Troço'

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('192.168.1.69',9991))

s.send(message)

s.close()

As I said, this works well on a Linux machine, and the unicode string reaches the socket receiver with the right message. But if I run it on a Windows machine, I get a UnicodeDecodeError:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128).

I have read several pages and answers on this topic:

They all seem to say that I must encode my unicode message to a byte string before sending it. But if I replace line 11 with the following:

s.send(message.encode('utf-8'))

Although I get no error message on either Windows or Linux, the received message looks wrong at that particular character, which makes me think it was incorrectly (or doubly) encoded somewhere, and that can only be inside the send() method.
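For reference, a minimal sketch of the explicit-encoding approach described above. The helper names `build_payload` and `send_text` are hypothetical, not part of my actual code, and the non-ASCII character is written as an escape so the result does not depend on the source file's encoding. It assumes the receiver expects UTF-8, which is only a default here:

```python
import socket


def build_payload(text, encoding='utf-8'):
    """Return the bytes to put on the wire for a unicode string."""
    # Encoding explicitly avoids any implicit ASCII conversion
    # when the data is handed to the socket.
    return text.encode(encoding)


def send_text(host, port, text, encoding='utf-8'):
    """Connect, send the encoded text, and close the socket."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.connect((host, port))
        # sendall() keeps sending until every byte has been written.
        s.sendall(build_payload(text, encoding))
    finally:
        s.close()


# u'Tro\xe7o' is the same string as u'Troço'.
message = u'Tro\xe7o'
payload = build_payload(message)
```

With this in place, the call would look like `send_text('192.168.1.69', 9991, message)`, and the encoding can be swapped by passing a different `encoding` argument.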

Which makes me wonder: is the socket.send() method affected by the operating system, or even by the operating system's default encoding?

UPDATE: Problem solved

The "problem" lay in the receiving code. I have no access to it, but after several tries I realized that it expects a UTF-16 encoded message. That's why sending a UTF-8 message gave bad results. Changing line 11 to the following did the trick:

s.send(message.encode('utf-16'))

I still have no clue why sending a unicode message worked on Linux but not on Windows, but it does not matter; it all makes a bit more sense now.
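The encoding mismatch can be reproduced without any socket. The sketch below is only an illustration of what a UTF-16 receiver would make of UTF-8 bytes; how the real receiver decodes (for instance, whether it relies on the byte order mark) is an assumption, since I cannot see its code:

```python
# u'Tro\xe7o' is the same string as u'Troço'.
message = u'Tro\xe7o'

utf8_bytes = message.encode('utf-8')
# The plain 'utf-16' codec prepends a 2-byte byte order mark (BOM)
# and uses the machine's native byte order.
utf16_bytes = message.encode('utf-16')
# 'utf-16-le' fixes the byte order and writes no BOM.
utf16_le_bytes = message.encode('utf-16-le')

# What a UTF-16 receiver sees when it is handed UTF-8 bytes:
# pairs of bytes get fused into unrelated characters.
garbled = utf8_bytes.decode('utf-16', 'replace')

# Matching encodings on both ends round-trip cleanly.
round_trip = utf16_bytes.decode('utf-16')
```

Here `garbled` comes out as three unrelated characters instead of five, while `round_trip` equals the original message.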

Alexandre Neto
  • The accepted answer to your reference http://stackoverflow.com/questions/9752521/sending-utf-8-with-sockets seems like it would work correctly. You've got a typo on the proposed encoding fix, where it should be `s.send(message.encode('utf-8'))`. Can you post a snippet with the code that consumes this string from the socket? Do you call `s.decode('utf-8')` when reading it? – Ricardo Garcia Silva Oct 06 '15 at 11:46
  • @RicardoGarciaSilva I have fixed the typo. The thing is, I have no control over the code that consumes the string from the socket. The only thing I know is that it works well on Linux if I send a unicode string, but does not work on Windows. Besides, the error on Windows occurs on the sending machine, not on the receiving one. – Alexandre Neto Oct 06 '15 at 13:06
  • Have you double-checked that the source file is still UTF-8 encoded? – Harry Johnston Oct 06 '15 at 22:17
  • Problem solved. The "problem" was in the code that consumes the string. It assumes you are sending a UTF-16 message and tries to decode it as such. – Alexandre Neto Oct 07 '15 at 08:18
