I'm trying to send a message containing non-ASCII characters over a socket using Python 2.7 (running inside QGIS, a C++ application) to a Windows machine. The following code works well from a Linux client machine, but fails from a Windows client machine. Needless to say, I need it to work on both systems.
# -*- coding: utf-8 -*-
import socket
# message with non ASCII characters
message = u'Troço'
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('192.168.1.69',9991))
s.send(message)
s.close()
As I said, this works well from a Linux machine, and the unicode message reaches the socket receiver intact. But if I run it on a Windows machine I get a UnicodeDecodeError:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128).
I have read several pages and answers on this topic:
- Sending UTF-8 with sockets
- https://docs.python.org/2/library/socket.html#socket.socket.recv
- How to handle Unicode (non-ASCII) characters in Python?
They all seem to say that I must encode my unicode message to a byte string before sending it. But if I replace the s.send(message) line with the following:
s.send(message.encode('utf-8'))
I don't get any error message on either Windows or Linux, but the received message looks wrong at that particular character, which makes me think that it was incorrectly (or doubly) encoded somewhere, and that can only be inside the send() method.
Which makes me think: is the socket.send() method affected by the operating system, or even by the operating system's default encoding?
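One way to test that hypothesis is a loopback check that sends already-encoded bytes through a local TCP connection and compares what arrives. The following is only a minimal sketch: the helper name loopback_roundtrip is made up for illustration, and it assumes a free port is available on 127.0.0.1.

```python
import socket
import threading


def loopback_roundtrip(payload):
    """Send payload over a local TCP connection; return the bytes received.

    Hypothetical helper for illustration, not part of the socket module.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(('127.0.0.1', 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = []

    def serve():
        # Accept one connection and read until the client closes it.
        conn, _ = server.accept()
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        conn.close()
        received.append(b''.join(chunks))

    worker = threading.Thread(target=serve)
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(('127.0.0.1', port))
    client.sendall(payload)  # sendall() keeps sending until everything is out
    client.close()

    worker.join()
    server.close()
    return received[0]


# u'Tro\xe7o' is the same text as u'Troço', written with an escape.
payload = u'Tro\xe7o'.encode('utf-8')
assert loopback_roundtrip(payload) == payload
```

If the assertion holds on both systems, the bytes handed to the socket arrive unchanged, and any garbling has to come from how the message is encoded before sending or decoded after receiving.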
UPDATE: Problem solved
The "problem" lay in the receiving code. I have no access to it, but after several tries I realized that it expects a UTF-16 encoded message. That's why sending a UTF-8 message gave bad results. Changing the s.send(message) line to the following did the trick:
s.send(message.encode('utf-16'))
I still have no clue why sending a unicode message worked on Linux but not on Windows, but it does not matter; it all makes a bit more sense now.
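The mismatch described in the update can be reproduced without any socket. This is only a sketch: it assumes the receiver decodes as little-endian UTF-16, which is a guess, since the receiving code is not available.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# u'Tro\xe7o' is the same text as u'Troço', written with an escape.
message = u'Tro\xe7o'

utf8_bytes = message.encode('utf-8')    # 6 bytes: the c-cedilla takes two
utf16_bytes = message.encode('utf-16')  # 12 bytes: 2-byte BOM + 5 * 2 bytes

print(len(utf8_bytes), repr(utf8_bytes))
print(len(utf16_bytes), repr(utf16_bytes))

# Decoding with the codec the sender used restores the original text.
assert utf8_bytes.decode('utf-8') == message
assert utf16_bytes.decode('utf-16') == message

# Decoding UTF-8 bytes as UTF-16 (assumed little-endian here) raises no
# error for this input, but pairs the 6 bytes into 3 unrelated characters.
garbled = utf8_bytes.decode('utf-16-le')
print(len(garbled), garbled != message)
```

Sender and receiver have to agree on one encoding. When they do not, the result can be silently wrong text rather than an exception, which matches the symptom seen with the UTF-8 attempt.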