I am trying to receive data on a socket. It will be a mix of UTF-8 and UTF-16 depending on what is sent to me. I am trying to find a way to detect whether each message is UTF-8 or UTF-16, but I am running into an issue.
data = b"\x00D\x00E\x00S\x00K\x00T\x00O\x00P\x00-\x00\x15\x04\x19\x04\x19\x04'\x04\x13\x04\x14\x04\x14\x04\x00\x00"
def is_ascii(s):
    return all(ord(c) < 128 for c in s)

def print_to_screen(data):
    if is_ascii(str(data)):
        print("RECV 8: " + data.decode())
    else:
        print("RECV 16: " + data.decode('utf-16'))
The data should be: DESKTOP-ЕЙЙЧГДД
It always prints as if the data were UTF-8. I am not sure whether I need to alter is_ascii or find another way to do this.
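A quick check of what is_ascii actually receives may help narrow this down. Calling str() on a bytes object returns its printable representation (the b'...' form), which is always ASCII, so the test passes regardless of the payload. The sketch below uses a shortened sample and a hypothetical helper, is_ascii_bytes, that inspects the raw bytes instead. Even that check passes here, because both bytes of each Cyrillic UTF-16 code unit (0x04 and a low byte such as 0x15) are below 0x80.

```python
def is_ascii_bytes(raw: bytes) -> bool:
    """Hypothetical helper: True if every byte value is below 0x80."""
    return all(b < 128 for b in raw)


# Shortened UTF-16-LE sample (assumed little-endian, no BOM): "D-" plus two Cyrillic letters.
sample = b"D\x00-\x00\x15\x04\x19\x04"

# str() on bytes gives the repr of the object, not a decoded string.
as_text = str(sample)
print(as_text)

# The repr contains only ASCII characters, so this is always True.
print(all(ord(c) < 128 for c in as_text))

# Checking the raw bytes is also True for this sample: no byte is >= 0x80.
print(is_ascii_bytes(sample))
```

So an ASCII test, in either form, does not seem able to tell this UTF-16 data apart from UTF-8.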
EDIT:
data = b"D\x00E\x00S\x00K\x00T\x00O\x00P\x00-\x00\x15\x04\x19\x04\x19\x04'\x04\x13\x04\x14\x04\x14\x04\x00\x00"
try:
    data = data.decode('utf-8')
except:
    data = data.decode('utf-16')
print(data)
This decodes only half of the data: it prints DESKTOP- but does not decode the other half.
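One likely reason the fallback never runs: every byte in this sample is below 0x80, so the whole buffer is also valid UTF-8 and the first decode succeeds. The NUL and control bytes then print as invisible characters after DESKTOP-. Below is a minimal sketch of one possible heuristic, not a general detector. It assumes UTF-16 messages are little-endian with no BOM and contain at least one ASCII-range character, and that UTF-8 messages never contain a NUL byte. The name decode_payload is hypothetical.

```python
def decode_payload(raw: bytes) -> str:
    """Hypothetical helper: guess UTF-16-LE vs UTF-8 from the presence of NUL bytes."""
    # Assumption: ASCII-range characters in UTF-16-LE carry a 0x00 high byte,
    # while UTF-8 text sent by the peer never contains 0x00.
    if len(raw) % 2 == 0 and b"\x00" in raw:
        text = raw.decode("utf-16-le")
    else:
        text = raw.decode("utf-8")
    # Drop a trailing NUL terminator if the sender included one.
    return text.rstrip("\x00")


data = b"D\x00E\x00S\x00K\x00T\x00O\x00P\x00-\x00\x15\x04\x19\x04\x19\x04'\x04\x13\x04\x14\x04\x14\x04\x00\x00"
print(decode_payload(data))
print(decode_payload(b"DESKTOP-1"))
```

The heuristic would misfire on a UTF-16 message made up entirely of non-Latin characters, so a length prefix or an encoding flag in the protocol would be more reliable if the sender can be changed.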