Decoding bytes to a string in Python

Question

I have a sequence of bytes: '\udcd0\udca0\udcd0\udcbe\udcd1\udc81\udcd0\udcbd\udcd0\udcb5\udcd1\udc84\udcd1\udc82\udcd1\udc8c'

If I do:

b'\udcd0\udca0\udcd0\udcbe\udcd1'.decode("utf8"),

I receive:

'\\udcd0\\udca0\\udcd0\\udcbe\\udcd1'

I can't decode it because I don't know how it was encoded. It apparently isn't UTF-8, because the symbols I expect to see have a \x23-like representation. How can I find the right codec and decode it?

P.S. I expect to see Russian characters there.

http://stackoverflow.com/questions/436220/python-is-there-a-way-to-determine-the-encoding-of-text-file — jkr, Nov 05 '16 at 21:28
@Jakub Thank you very much, but for some reason I can't install any of the suggested libraries. Are there any other ways? — , Nov 05 '16 at 21:39

jkr · Accepted Answer · 2016-11-05T21:53:00.667

0

I am able to print your string this way (Python 2 syntax), but the output is all "invalid characters":

>>> string = u'\udcd0\udca0\udcd0\udcbe\udcd1\udc81\udcd0\udcbd\udcd0\udcb5\udcd1\udc84\udcd1\udc82\udcd1\udc8c'
>>> print string
����������������

According to Charbase.com, your first character (u'\udcd0') is an invalid character, so the output may well be correct.
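One possibility worth checking: code points in the range U+DC80..U+DCFF are what Python 3 produces when bytes 0x80..0xFF are decoded with the "surrogateescape" error handler (for example, non-ASCII file names or arguments read under an ASCII locale). If that is where this string came from, which is an assumption and not something the question confirms, the original bytes can be recovered by encoding with the same handler and then decoding as UTF-8. A minimal Python 3 sketch, where recover_text is a hypothetical helper name and UTF-8 is a guess at the real encoding:

```python
def recover_text(escaped: str, encoding: str = "utf-8") -> str:
    """Undo a surrogateescape decode, then decode the raw bytes properly.

    Hypothetical helper: `encoding` is an assumption about the original data.
    """
    # Each lone surrogate U+DC80..U+DCFF is turned back into byte 0x80..0xFF.
    raw = escaped.encode(encoding, errors="surrogateescape")
    # Decode the recovered bytes; raises UnicodeDecodeError if the guess is wrong.
    return raw.decode(encoding)


# The string from the question, written as a str (not a bytes literal).
escaped = ('\udcd0\udca0\udcd0\udcbe\udcd1\udc81\udcd0\udcbd'
           '\udcd0\udcb5\udcd1\udc84\udcd1\udc82\udcd1\udc8c')

recovered = recover_text(escaped)  # 'Роснефть' when the bytes are UTF-8
```

Under that assumption no third-party library is needed. If the final decode raises UnicodeDecodeError, the bytes are probably in a different encoding and another codec name would have to be tried.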

edited Nov 05 '16 at 21:53

answered Nov 05 '16 at 21:50

jkr
