I would like some clarification on Unicode and str methods in Python. After reading some explanations of Unicode, I still have a couple of doubts that I hope folks can help me with:
Am I right to say that when declaring a unicode string, e.g. word = u'foo', Python uses the encoding of the terminal (e.g. UTF-8) to decode foo, and assigns word the hex representation in Unicode?
So, in general, is the process of printing out the characters in a file always to decode the byte stream into its Unicode representation according to the encoding, and then display the mapped characters?
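To make the second question concrete, this is the round trip I have in mind. It is only a sketch: the in-memory stream stands in for a real file, and the sample bytes and the choice of UTF-8 are my own assumptions.

```python
import io

# Stand-in for a file on disk: the UTF-8 bytes for "café" plus a newline.
stream = io.BytesIO(b'caf\xc3\xa9\n')

data = stream.read()          # raw bytes, exactly as stored
text = data.decode('utf-8')   # bytes -> Unicode code points

# The byte count and the character count differ, because é takes
# two bytes in UTF-8 but is a single code point.
print(len(data), len(text))
```

Is this decode step what always happens, explicitly or implicitly, before characters are displayed?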
In my terminal, why does 'é'.lower() or str('é') display as the hex escapes '\xc3\xa9', whereas 'a'.lower() does not?
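Here is a minimal sketch that I believe reproduces what I am seeing. It assumes a UTF-8 terminal, and I build the bytes explicitly with encode() rather than typing é so that the snippet does not depend on the source-file encoding.

```python
from __future__ import print_function

text = u'\u00e9'             # the single code point U+00E9 (é)
raw = text.encode('utf-8')   # the bytes a UTF-8 terminal would send for é

# repr() is what the interactive prompt echoes back.
# Python 2 shows '\xc3\xa9'; Python 3 shows b'\xc3\xa9'.
print(repr(raw))
print(repr(raw.lower()))     # lower() leaves the non-ASCII bytes untouched

# An ASCII-only byte string is echoed without hex escapes.
print(repr(u'a'.encode('utf-8').lower()))

print(len(text), len(raw))   # one code point versus two bytes
```

So the two escaped values seem to be the UTF-8 bytes of é, but I would like to understand why they are shown that way while a is shown as is.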