In the Python 2 REPL:
>>> import sys
>>> sys.stdin.encoding
'UTF-8'
My understanding is that when I enter the expression below on stdin,
>>> stringLiteral = 'abc'
the interpreter reads the expression from stdin in UTF-8 encoding and then interprets the code.
But I learnt that, in Python 2, the str type stores 'abc' as a byte string, and that internally CPython stores it as a null-terminated C char * string (i.e. an array of bytes terminated by \0).
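To make concrete what I mean by "byte string", here is a minimal sketch (raw and byte_values are just names I picked, and bytearray is used only so the snippet behaves the same on Python 2 and Python 3):

```python
# On Python 2, b'abc' and 'abc' are the same str type; on Python 3, b'abc' is bytes.
raw = b'abc'

# bytearray exposes the individual byte values as integers on both versions.
byte_values = list(bytearray(raw))

print(byte_values)  # [97, 98, 99]
print(len(raw))     # 3
```

So for pure ASCII text, each character appears to occupy exactly one byte.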
What encoding scheme does the str class use to store 'abc' in memory? And what decoding scheme does str use when 'abc' is printed?
Based on the answer, if I enter the expression below:
>>> stringLiteralNonAsciiRange = 'abc정정💛'
then why does stringLiteralNonAsciiRange not print 정정💛? Why is the output 'abc\xec\xa0\x95\xec\xa0\x95\xf0\x9f\x92\x9b'?
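As far as I can tell, the escaped bytes in that output are exactly the UTF-8 encoding of the non-ASCII characters. A minimal sketch that checks this (text, encoded and expected are names I made up; the characters are written as escapes so the snippet runs on both Python 2 and Python 3 without a source-encoding declaration):

```python
# 'abc' followed by U+C815 twice and U+1F49B, written with escapes (ASCII-only source).
text = u'abc\uc815\uc815\U0001f49b'

# UTF-8 encoding of the text, as a byte string.
encoded = text.encode('utf-8')

# The bytes shown in the REPL output above.
expected = b'abc\xec\xa0\x95\xec\xa0\x95\xf0\x9f\x92\x9b'

print(encoded == expected)              # True
print(len(encoded))                     # 13 (3 + 3 + 3 + 4 bytes)
print(encoded.decode('utf-8') == text)  # True
```

So the bytes themselves look right; my question is only about why the REPL shows them escaped instead of as characters.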