How are strings stored by Python in computers?

Question

I believe most of you who are familiar with Python have read Dive Into Python 3. In chapter 4.3, it says this:

In Python 3, all strings are sequences of Unicode characters. There is no such thing as a Python string encoded in UTF-8, or a Python string encoded as CP-1252. “Is this string UTF-8?” is an invalid question.

I more or less understand what this means: strings are characters from the Unicode set, and Python can encode those characters using different encodings. However, aren't characters in Python stored as bytes in the computer anyway? For example, given s = 'strings', s is surely stored in my computer as a byte stream '0100100101...' or whatever. So which encoding is used here? Is it Python's "default" encoding?

Thanks!

Is there any other way to store _anything_ in anything else than bytes on a computer? — Kimvais, Mar 15 '12 at 08:13
The same question has already been asked: http://stackoverflow.com/questions/1838170/what-is-internal-representation-of-string-in-python-3-x — citxx, Mar 15 '12 at 08:14

Joey · Accepted Answer · 2012-03-15T08:29:06.983

Python 3 distinguishes between text and binary data. Text is guaranteed to be in Unicode, though no specific encoding is specified, as far as I could see. So it could be UTF-8, or UTF-16, or UTF-32¹ – but you wouldn't even notice.

The main point here is: You shouldn't even care. If you want to deal with text, then use text strings and access them by code point (which is the number of a single Unicode character and independent of the internal UTF – which may organise code points in several smaller code units). If you want bytes, then use b"" and access them by byte. And if you want to have a string in a byte sequence in a specific encoding, you use .encode().
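To make the distinction concrete, here is a minimal sketch, assuming any Python 3 interpreter. The sample string and the variable names are made up for illustration. It shows that a text string is indexed by code point, while the byte count only appears once you pick an encoding yourself:

```python
# Text: a sequence of Unicode code points ("naïve €5", written with
# escapes so the example does not depend on the source file's encoding).
text = "na\u00efve \u20ac5"

print(len(text))        # number of code points, not bytes
print(ord(text[2]))     # code point of 'ï' (U+00EF)

# Bytes only exist once an encoding is chosen explicitly.
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
utf32 = text.encode("utf-32-le")
print(len(utf8), len(utf16), len(utf32))

# Decoding with the same encoding gives back the same text.
print(utf8.decode("utf-8") == text)

# Binary data: indexing a bytes object yields integers, not characters.
raw = b"abc"
print(raw[0])
```

The same text produces byte sequences of different lengths depending on the encoding, while len(text) stays the same. None of this reveals how the interpreter stores the string internally, which is the point of the answer.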

¹ Or even UTF-9, if someone is insane enough to implement Python on a PDP-10.

I have read the following chapters and I understand now. I shouldn't even care. This is a good point, thanks. — endless, Apr 01 '12 at 00:40
