I was reading Python guide about Unicode. In this section, it says:
To summarize the previous section: a Unicode string is a sequence of code points, which are numbers from 0 to 0x10ffff. This sequence needs to be represented as a set of bytes (meaning, values from 0-255) in memory. The rules for translating a Unicode string into a sequence of bytes are called an encoding.
The first encoding you might think of is an array of 32-bit integers. In this representation, the string “Python” would look like this:
P y t h o n
0x50 00 00 00 79 00 00 00 74 00 00 00 68 00 00 00 6f 00 00 00 6e 00 00 00
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Why might we think of 32-bit integers if code points are numbers from 0 to 0x10ffff? Maybe is it assuming that we are on a 32-bit system?