There are different encodings of the same (standardized) Unicode table. For example, the letter A is the code point U+0041: UTF-8 encodes it as the single byte 0x41, while UTF-16 encodes it as the 16-bit unit 0x0041 (a UTF-16 file may additionally begin with the byte-order mark 0xFEFF, which is why one sometimes sees 0xFEFF 0x0041 at the start of such a file).
From this brilliant article I have learned that when I program in C++ for the Windows platform and deal with Unicode, a character is represented in 2 bytes. But the article does not say anything about the encoding. (It does say that x86 CPUs are little-endian, so I know in what order those two bytes are stored in memory.) But I should also know the encoding of the Unicode text so that I have complete information about how the symbols are stored in memory. Is there any fixed Unicode encoding for C++/Windows programmers?