C++20 added char8_t, which is (I believe) designed to help support UTF-8 better. String constants of the form u8"abc" are required by the standard to be valid UTF-8 in a char8_t[] array. These constants can also be turned into std::u8strings.

However, I can find nothing in the C++ standard which suggests that a std::u8string either must, or even should, contain a UTF-8 string. Is there in practice any difference between a std::string and std::u8string in terms of UTF-8 support?