An ordinary character literal (one without prefix) usually has type char
and can store only elements of the execution character set that are representable as a single byte.
If the character is not representable in this way, the character literal is only conditionally-supported with type int
and implementation-defined value. Compilers typically warn when this happens with some of the generic warning flags since it is a mistake most of the time. That might depend on what warning flags exactly you have enabled.
A byte is typically 8 bits and therefore it is impossible to store all of unicode in it. I don't know what execution character set your implementation uses, but clearly neither ©
nor ™
are in it.
It also seems that your implementation chose to support the non-representable character by encoding it in UTF-8 and using that as the value of the literal. You are seeing a representation of the numeric value of the UTF-8 encoding of the two characters.
If you want the numeric value of the unicode code point for the character, then you should use a character literal with U
prefix, which implies that the value of the character according to UTF-32 is given with type char32_t
, which is large enough to hold all unicode code points:
std::cout << std::hex << static_cast<std::uint_least32_t>(U'©');