In C++11 and later, using the u8 prefix on a string literal can create char (byte) sequences that are UTF-8 encoded.

How do you output those sequences to a std::ostream? How do you tell a std::ostream that a const char * or std::string being output contains characters encoded in UTF-8, rather than in the default encoding?

Remy Lebeau
Raedwald
  • With a mixture of depression, bemusement, and anger, I am learning that [support for Unicode in standard C++ is terrible](https://stackoverflow.com/a/17106065/545127). – Raedwald Dec 12 '17 at 22:57

1 Answer

You don't. The stream neither knows nor cares what the encoding of the text is. Despite its name, a char is not treated by std::ostream as a character encoded in the platform encoding; it is just treated as a byte to be written out. The stream writes the "text" (byte sequence) as given (apart from possibly performing \n translation), assuming you don't imbue it with a facet that changes this. If you write bytes that form valid UTF-8, then that's what ends up in the output.
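As a rough illustration of the point above, here is a minimal sketch that writes UTF-8 bytes to a stream and reads them back unchanged. The helper names (`utf8_sample`, `write_utf8_sample`, `round_trip`) are hypothetical, chosen for this example only. It assumes the default "C" locale with no custom facet imbued, and it spells the bytes with explicit escapes rather than a u8 literal because, from C++20 on, `u8"..."` has type `const char8_t[]` and cannot be passed to `operator<<` on a `std::ostream`.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// "é" (U+00E9) is encoded in UTF-8 as the two bytes 0xC3 0xA9.
// The sample is therefore 6 bytes: 'c', 'a', 'f', 0xC3, 0xA9, '\n'.
inline std::string utf8_sample() {
    return "caf\xC3\xA9\n";
}

// Hypothetical helper: writes the sample to any narrow-character stream.
// The stream is given no hint about the encoding; it only sees chars.
inline void write_utf8_sample(std::ostream& out) {
    out << utf8_sample();
}

// Writes the sample to an in-memory stream and returns what was stored,
// so the bytes that went in can be compared with the bytes that came out.
inline std::string round_trip() {
    std::ostringstream buffer;
    write_utf8_sample(buffer);
    assert(buffer.good());  // an in-memory write is not expected to fail
    return buffer.str();
}
```

Calling `write_utf8_sample(std::cout)` would send the same bytes to standard output. Whether they are then displayed as "café" depends on the terminal or file viewer interpreting them as UTF-8, not on the stream.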

Raedwald
Nicol Bolas
  • Are you saying that a `std::ostream` treats a `std::string` as a sequence of bytes (one byte per `char`), rather than a sequence of characters in the platform encoding? – Raedwald Dec 08 '17 at 00:05
  • @Raedwald: Yes. It's just a sequence of `char`. What else would it treat the input as? – Nicol Bolas Dec 08 '17 at 00:16