Recently I had a problem porting a Windows application to Linux because of the difference in wchar_t size between the two platforms. I tried to use compiler switches, but there were problems with printing those characters (I presume that GCC's wcout assumes every wchar_t is 32 bits).

So, my question: is there a nice way to (w)cout a char16_t? I ask because it doesn't work as-is; I'm forced to cast it to wchar_t:

wcout << (wchar_t) c;

It doesn't seem like a big problem, but it bugs me.
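
To make the problem concrete, here is a minimal sketch of what I mean (the value u'A' is just for illustration; it assumes c holds a character from the Basic Multilingual Plane, so it fits in a single UTF-16 code unit):

#include <iostream>

int main()
{
    char16_t c = u'A';
    std::wcout << c;             // no char16_t overload, so this prints the numeric value of c
    std::wcout << (wchar_t) c;   // prints the character, relying on wchar_t being at least 16 bits
}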

asked by NoSenseEtAl · edited by Marc Mutz - mmutz
  • What exactly is causing the size problem (cause there shouldn't be any if used correctly)? Casting to `wchar_t` won't work. – Šimon Tóth Apr 10 '11 at 12:48
  • wchar_t is 32 bits in GCC and 16 bits on Windows, and the asm lib that I used was written presuming a 16-bit wchar_t. So I decided to escape the portability problems by using a type that is guaranteed to be 16 bits. And it works, but wcout won't print char16_t. – NoSenseEtAl Apr 10 '11 at 12:54
  • @NoSenseEtAl Yes, Windows is breaking the standard with a 16-bit `wchar_t` and UTF-16 encoding. But that has nothing to do with size assumptions. 64-bit systems have different-sized types than 32-bit systems, but that doesn't mean that your code will break on them. The lib you are mentioning works on Linux? – Šimon Tóth Apr 10 '11 at 13:00
  • What exactly are you trying to do? Does your output (terminal?) even expect 2- or 4-byte characters? If it's text processing and your terminal expects UTF8, maybe better to convert your data stream into UTF8 and just emit ordinary chars. – Kerrek SB Apr 10 '11 at 13:17
  • The lib works on Linux; it is asm code, and the point is that its functions return pointers to wchar arrays, presuming that wchar is 16 bits. To be honest I prefer it that way. BTW my question is about couting char16_t, not about making my code work on Linux. I solved that by using char16_t. My question is how to cout a char16_t, because wcout doesn't work. – NoSenseEtAl Apr 10 '11 at 13:20
  • @Let_Me_Be - Windows (like Java) isn't breaking any standards, as 16 bits **was** the standard when those systems were designed. You can't blame them for Unicode standards changing afterwards! – Bo Persson Apr 10 '11 at 19:23
  • @Bo Java can't logically break the C++ standard, since it's Java. Windows' implementation of C++ can. And btw, old versions of Windows didn't break the standard, since they used 16 bits with the UCS-2 encoding (which is perfectly OK). – Šimon Tóth Apr 10 '11 at 19:40
  • @Let_Me_Be - I assumed that was about the Unicode standard, as you cannot easily "break" the C++ standard, which doesn't say anything about the size or encoding of a wchar_t. – Bo Persson Apr 10 '11 at 20:58
  • @Bo The C++ standard requires one character to be represented by one `wchar_t`. Microsoft ignores the entire C part of the C++ standard and also redefines "string length" to mean the number of `wchar_t`, not the number of characters. This was already discussed many times here on Stack Overflow. – Šimon Tóth Apr 10 '11 at 21:07
  • OK, that MS part is all cool and interesting and infuriating, but does anybody know how to (w)cout a char16_t? :D BTW I blame the standard, not MS; it's the same stuff as with long... there should be a 64-bit integer, there should be a 16-bit char... – NoSenseEtAl Apr 11 '11 at 00:36
  • @NoSenseEtAl - There is a limitation here, as you have discovered. We get new char types char16_t and char32_t, plus std::strings with those characters. However, we still only have cout and wcout, which don't work directly for those character types. Nobody proposed enough extensions to iostreams and locales to make that happen. – Bo Persson Apr 11 '11 at 07:02
  • @Bo Persson Tnx for the answer. That is awful... I mean it's really bad... not bad, but really ugly. – NoSenseEtAl Apr 11 '11 at 09:49
  • I agree. It's utterly ludicrous. – Lightness Races in Orbit Jul 07 '11 at 22:47
  • The inability to print char16_t and char32_t is really embarrassing for C++11. u16cout and u32cout are badly needed. – Ricky65 Oct 04 '11 at 23:28
  • Couldn't agree more... this is just awful... using pointers: void print_char16_t_array(const char16_t * str) { size_t len=char_traits<char16_t>::length(str); assert(len<=1024); for(int i=0;i<len;i++) ... } – NoSenseEtAl Oct 05 '11 at 10:13
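
A completed version of the helper outlined in the last comment might look like the sketch below (the comment is cut off, so the loop body is a guess: it widens each UTF-16 code unit to wchar_t, exactly like the cast in the question, and therefore only handles BMP characters):

#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>

void print_char16_t_array(const char16_t* str)
{
    std::size_t len = std::char_traits<char16_t>::length(str);
    assert(len <= 1024);
    for (std::size_t i = 0; i < len; ++i)
        std::wcout << static_cast<wchar_t>(str[i]);   // same per-code-unit cast as in the question
}

int main()
{
    print_char16_t_array(u"example");
}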

1 Answer

Give this a try:

#include <locale>
#include <codecvt>
#include <string>
#include <iostream>

int main()
{
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t> > myconv;
    std::wstring ws(L"Your UTF-16 text");
    std::string bs = myconv.to_bytes(ws);
    std::cout << bs << '\n';
}
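
For char16_t data specifically, a similar conversion should also work (a minimal sketch along the same lines; to_bytes also has an overload that takes a single char16_t, which covers the `wcout << c` case from the question):

#include <locale>
#include <codecvt>
#include <string>
#include <iostream>

int main()
{
    // Convert UTF-16 (char16_t) to UTF-8 and print the bytes through the narrow stream.
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> myconv;
    std::u16string us(u"Your UTF-16 text");
    std::cout << myconv.to_bytes(us) << '\n';
    std::cout << myconv.to_bytes(u'!') << '\n';   // single-character overload
}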
answered by Howard Hinnant
  • So you are saying that there is no std function that can print a char16_t? I mean without conversions... It's strange, especially if you consider that it took them half the age of the universe to finalize the C++0x revision. – NoSenseEtAl Apr 11 '11 at 07:41
  • We were waiting for you to design, test, implement, get field experience, propose, and then shepherd the proposal through the standardization process. We were busy doing other stuff. – Howard Hinnant Apr 11 '11 at 14:41
  • Oh, the flame wars... like I said, it's not a core language feature, and it is sad and funny at the same time that you can't print out one of the built-in types. Does Boost have a built-in function for printing char16_t? – NoSenseEtAl Apr 11 '11 at 19:25
  • yeah, if only `#include <codecvt>` would actually work with `gcc` that would be great ;) – Paweł Prażak Feb 26 '16 at 18:19