std::stringstream ss;
ss << "hello " << L"world";
std::cout << ss.str();
The above code compiles fine, but my tests show that the wide string literal falls back to being serialized as a pointer value. Then I did:
std::wstringstream wss;
wss << "hello " << L"world";
std::wcout << wss.str() << std::endl;
After changing to std::wstringstream, I was expecting the situation to reverse (getting the pointer value of "hello "), but apparently the string is fine when printed with std::wcout.
It turns out this is because std::basic_ostream has a specific overload for const char* that widens each char individually.
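A minimal sketch of that behaviour, assuming the default "C" locale (the function name wide_with_narrow_literal is only illustrative): the narrow literal is widened char by char into the wide stream, so the result holds the text rather than a pointer value.

```cpp
#include <cassert>   // only needed for the check in the usage note
#include <sstream>
#include <string>

// Illustrative helper (not part of any library): repeats the second snippet
// and returns the stream contents instead of printing them.
std::wstring wide_with_narrow_literal() {
    std::wstringstream wss;
    // "hello " is a const char*; the wide stream's const char* overload
    // widens each char before inserting it.
    wss << "hello " << L"world";
    return wss.str();
}
```

With this, assert(wide_with_narrow_literal() == L"hello world"); should hold, since both literals only contain basic Latin characters.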
In what situations is this 'no-conversion' benign? It seems to me that it is only OK for basic Latin. UTF-8 will not convert correctly, and language-specific extensions (extended ASCII) probably will not convert benignly this way either. By the same token, you cannot assign a std::string to a std::wstring (directly at least), and that is a good thing. It starts to look like a bad idea to allow this at all, so why is it allowed, and why would allowing it be a good idea? And why not make the wchar_t* case ill-formed in the first example?
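To illustrate why this per-char widening cannot be right for UTF-8, here is a rough sketch that mimics it outside of a stream. The helper name widen_each is made up for this example, and it assumes the classic "C" locale. The values produced for non-ASCII bytes depend on the implementation, so the sketch only looks at element counts.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <locale>
#include <string>

// Illustrative helper: widens a narrow string one char at a time with the
// classic "C" locale's ctype facet. Each input byte yields exactly one
// wchar_t, so multi-byte sequences are never combined into one code point.
std::wstring widen_each(const std::string& narrow) {
    const std::ctype<wchar_t>& facet =
        std::use_facet<std::ctype<wchar_t> >(std::locale::classic());
    std::wstring wide;
    for (char c : narrow) {
        wide.push_back(facet.widen(c));
    }
    return wide;
}
```

For basic Latin the result matches the wide literal, e.g. widen_each("hello") == L"hello". For U+00E9 encoded as UTF-8 ("\xC3\xA9"), the result has two elements, whereas the correct wide string L"\u00E9" has one, so the two can never compare equal.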
With C++20 there's a UTF-8 char type coming (char8_t), which can help solve some of these issues to a large extent if we use that string type wherever UTF-8 is involved. But that isn't the case here, and probably will not be if you are converting to UTF-8 on a large scale.
At first it seems that I cannot use std::stringstream (in use now) and am forced to convert to std::wstringstream to avoid the pointer fallback issue (we are dealing with legacy code). On the other hand, std::wstringstream does not convert UTF-8 correctly, which is even worse, since it may seemingly work for basic Latin and the issue may then stay hidden for a long time. It also feels like overkill to reinvent all the stream classes etc. for this particular issue (and that comes with its own set of pitfalls). So I guess the last question is whether anyone else has had this issue and what, if any, is a sound solution, e.g. one that makes the program ill-formed in the malign situations?
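To make concrete what "ill-formed in the malign situations" could look like, here is one possible direction as a rough sketch only. Both is_wide_text and checked_insert are hypothetical names invented for this example, not existing library facilities. It also requires changing the call sites, which may not be acceptable for legacy code.

```cpp
#include <cassert>   // only needed for the check in the usage note
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical trait: true for wchar_t and for pointers or arrays of
// (possibly const) wchar_t.
template <class T>
struct is_wide_text
    : std::is_same<typename std::remove_cv<typename std::remove_pointer<
                       typename std::decay<T>::type>::type>::type,
                   wchar_t> {};

// Hypothetical wrapper around operator<< for narrow streams: wide text is
// rejected at compile time instead of reaching the stream at all.
template <class T>
std::ostream& checked_insert(std::ostream& os, const T& value) {
    static_assert(!is_wide_text<T>::value,
                  "wide text inserted into a narrow stream; convert it explicitly");
    return os << value;
}
```

With a std::ostringstream os, the calls checked_insert(os, "hello ") and checked_insert(os, 42) compile and leave "hello 42" in the stream, while checked_insert(os, L"world") trips the static_assert.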
I'm interested in any answer that tries to answer at least one of these 3 questions.