So I already know how to convert wstring to string (How to convert wstring into string?).

However, I would like to know whether the conversion is safe, meaning that the wstring variable does not contain any characters that the string type cannot represent.

idanshmu
  • Just convert it with whatever standard method you are using (wcstombs or wstring_convert). You will get back the length of the prefix that was actually converted. If it's the entirety of your wstring, you are set. If it is not, you know exactly where the problematic character is. – n. m. could be an AI Oct 26 '14 at 09:56

1 Answer

A string can hold any data if you use the right encoding; it is just a sequence of bytes. Whether a given conversion is lossless depends on the particular encoding and conversion routine you use.

This should simply be a matter of round-tripping, an elegant solution to many things.

Warning: this is pseudo-code. There is no literal convert_to_wstring() unless you write one:

if(convert_to_wstring(convert_to_string(ws)) == ws)
    happy_days();

If what goes in comes out, it is non-lossy (at least for your code points).

Not that it's the most efficient solution, but it should allow you to build on your favorite conversion routines.

// Round-trip and see if we lose anything
bool check_ws2s(const std::wstring& wstr)
{
    return (s2ws(ws2s(wstr)) == wstr);
}

Using @dk123's conversions for C++11 from How to convert wstring into string? (upvote his answer here: https://stackoverflow.com/a/18374698/257090):

#include <codecvt>
#include <locale>
#include <string>

std::wstring s2ws(const std::string& str)
{
    typedef std::codecvt_utf8<wchar_t> convert_typeX;
    std::wstring_convert<convert_typeX, wchar_t> converterX;

    return converterX.from_bytes(str);
}

std::string ws2s(const std::wstring& wstr)
{
    typedef std::codecvt_utf8<wchar_t> convert_typeX;
    std::wstring_convert<convert_typeX, wchar_t> converterX;

    return converterX.to_bytes(wstr);
}
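
As far as I know, `std::wstring_convert` reports a failed conversion by throwing `std::range_error` unless it was constructed with error strings, so a round-trip check built on the helpers above may want to catch that. Here is a minimal sketch of such a check. The name `round_trips_utf8` is hypothetical, and note that `std::wstring_convert` and `<codecvt>` are deprecated as of C++17, although most toolchains still ship them:

```cpp
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if wstr survives wide -> UTF-8 -> wide unchanged.
// A conversion failure in either direction is reported as "not safe"
// instead of escaping as an exception.
bool round_trips_utf8(const std::wstring& wstr)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> converter;
    try {
        const std::string bytes = converter.to_bytes(wstr);
        return converter.from_bytes(bytes) == wstr;
    } catch (const std::range_error&) {
        // wstring_convert throws range_error when it cannot convert the input
        return false;
    }
}
```

Since UTF-8 can encode every valid code point, this particular check should mostly fail only for invalid input; a round trip through a narrower encoding is where it becomes more selective.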

Note, if your idea of conversion is truncating the wide chars to chars, then it is simply a matter of iterating and checking that each wide char value fits in a char. This will probably do it.

WARNING: Not appropriate for multibyte encoding.

for(wchar_t wc: ws) {
    // If the narrowing cast changes the value, wc does not fit in a char
    if(wc != static_cast<wchar_t>(static_cast<char>(wc)))
        return false;
}
return true;

Or:

// Could use a narrowing cast comparison, but this avoids any warnings
for(wchar_t& wc: ws) {
    if(wc > std::numeric_limits<char>::max())
        return false;
}
return true;
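
The loop above is a fragment; wrapped in a function it might look like the sketch below. The name `fits_in_char` is hypothetical. The explicit check for negative values is my addition, on the assumption that `wchar_t` may be a signed type on some platforms. Keep in mind that `std::numeric_limits<char>::max()` is 127 where `char` is signed and 255 where it is unsigned.

```cpp
#include <limits>
#include <string>

// Hypothetical helper: true if every wide character has a value in
// [0, max char], so truncating each one to char would not change it.
// Not appropriate for multibyte encodings.
bool fits_in_char(const std::wstring& ws)
{
    const long long max_char = std::numeric_limits<char>::max();
    for (wchar_t wc : ws) {
        const long long value = static_cast<long long>(wc);
        if (value < 0 || value > max_char)
            return false;
    }
    return true;
}
```

For example, `fits_in_char(L"hello")` is true, while a string containing `L'\u4e2d'` is rejected because its value exceeds any `char` maximum.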

FWIW, in Win32 there are conversion routines that accept a WC_ERR_INVALID_CHARS flag, which tells the routine to fail instead of silently dropping code points. These are non-standard solutions, of course.

Example: WideCharToMultiByte()

http://msdn.microsoft.com/en-us/library/windows/desktop/dd374130(v=vs.85).aspx

codenheim
  • Perhaps you could help me here. I run into an issue when the `wstring` variable contains a `\0` char: the conversion stops, so the destination `string` may, in such cases, be shorter than the source `wstring`. I'm not using your implementation, since I get `exception: std::range_error at memory location 0x0021F044.` when running it. Instead I use `mbstowcs_s`. Any suggestions? – idanshmu Oct 26 '14 at 10:34
  • Which method gave you the error, and with what input string? – codenheim Oct 26 '14 at 13:36
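
On the embedded `\0` question raised in the comments: the C-style routines treat their input as null-terminated, so they stop at the first `\0`. The `std::wstring` overload of `std::wstring_convert::to_bytes`, by contrast, should convert the full length of the string. This is a minimal sketch under that assumption; the name `to_utf8_keep_nulls` is hypothetical, and it targets UTF-8, not a locale encoding:

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: converts the whole wstring to UTF-8, including
// any embedded L'\0' characters, because the std::wstring overload of
// to_bytes works on the string's size and not on a terminator.
std::string to_utf8_keep_nulls(const std::wstring& ws)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> converter;
    return converter.to_bytes(ws);
}
```

To try it, build the input with an explicit length, for example `std::wstring ws(L"ab\0cd", 5);`, since constructing from a bare literal would itself stop at the `\0`.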