
I have the following code:

#include <iostream>
#include <string>
#include <locale>
#include <codecvt>
using namespace std;


int main()
{
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;

    const char val[] = "+3°C";
    wstring text = converter.from_bytes(val);

    return 0;
}

The problem is that the call to converter.from_bytes throws an exception. Why? How should I convert the given string?

The exception is of type std::range_error with the message

bad conversion


The problem is related to the character '°': if I remove that character, the conversion works fine.

Nick

1 Answer


My guess would be that the string literal "+3°C" is not UTF-8 encoded because your IDE is using a different source character set.

You can only embed the character ° directly into the source code like this if the source file itself is UTF-8 encoded. If the file uses some Windows codepage that represents ° differently, the compiler embeds one or more bytes into the string that do not form a valid UTF-8 sequence, so the conversion from UTF-8 to UTF-16 fails.
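A quick way to verify this is to dump the bytes that actually ended up in the literal (a minimal sketch; the exact bytes depend on how your editor saved the file). In a UTF-8 source, ° appears as the two bytes C2 B0; in a Windows-1252 source it is typically the single byte B0, which cannot appear there in valid UTF-8.

#include <cstdio>

int main()
{
    const char val[] = "+3°C";

    // Print each byte of the literal in hex. UTF-8 encodes ° as C2 B0;
    // Windows-1252 encodes it as the single byte B0, which is not a valid
    // start of a UTF-8 sequence. (The trailing 00 is the null terminator.)
    for (unsigned char c : val)
        std::printf("%02X ", static_cast<unsigned>(c));
    std::printf("\n");

    return 0;
}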

It works fine in a live demo such as http://coliru.stacked-crooked.com/a/23923c288ed5f9f3 because that runs on a different OS where the compiler assumes source files use UTF-8 by default (which is standard for GNU/Linux and other platforms with saner handling of non-ASCII text).

Try replacing it with a UTF-8 literal u8"+3\u2103" (using the universal character name for the DEGREE CELSIUS character) or u8"+3\u00B0C" (using the universal character name for the DEGREE SIGN character and then a capital C).

That tells the compiler that you want a string containing the UTF-8 representation of exactly those Unicode characters, independent of the encoding of the source file itself.
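Putting it together, here is a sketch of the fixed program (assuming pre-C++20; in C++20 a u8 literal has type const char8_t[] and would need extra handling, and std::wstring_convert/std::codecvt_utf8_utf16 are deprecated since C++17, though still available):

#include <iostream>
#include <string>
#include <locale>
#include <codecvt>
using namespace std;

int main()
{
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;

    // The u8 prefix plus a universal character name guarantees that val
    // holds the UTF-8 encoding of "+3°C", regardless of the source file's
    // own encoding.
    const char val[] = u8"+3\u00B0C";
    wstring text = converter.from_bytes(val);   // no longer throws

    std::wcout << text.size() << L" UTF-16 code units\n";

    return 0;
}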

Jonathan Wakely