
I am aware of the post "Converting managed System::String to std::string in C++/CLI", which covers the required conversion. However, I came across the following code, which uses marshal_context instead, and I am trying to understand how it works.

// required header : #include <msclr/marshal.h>
System::String^ str = gcnew System::String(L"\u0105");
msclr::interop::marshal_context ctx;
auto constChars = ctx.marshal_as<const char*>(str);
std::string myString(constChars);

If I am not wrong, str is a single character stored as one 16-bit UTF-16 code unit, which according to the Unicode list is U+0105, Latin small letter a with ogonek (ą). But myString comes out as the single character ?. How does this conversion happen?

Moreover, why does the code work as "expected" when str is created from an ASCII character, say a? In UTF-16, a is also represented in 16 bits, with the high byte (stored first or second, depending on endianness) being all 0. Why, then, does myString contain only one char, a?

advocateofnone

1 Answer


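A std::string is a sequence of chars, and a char is typically 8 bits wide, so it cannot hold a 16-bit UTF-16 code unit directly. marshal_as<const char*> therefore converts the string to a narrow encoding (the ANSI code page) rather than copying the 16-bit values. An ASCII character such as a converts to a single char, but a character that the target encoding cannot represent, as U+0105 apparently is on your system, is replaced with a substitute character, which is where the ? comes from.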

You need std::wstring, which contains a sequence of wchar_t, to represent a Unicode string.

Therefore, change your last two lines to:

//-------------------------------------vvvvvvv--------
auto constChars = ctx.marshal_as<const wchar_t*>(str);

//---vvvvvvv----------------------
std::wstring myString(constChars);
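
To see why an ASCII a survives narrowing to char while U+0105 shows up as ?, here is a minimal portable C++ sketch. It is not how marshal_as is implemented: narrow_to_ascii_lossy is a hypothetical helper that assumes a target encoding limited to 7-bit ASCII and uses ? as the substitute character, purely to illustrate a lossy narrowing conversion.

```cpp
#include <string>

// Hypothetical helper (not part of msclr): narrows UTF-16 code units to a
// std::string under the ASSUMPTION that the target encoding is 7-bit ASCII.
// Anything that does not fit is replaced with '?'.
std::string narrow_to_ascii_lossy(const std::u16string& utf16)
{
    std::string out;
    out.reserve(utf16.size());
    for (char16_t unit : utf16) {
        // Code units below 0x80 are ASCII: the upper byte is zero, so the
        // value fits in a single char and is emitted as exactly one char.
        // Every other code unit (including each half of a surrogate pair)
        // has no ASCII representation and becomes '?'.
        out.push_back(unit < 0x80 ? static_cast<char>(unit) : '?');
    }
    return out;
}
```

With this helper, narrow_to_ascii_lossy(u"a") yields the one-char string "a", while narrow_to_ascii_lossy(u"\u0105") yields "?". A real code page conversion can represent more characters than ASCII, but it falls back to a substitute in the same way for characters it cannot map.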
wohlstad