I need to convert file names from System.String to std::string. The file names can be Japanese or English.

English file names convert without any issue.

Japanese file names, however, are not converted correctly to std::string on an English Windows 10 system.

I am using WideCharToMultiByte() with code page 932.

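#include <string>
#include <windows.h>
#include <vcclr.h>   // PtrToStringChars

std::string marshal_as(System::String^ str)
{
    std::string convertedstring;
    if (str == nullptr || str->Length == 0)
        return convertedstring;

    // Pin the managed string so the GC cannot move it during the native calls.
    cli::pin_ptr<const wchar_t> _pinned_ptr = PtrToStringChars(str);

    // First call: ask how many bytes the code page 932 encoding needs.
    // Passing str->Length (not -1) leaves the terminating null out of the count.
    int _size = WideCharToMultiByte(932, 0, _pinned_ptr, str->Length, nullptr, 0, nullptr, nullptr);
    if (_size > 0)
    {
        convertedstring.resize(_size);
        // Second call: use the same source length as the first call.
        // Passing -1 here would need one more byte than the buffer holds.
        WideCharToMultiByte(932, 0, _pinned_ptr, str->Length, &convertedstring[0], _size, nullptr, nullptr);
    }

    return convertedstring;
}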

Here str is "C:\\files\\ブ種別.pdf"

convertedstring is "C:\\files\\ƒuŽí•Ê.pdf"

Can anyone help me to resolve this?

Remy Lebeau
Saya
  • Please post your code, see [mre]. – Paul Sanders Jun 16 '20 at 15:25
  • I have posted the code that I used. – Saya Jun 16 '20 at 15:43
  • Do you have Windows or IBM? https://en.wikipedia.org/wiki/Code_page_932_(Microsoft_Windows) – jdweng Jun 16 '20 at 16:02
  • Please be more descriptive of what "not converting" means. Are they removed from the output? Are they copied unchanged? Are they converted to U+FFFD (REPLACEMENT CHARACTER)? Does WideCharToMultiByte report zero characters converted? – Raymond Chen Jun 16 '20 at 16:03
  • @jdweng I am using Windows – Saya Jun 16 '20 at 16:27
  • @RaymondChen Sorry. I have now added the Japanese string and the converted string. – Saya Jun 16 '20 at 16:28
  • Do you terminate the string with '\0'? What error are you getting? – jdweng Jun 16 '20 at 16:45
  • I am not getting any specific error. Instead of the Japanese string I am getting junk characters. – Saya Jun 16 '20 at 16:47
  • How are you using the returned string? Is the code that uses it [aware](https://stackoverflow.com/a/1010785/11683) that it's in codepage 932? – GSerg Jun 16 '20 at 19:03
  • The string is converted correctly. The character ブ is represented by the bytes 0x83 0x75 in code page 932. The problem is that you are printing it in code page 1252, not 932. In code page 1252, the byte 0x83 is the character "ƒ" and the byte 0x75 is the character "u". (Also, I removed the C# tag from this question seeing as there's no C# in it.) – Raymond Chen Jun 16 '20 at 23:46
  • If you convert a string to code page 932, then the only place it makes sense is in code page 932. – Raymond Chen Jun 17 '20 at 02:40
  • Is it not possible to do this on an English OS? – Saya Jun 17 '20 at 02:43
  • @GSerg The returned string will be passed to a C++ library. – Saya Jun 17 '20 at 03:09
  • @Saya You are having the **exact same issue** that [you asked about yesterday with different code](https://stackoverflow.com/questions/62413509/). You are *converting* your Japanese strings to `std::string` correctly, but whatever is *using* those converted strings afterwards is not *interpreting* the strings correctly, so you are not getting the results you want. Your conversions themselves are fine. – Remy Lebeau Jun 17 '20 at 21:05
  • Thank you for your help. Now I understand what is happening. – Saya Jun 18 '20 at 04:26
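
The point made in the comments (the code page 932 bytes are correct, but they are being read as code page 1252) can be reproduced without any Windows API. The following is a minimal, portable sketch: `view_as_cp1252` and `append_utf8` are hypothetical helper names, not part of the question's code, and the five byte values that Windows-1252 leaves unassigned are simply passed through unchanged, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append one code point as UTF-8. Windows-1252 only maps into the BMP,
// so at most three bytes are needed here.
static void append_utf8(std::string& out, std::uint32_t cp)
{
    assert(cp <= 0xFFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: read raw bytes as if they were Windows-1252 text
// and return the resulting characters encoded as UTF-8.
std::string view_as_cp1252(const std::string& bytes)
{
    // Code points for bytes 0x80..0x9F. Everything else in Windows-1252
    // equals the byte value. Unassigned bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D)
    // are passed through unchanged in this sketch.
    static const std::uint16_t high[32] = {
        0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
        0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
    };

    std::string out;
    for (unsigned char b : bytes) {
        std::uint32_t cp = (b >= 0x80 && b <= 0x9F) ? high[b - 0x80] : b;
        append_utf8(out, cp);
    }
    return out;
}
```

Calling `view_as_cp1252("\x83\x75")` with the two code page 932 bytes of ブ yields the UTF-8 text for "ƒu", which matches the start of the junk string shown in the question.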

1 Answer

My issue is fixed. I enabled the "Beta: Use Unicode UTF-8 for worldwide language support" option in the Region settings in Control Panel.
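
For reference, here is a rough, portable sketch of what a UTF-8 encoding of the same UTF-16 text looks like. `utf16_to_utf8` is a hypothetical name used only for illustration. It is not what Windows runs internally, and whether the consuming C++ library accepts UTF-8 is an assumption. On Windows, the comparable conversion would be requested by passing `CP_UTF8` instead of 932 to `WideCharToMultiByte`.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode UTF-16 text as UTF-8.
// Unpaired surrogates are replaced with U+FFFD (a choice made for this sketch).
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD; // unpaired surrogate
        }

        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With this sketch, ブ (U+30D6) becomes the three bytes 0xE3 0x83 0x96, while ASCII characters such as the path separators stay as single bytes.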

Saya