36

I want to convert a normal string to a wstring. For this, I am trying to use the Windows API function MultiByteToWideChar. But it does not work for me.

Here is what I have done:

string x = "This is c++ not java";
wstring Wstring;
MultiByteToWideChar( CP_UTF8 , 0 , x.c_str() , x.size() , &Wstring , 0 ); 

The last line produces the compiler error:

'MultiByteToWideChar' : cannot convert parameter 5 from 'std::wstring *' to 'LPWSTR'

How do I fix this error?

Also, what should be the value of the argument cchWideChar? Is 0 okay?

MultiplyByZer0
Suhail Gupta

5 Answers

60

You must call MultiByteToWideChar twice:

  1. The first call to MultiByteToWideChar is used to find the buffer size you need for the wide string. Look at Microsoft's documentation; it states:

    If the function succeeds and cchWideChar is 0, the return value is the required size, in characters, for the buffer indicated by lpWideCharStr.

    Thus, to make MultiByteToWideChar give you the required size, pass 0 as the value of the last parameter, cchWideChar. You should also pass NULL as the one before it, lpWideCharStr.

  2. Obtain a non-const buffer large enough to accommodate the wide string, using the buffer size from the previous step. Pass this buffer to another call to MultiByteToWideChar. And this time, the last argument should be the actual size of the buffer, not 0.

A sketchy example:

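// First call: no output buffer, size 0, so only the required size is returned.
// Because cbMultiByte is -1, that size includes the terminating null.
int wchars_num = MultiByteToWideChar( CP_UTF8 , 0 , x.c_str() , -1, NULL , 0 );
wchar_t* wstr = new wchar_t[wchars_num];
// Second call: convert into the buffer, passing its real size.
MultiByteToWideChar( CP_UTF8 , 0 , x.c_str() , -1, wstr , wchars_num );
// do whatever with wstr
delete[] wstr;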

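Also, note the use of -1 as the cbMultiByte argument. It tells the function that the input is null-terminated, so the terminating null is converted as well: the size returned by the first call includes it, and the resulting string is null-terminated, which saves you from adding the terminator yourself.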
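The size returned by the first call is counted in wchar_t elements, which on Windows are UTF-16 code units, not in user-perceived characters. The sketch below is a portable illustration of that count only; it does not call the Windows API. Utf16UnitsForUtf8 is a hypothetical helper name, and the code assumes the input is already valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts the UTF-16 code units needed to hold a UTF-8 string.
// Assumption: the input is valid UTF-8; no validation is performed.
std::size_t Utf16UnitsForUtf8(const std::string& utf8)
{
    std::size_t units = 0;
    for (unsigned char byte : utf8) {
        // Continuation bytes (10xxxxxx) belong to a code point already counted.
        if ((byte & 0xC0) == 0x80) {
            continue;
        }
        // A lead byte of a 4-byte sequence (0xF0 and above) starts a code point
        // outside the Basic Multilingual Plane, which needs a surrogate pair.
        units += (byte >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

A code point outside the Basic Multilingual Plane takes two units, and the size reported by the API already accounts for that, so no extra allowance is needed. When cbMultiByte is -1, the reported size additionally includes the terminating null.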

MultiplyByZer0
Eran
  • +1 for accentuating the need for calling the MultiByteToWideChar twice, which is essential for the charset conversion functions. – Stephan Jul 14 '11 at 12:32
  • @eran what is the difference between `wchar_t*` and `LPTSTR`? – Suhail Gupta Jul 14 '11 at 12:47
  • @Suhail Gupta, if you're compiling with Unicode, then it's exactly the same. In multi-byte build, LPTSTR would expand to a regular `char*`. Using those macros allows you to create both Unicode and non-Unicode builds. I can't think of a reason to do that these days, though, and since Unicode is now the default in VS, use either one of them. – Eran Jul 14 '11 at 12:53
  • OWCH! There is no such thing as `free[]`, and even if there was, I would never condone such code. Use a `std::vector` appropriately resized. – Puppy Jul 14 '11 at 13:39
  • @DeadMG Owch indeed... that's why I stated it as sketchy. Was in a hurry. Fixed answer, thanks. – Eran Jul 14 '11 at 13:43
  • @eran: The owch part of that wasn't the `free[]`, it was the "PLEASE LEAK RESOURCES AND OVERFLOW MY BUFFERS AND CORRUPT MY HEAP" of using `new` and `delete` directly. Use `std::vector`. – Puppy Jul 14 '11 at 14:15
  • @bbqchickenrobot I tried simplified Chinese (GB18030) and it works as expected. Did you change CP_UTF8 to Chinese codepage? – Rick Jun 06 '18 at 04:30
  • @Rick it might work now - but that was a while ago I posted that. Glad it works! ;) – bbqchickenrobot Jul 24 '18 at 18:24
  • `return value is the required size, in characters`. If the function indicates that my result will have 5 characters, would I then need to allocate 11 16-bit chars to be able to accept 5 UTF-16 chars? – Pavel P Apr 01 '20 at 07:26
12

A few common conversions:

#define WIN32_LEAN_AND_MEAN

#include <Windows.h>

#include <string>

std::string ConvertWideToANSI(const std::wstring& wstr)
{
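    // Measure and convert with the same explicit length; passing -1 to the second
    // call would also write a terminating null that the buffer has no room for.
    int count = WideCharToMultiByte(CP_ACP, 0, wstr.c_str(), static_cast<int>(wstr.length()), NULL, 0, NULL, NULL);
    std::string str(count, 0);
    WideCharToMultiByte(CP_ACP, 0, wstr.c_str(), static_cast<int>(wstr.length()), &str[0], count, NULL, NULL);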
    return str;
}

std::wstring ConvertAnsiToWide(const std::string& str)
{
    int count = MultiByteToWideChar(CP_ACP, 0, str.c_str(), str.length(), NULL, 0);
    std::wstring wstr(count, 0);
    MultiByteToWideChar(CP_ACP, 0, str.c_str(), str.length(), &wstr[0], count);
    return wstr;
}

std::string ConvertWideToUtf8(const std::wstring& wstr)
{
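    // Measure and convert with the same explicit length; passing -1 to the second
    // call would also write a terminating null that the buffer has no room for.
    int count = WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), static_cast<int>(wstr.length()), NULL, 0, NULL, NULL);
    std::string str(count, 0);
    WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), static_cast<int>(wstr.length()), &str[0], count, NULL, NULL);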
    return str;
}

std::wstring ConvertUtf8ToWide(const std::string& str)
{
    int count = MultiByteToWideChar(CP_UTF8, 0, str.c_str(), str.length(), NULL, 0);
    std::wstring wstr(count, 0);
    MultiByteToWideChar(CP_UTF8, 0, str.c_str(), str.length(), &wstr[0], count);
    return wstr;
}
stax76
2

You can try the solution below. I tested it: it handles special characters (for example º ä ç á) and works on Windows 2000 with SP4 and later, Windows XP, and Windows 7, 8, 8.1 and 10. Using std::wstring instead of new wchar_t[] / delete[] reduces the risk of resource leaks, buffer overflows and heap corruption.

dwFlags is set to MB_ERR_INVALID_CHARS so that the function fails when it meets an invalid input character. Without this flag, Windows 2000 with SP4 and later and Windows XP silently drop illegal code points, while Windows Vista and later replace them with U+FFFD.

std::wstring ConvertStringToWstring(const std::string &str)
{
    if (str.empty())
    {
        return std::wstring();
    }
    int num_chars = MultiByteToWideChar(CP_ACP, MB_ERR_INVALID_CHARS, str.c_str(), str.length(), NULL, 0);
    std::wstring wstrTo;
    if (num_chars)
    {
        wstrTo.resize(num_chars);
        if (MultiByteToWideChar(CP_ACP, MB_ERR_INVALID_CHARS, str.c_str(), str.length(), &wstrTo[0], num_chars))
        {
            return wstrTo;
        }
    }
    return std::wstring();
}
Nix
1

Second question about this, this morning!

WideCharToMultiByte() and MultiByteToWideChar() are a pain to use. Each conversion requires two calls to the routines and you have to look after allocating/freeing memory and making sure the strings are correctly terminated. You need a wrapper!

I have a convenient C++ wrapper on my blog, here, which you are welcome to use.

Here's the other question this morning
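A wrapper of the kind described above might be sketched as follows. This is not the wrapper from the blog post mentioned above. TwoCallConvert and AsciiToWide are hypothetical names, and the converter is passed in as a parameter so that the sketch compiles without Windows.h. AsciiToWide is an ASCII-only stand-in that merely follows the same return-value convention as MultiByteToWideChar.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: runs the "measure, allocate, convert" pattern with any
// converter that follows the MultiByteToWideChar convention:
//   int convert(const char* src, int srcLen, wchar_t* dst, int dstLen)
// returning the required length when dstLen is 0, and 0 on failure.
template <typename Converter>
std::wstring TwoCallConvert(const std::string& src, Converter convert)
{
    if (src.empty()) {
        return std::wstring();
    }
    const int srcLen = static_cast<int>(src.size());

    // First call: no output buffer, size 0, so only the required length comes back.
    const int needed = convert(src.data(), srcLen, nullptr, 0);
    if (needed <= 0) {
        throw std::runtime_error("size query failed");
    }

    // Second call: a writable buffer of exactly that length, owned by the wstring.
    std::wstring dst(static_cast<std::size_t>(needed), L'\0');
    const int written = convert(src.data(), srcLen, &dst[0], needed);
    if (written != needed) {
        throw std::runtime_error("conversion failed");
    }
    return dst;
}

// Stand-in converter (an assumption, ASCII only) so the sketch compiles anywhere.
int AsciiToWide(const char* src, int srcLen, wchar_t* dst, int dstLen)
{
    assert(src != nullptr);
    if (dstLen == 0) {
        return srcLen;                       // size query
    }
    if (dstLen < srcLen) {
        return 0;                            // buffer too small
    }
    for (int i = 0; i < srcLen; ++i) {
        const unsigned char byte = static_cast<unsigned char>(src[i]);
        if (byte > 0x7F) {
            return 0;                        // not ASCII: report failure
        }
        dst[i] = static_cast<wchar_t>(byte);
    }
    return srcLen;
}
```

On Windows, the stand-in would be replaced by a lambda that forwards to the real API, for example `[](const char* s, int n, wchar_t* d, int m) { return MultiByteToWideChar(CP_UTF8, 0, s, n, d, m); }`. Because the explicit length is passed instead of -1, no terminating null is written into the std::wstring.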

Community
ravenspoint
-1

The function cannot take a pointer to a C++ string. It expects a pointer to a buffer of wide characters of sufficient size; you must allocate this buffer yourself.

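string x = "This is c++ not java";
wstring Wstring;
// A UTF-8 string never needs more UTF-16 code units than it has bytes.
Wstring.resize(x.size());
// Pass the real buffer size as the last argument; with 0 nothing is written.
int c = MultiByteToWideChar( CP_UTF8 , 0 , x.c_str() , (int)x.size() , &Wstring[0], (int)Wstring.size() );
// Trim to the number of wide characters actually written.
Wstring.resize(c);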
Puppy
  • `MultiByteToWideChar` expects a parameter of type `wchar_t*`. Wstring is of type `std::wstring`, so it cannot be passed to `MultiByteToWideChar` (not even a pointer to it). But the good news is that `std::wstring` internally stores its data as `wchar_t*` and provides two functions to get access to this internal data: `data()` (which is used here) and `c_str()`. – Stephan Jul 14 '11 at 12:38
  • @DeadMG, note that wstring.data() returns a const wchar_t*, which according to cplusplus.com should not be modified directly (you probably know better than me what would be the effect of doing so). OTOH, the last argument of MBTWC being 0, nothing will be placed in that buffer anyway... – Eran Jul 14 '11 at 12:44
  • @eran: Oops, you're totally right about the return value being `const`. – Puppy Jul 14 '11 at 13:37
  • It cannot work in that way, wstring uses 32bit chars, while win32 uses 16bit unicode chars... – gabry Jan 12 '17 at 13:25