How to convert const WCHAR * to const char *

Question

CString output ;
const WCHAR* wc = L"Hellow World" ;
if( wc != NULL )
{   
     output.Append(wc);
}
printf( "output: %s\n",output.GetBuffer(0) );

You don't need GetBuffer. CString has a LPCTSTR operator which accesses the internal buffer. — MikMik, Sep 28 '12 at 10:22
what should be the output if `wc` is `привет мир`? do you care about code pages or this is just wide -> narrow conversion with all wide characters being ANSI characters? — Zdeslav Vojkovic, Sep 28 '12 at 10:55

Zdeslav Vojkovic · Accepted Answer · 2016-04-04T12:36:04.773

35

You can also try this:

#include <comdef.h>  // you will need this
const WCHAR* wc = L"Hello World" ;
_bstr_t b(wc);
const char* c = b;
printf("Output: %s\n", c);

_bstr_t implements the following conversion operators, which I find quite handy:

operator const wchar_t*( ) const throw( ); 
operator wchar_t*( ) const throw( ); 
operator const char*( ) const; 
operator char*( ) const;

EDIT: clarification with regard to the comments on this answer: the line const char* c = b; makes the _bstr_t instance create and manage a narrow-character copy of the string, which it releases when it is destroyed. The operator just returns a pointer to this copy. Therefore, there is no need to copy the string yourself, as long as the _bstr_t object outlives every use of the pointer. Besides, in the question, CString::GetBuffer returns LPTSTR (i.e. TCHAR*) and not LPCTSTR (i.e. const TCHAR*).
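
The lifetime rule above is not specific to _bstr_t: any object that owns the converted buffer must stay alive while the raw pointer is in use. Here is a portable sketch of the same rule, with std::string and std::wcsrtombs standing in for _bstr_t. Both are assumptions made for illustration, and the helper names narrow and safe_length are hypothetical, not part of this answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <string>

// Hypothetical stand-in for _bstr_t's conversion: returns an owning narrow
// copy, or an empty string if the current locale cannot encode the input.
std::string narrow(const wchar_t* wc) {
    assert(wc != nullptr);
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = wc;
    const std::size_t n = std::wcsrtombs(nullptr, &src, 0, &state); // length query
    if (n == static_cast<std::size_t>(-1)) return std::string();
    std::string out(n, '\0');
    state = std::mbstate_t();
    src = wc;
    std::wcsrtombs(&out[0], &src, n, &state); // converts into the owned buffer
    return out;
}

// Safe: 'owner' lives until the end of the function, so 'c' stays valid.
std::size_t safe_length(const wchar_t* wc) {
    const std::string owner = narrow(wc);
    const char* c = owner.c_str();
    return std::strlen(c);
}

// Unsafe, for comparison (left as a comment on purpose):
//   const char* c = narrow(wc).c_str(); // the temporary dies at the ';', c dangles
```

The commented-out line is the same mistake discussed in the comments below: a temporary owner is destroyed right after the assignment.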

Another option is to use conversion macros:

#include <atlconv.h>  // ATL conversion macros (USES_CONVERSION, W2A)
USES_CONVERSION;
const WCHAR* wc = L"Hello World" ;
const char* c = W2A(wc);

The problem with this approach is that the memory for the converted string is allocated on the stack, so the length of the string is limited. However, this family of conversion macros allows you to select the code page used for the conversion, which is often needed if the wide string contains non-ANSI characters.

edited Apr 04 '16 at 12:36

answered Sep 28 '12 at 10:10

I'm so tempted to +1 this. `_bstr_t` and `_variant_t` used to be my best friends back in the days when you really needed ATL to do a decent COM component in C++ – sehe Sep 28 '12 at 10:13
why would it copy it? your code shows just that you need to use it in `printf`. `_bstr_t` will take care of releasing the memory. If you need to keep a copy and send the string around, use the `_bstr_t` instance, not `const char*` - in this sense, `_bstr_t` is similar to `CString`. It takes care of copying the string data properly when multiple copies of the object are used (although it doesn't use *copy-on-write*). – Zdeslav Vojkovic Sep 28 '12 at 10:30
const WCHAR* wc = L"Hellow World" ; c = _bstr_t(wc);printf( "output: %s\n",c ); – jack Sep 28 '12 at 10:36
output is like this: ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ε■ – jack Sep 28 '12 at 10:58
is your _bstr_t object still alive at that moment? According to your code sample it is. – Zdeslav Vojkovic Sep 28 '12 at 10:59
I have tried exactly the same code as in my answer, and it works. – Zdeslav Vojkovic Sep 28 '12 at 11:01
Oops, sorry, I wrote too quickly! See the updated code sample. You were completely right: the original code created a temporary `_bstr_t` which was destroyed immediately after the assignment to `c`. I was typing faster than I was thinking, sorry for the confusion. – Zdeslav Vojkovic Sep 28 '12 at 11:03

score 13 · Answer 2 · answered Apr 04 '14 at 11:49

You can use sprintf for this purpose:

char output[256];  // must not be const: sprintf writes into this buffer
const WCHAR* wc = L"Hello World";
sprintf(output, "%ws", wc);  // %ws is a Microsoft extension; standard C uses %ls

l0pan

I don't think you can declare `output` as `const` – CinCout Jun 15 '16 at 04:54

score 6 · Answer 3 · answered Jun 05 '17 at 18:48

My code for Linux

// Debian GNU/Linux 8 "Jessie" (amd64)

#include <locale.h>
#include <stdlib.h>
#include <stdio.h>

// Use wcstombs(3) to convert a wide string (wchar_t *) to the locale's
// multibyte encoding (char *); with a UTF-8 locale the result is UTF-8
// http://man7.org/linux/man-pages/man3/wcstombs.3.html

int f(const wchar_t *wcs) {
        if (!setlocale(LC_ALL,"ru_RU.UTF-8")) return -1; // locale not installed
        printf("Sizeof wchar_t: %zu\n", sizeof(wchar_t));
        // on Windows, UTF-16 is internal Unicode encoding (UCS-2 before Windows 2000)
        // on Linux, UCS4 is internal Unicode encoding
        for (int i = 0; wcs[i] != 0; i++) printf("%2d %08X\n",i,(unsigned)wcs[i]);
        char s[256];
        // reserve one byte: wcstombs writes no terminator if the buffer fills up
        size_t len = wcstombs(s,wcs,sizeof(s)-1);
        // (size_t)-1 means a character cannot be encoded in this locale
        if (len == (size_t)-1 || len == 0) return -1;
        s[len] = '\0';
        printf("mbs: %s\n",s);
        for (size_t i = 0; i < len; i++)
                printf("%2zu %02X\n",i,(unsigned char)s[i]);
        printf("Size of mbs, in bytes: %zu\n",len);
        return 0;
}

int main() {
        f(L"Привет"); // 6 symbols
        return 0;
}

How to build

#!/bin/sh
NAME=`basename $0 .sh`
CC=/usr/bin/g++-4.9
INCS="-I."
LIBS="-L."
$CC ${NAME}.c -o _${NAME} $INCS $LIBS

Output

$ ./_test 
Sizeof wchar_t: 4
 0 0000041F
 1 00000440
 2 00000438
 3 00000432
 4 00000435
 5 00000442
mbs: Привет
 0 D0
 1 9F
 2 D1
 3 80
 4 D0
 5 B8
 6 D0
 7 B2
 8 D0
 9 B5
10 D1
11 82
Size of mbs, in bytes: 12
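
The 12 bytes reported above follow from UTF-8's encoding ranges: code points below U+0080 take one byte, those below U+0800 take two, those below U+10000 take three, and the rest take four. All six Cyrillic letters in the output fall in the two-byte range. Here is a small sketch of that arithmetic; the function names are made up for illustration and are not part of the answer.

```cpp
#include <cassert>
#include <cstddef>

// Number of bytes UTF-8 needs to encode one code point.
std::size_t utf8_bytes_for(char32_t cp) {
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

// Total UTF-8 size of a null-terminated UTF-32 string, terminator excluded.
std::size_t utf8_bytes_in(const char32_t* s) {
    assert(s != nullptr);
    std::size_t total = 0;
    for (; *s != 0; ++s) total += utf8_bytes_for(*s);
    return total;
}
```

For the six code points listed in the output (041F, 0440, 0438, 0432, 0435, 0442), utf8_bytes_in gives 6 x 2 = 12, which matches "Size of mbs, in bytes: 12".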

Luchian Grigore · Answer 4 · 2012-09-28T10:46:22.613

1

You could do this, or you could do something cleaner:

std::wcout << L"output: " << output.GetString() << std::endl;

edited Sep 28 '12 at 10:46

answered Sep 28 '12 at 10:00

Why to use `GetBuffer()`? Here is `GetString()` official C-string getter! – Rost Sep 28 '12 at 10:43
@Rost copy-paste :D No need to yell :D – Luchian Grigore Sep 28 '12 at 10:46
Copy-paste is evil!!! Real developers always retype char by char! Don't you know?!? :-D – Rost Sep 28 '12 at 10:50

score 1 · Answer 5 · answered Sep 28 '12 at 10:13

It's quite easy, because CString is just a typedef for CStringT, and you also have access to CStringA and CStringW (you should read the documentation about the differences).

CStringW myString = L"Hello World";
CString myConvertedString = myString;

Mark Ingram

Yes, I realise that, but it was written that way to be closer to his example code. – Mark Ingram Sep 28 '12 at 12:32
What does this conversion do with wide chars that don't have a matching narrow char? – M.M Apr 04 '14 at 11:51

Benjamin Buch · Answer 6 · 2021-11-30T16:16:51.643

You can use the std::wcsrtombs function.

Here is a C++17 overload set for conversion:

#include <iostream> // not required for the conversion function

// required for conversion
#include <cwchar>    // std::wcsrtombs, std::mbstate_t
#include <stdexcept>
#include <string>
#include <string_view> // for std::wstring_view overload

std::string to_string(wchar_t const* wcstr){
    auto s = std::mbstate_t();
    auto const target_char_count = std::wcsrtombs(nullptr, &wcstr, 0, &s);
    if(target_char_count == static_cast<std::size_t>(-1)){
        throw std::logic_error("Illegal byte sequence");
    }

    auto str = std::string(target_char_count, '\0');
    // size() + 1: wcsrtombs also writes the null terminator, which std::string
    // stores after its last character but does not count in size()
    std::wcsrtombs(str.data(), &wcstr, str.size() + 1, &s);
    return str;
}

std::string to_string(std::wstring const& wstr){
    return to_string(wstr.c_str());
}

std::string to_string(std::wstring_view const& view){
    // wstring because wstring_view is not required to be null-terminated!
    return to_string(std::wstring(view));
}

int main(){
    using namespace std::literals;

    std::cout
        << to_string(L"wchar_t const*") << "\n"
        << to_string(L"std::wstring"s) << "\n"
        << to_string(L"std::wstring_view"sv) << "\n";
}

If you are still on a pre-C++17 compiler, you should urgently update it! ;-)

If this is really not possible, here is a C++11 version:

#include <iostream> // not required for the conversion function

// required for conversion
#include <cwchar>
#include <stdexcept>
#include <string>

std::string to_string(wchar_t const* wcstr){
    auto s = std::mbstate_t();
    auto const target_char_count = std::wcsrtombs(nullptr, &wcstr, 0, &s);
    if(target_char_count == static_cast<std::size_t>(-1)){
        throw std::logic_error("Illegal byte sequence");
    }

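    auto str = std::string(target_char_count, '\0');
    // size() + 1: wcsrtombs also writes the null terminator, which std::string
    // stores after its last character but does not count in size()
    // &str[0] gives a writable pointer in C++11, where data() is const-only
    std::wcsrtombs(&str[0], &wcstr, str.size() + 1, &s);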
    return str;
}

std::string to_string(std::wstring const& wstr){
    return to_string(wstr.c_str());
}

int main(){
    std::cout
        << to_string(L"wchar_t const*") << "\n"
        << to_string(std::wstring(L"std::wstring")) << "\n";
}
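
One thing to keep in mind with both versions: std::wcsrtombs converts according to the current C locale (LC_CTYPE). A program that never calls std::setlocale runs in the "C" locale, where non-ASCII characters will typically make to_string throw "Illegal byte sequence". Here is a minimal sketch of a guard that switches the locale for a limited scope and restores it afterwards. The class name ScopedCtypeLocale is hypothetical, and any locale name you pass (for example "en_US.UTF-8") is an assumption about what is installed on the system.

```cpp
#include <cassert>
#include <clocale>
#include <string>

// Hypothetical RAII guard (not part of the answer): switches LC_CTYPE on
// construction and restores the previous setting on destruction.
class ScopedCtypeLocale {
public:
    explicit ScopedCtypeLocale(const char* name) {
        assert(name != nullptr);
        const char* current = std::setlocale(LC_CTYPE, nullptr); // query only
        saved_ = current ? current : "C";
        ok_ = std::setlocale(LC_CTYPE, name) != nullptr;
    }
    ~ScopedCtypeLocale() { std::setlocale(LC_CTYPE, saved_.c_str()); }

    ScopedCtypeLocale(const ScopedCtypeLocale&) = delete;
    ScopedCtypeLocale& operator=(const ScopedCtypeLocale&) = delete;

    bool ok() const { return ok_; } // false if the requested locale is unavailable

private:
    std::string saved_;
    bool ok_ = false;
};
```

Usage would look like: ScopedCtypeLocale guard(""); if (guard.ok()) { auto s = to_string(L"..."); } where "" selects the locale from the environment. Note that std::setlocale changes process-wide state, so this sketch is not suitable for code that converts strings from several threads at once.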

score 0 · Answer 7 · answered Feb 03 '21 at 07:12

You can use sprintf for this purpose, as @l0pan mentions (but I used %ls instead of %ws):

char output[256];
const WCHAR* wc = L"Hello World" ;
sprintf(output, "%ws", wc ); // did not work for me (Windows, C++ Builder)
sprintf(output, "%ls", wc ); // works
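
Since sprintf does not know the size of output, a long input would overflow the 256-byte buffer. Here is a hedged sketch of a bounded variant using std::snprintf with the standard %ls specifier. The helper name narrow_into is hypothetical, and the conversion still depends on the current C locale.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats wc into out (at most out_size bytes, always
// null-terminated). Returns true only if the whole string fit.
bool narrow_into(char* out, std::size_t out_size, const wchar_t* wc) {
    assert(out != nullptr && wc != nullptr);
    if (out_size == 0) return false;
    const int n = std::snprintf(out, out_size, "%ls", wc);
    // n < 0: conversion error; n >= out_size: the output was truncated
    return n >= 0 && static_cast<std::size_t>(n) < out_size;
}
```

With char output[256]; the call narrow_into(output, sizeof output, wc) replaces the sprintf line and reports truncation or encoding failure through its return value.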
