Issue in Converting wchar_t* to char*

Question

I need to read the current directory on Windows 7, and the directory name is in a different locale than the one currently in use. So I thought of using GetCurrentDirectoryW(), since it is Unicode-aware and works with wchar_t*. However, I need to pass the result to an existing API, so I need to convert it to char*. For this purpose I used the wcstombs() function, but the conversion does not work properly. Below is the code that I used:

    wchar_t w_currentDir[MAX_PATH + 100];
    char currentDir[MAX_PATH + 100];
    GetCurrentDirectoryW(MAX_PATH, w_currentDir);
    wcstombs (currentDir, w_currentDir, MAX_PATH + 100);
    printf("%s \n", currentDir);

The current directory that I'm in is C:\特斯塔敌人. When the conversion is done, only the 'C:\' part of the full path is converted to char* properly; the remaining characters come out as junk values. What is the problem with this approach? How can I rectify it?

Thank you!

score 1 · Answer 1 · answered Nov 19 '12 at 04:40

The problem is that there is no appropriate conversion possible. A wide character may not have a regular char equivalent (which is why wchar_t exists in the first place). So you should be using wprintf:

    GetCurrentDirectoryW(MAX_PATH, w_currentDir);
    /* wprintf takes a wide format string; %ls prints a wchar_t* string */
    wprintf(L"%ls \n", w_currentDir);
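
As a side note on the wcstombs() call in the question: the function converts according to the current C locale, which is the "C" locale at program start unless setlocale() has been called, and it returns (size_t)-1 when it meets a wide character that the locale's encoding cannot represent. In that case the buffer contents after the last converted character are not meaningful, which may be what shows up as junk after 'C:\'. Here is a minimal sketch of checking for that, assuming standard C only; wide_to_multibyte is a hypothetical helper name, not part of any library:

```c
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not a library function): converts src into dst
   using the current C locale.
   Returns 0 on success, or -1 if a character cannot be represented in
   the locale's encoding or dst is too small to hold the result plus
   its terminating '\0'. */
static int wide_to_multibyte(char *dst, size_t dst_size, const wchar_t *src)
{
    size_t n = wcstombs(dst, src, dst_size);

    if (n == (size_t)-1)
        return -1;  /* unrepresentable wide character */
    if (n == dst_size)
        return -1;  /* buffer filled completely, no terminator written */
    return 0;
}
```

Calling setlocale(LC_ALL, "") before the conversion selects the user's default locale instead of "C". Whether the Chinese characters can then be converted depends on whether that locale's encoding contains them, so the return value still has to be checked.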

Sidharth Mudgal

Thanks. Does that mean that we can never use the wcstombs function? Or is this specific to non-locale characters? – Izza Nov 19 '12 at 04:42
If your `wchar` string contains non-ASCII characters, conversion is not possible. – Sidharth Mudgal Nov 19 '12 at 04:43
OK, that means the current approach is not working because the path has some Chinese characters, right? The issue I'm facing is that I have no control over the API, and it requires the current path to be char*. If I use GetCurrentDirectory anyway, the non-locale characters will get mangled. Is there any possible workaround? – Izza Nov 19 '12 at 04:52
That's not actually why wchar_t exists. wchar_t isn't required to be able to represent anything that can't be represented in char in the same locale. wchar_t isn't even specified to use the same encoding in different locales. wchar_t actually was intended to be a 1:1 character to code unit representation, so that text processing algorithms could be implemented simply instead of forcing programmers to work directly on multi-byte representations. [Here's](http://stackoverflow.com/questions/11107608/whats-wrong-with-c-wchar-t-and-wstrings-what-are-some-alternatives-to-wide) a question on the topic. – bames53 Nov 19 '12 at 04:53
@bames53 Just curious, so why does `wchar_t` use 2 bytes while `char` uses 1 (usually)? – Sidharth Mudgal Nov 19 '12 at 04:56
Because `char` can use multi-byte encodings where multiple `char`s represent a character. A character that takes multiple `char` to represent must have a representation as a single `wchar_t` as well. – bames53 Nov 19 '12 at 05:00
OK, thanks a lot for the information. So this means there is no way this conversion can be done, right? – Izza Nov 19 '12 at 05:03
Apparently it's an [issue with `printf`](http://pubs.opengroup.org/onlinepubs/7908799/xcu/printf.html). So `wcstombs` works fine but you can't print it... – Sidharth Mudgal Nov 19 '12 at 05:12
No, if I use a breakpoint after the conversion has happened, I still have the junk values for the characters. – Izza Nov 19 '12 at 05:14
Mmm, I didn't get your question exactly. But before the conversion, w_currentDir shows the path correctly, including the non-locale (Chinese) characters. After the conversion, currentDir, which is the multibyte buffer, holds junk values after "C:\", i.e. "C:\". – Izza Nov 19 '12 at 05:40
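
To illustrate the point made in the comments that one character may need several `char`s in a multi-byte encoding, here is a small sketch that encodes a single Unicode code point as UTF-8. UTF-8 is chosen only as an example of a multi-byte encoding; it is an assumption, not necessarily what the char*-based API in the question expects, and utf8_encode is a hypothetical helper name:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes one Unicode code point as UTF-8.
   Writes up to 4 bytes into out and returns how many were written,
   or 0 if cp is a surrogate or lies beyond U+10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                       /* ASCII: one byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* two bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* three bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

With this sketch, 'C' (U+0043) takes one byte, while 特 (U+7279), the first character of the directory name in the question, takes three: 0xE7 0x89 0xB9. A char buffer can therefore hold such a path only if the target encoding has a byte sequence for every character in it.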
