Standard C Convert ANSI to UNICODE

Question

I'm learning C programming, and in the process trying to create a Windows GUI app.

I've noticed in the MSDN documentation that new applications should use UNICODE encoding. With that in mind, I added the -DUNICODE flag to the compiler.

When calling the function CreateWindowEx() to create the window, the title gets all weird when I use a char*, and the compiler gives me an error saying the function is expecting unsigned short*.

How can I convert an ANSI string to unsigned short*?

You could just call `CreateWindowExA`, which is the ASCII version of that function, and let Windows itself handle any conversions that need to happen. — David Grayson, Apr 20 '23 at 17:37
There's no such thing as an "ANSI string". Did you mean "ASCII string"? — Barmar, Apr 20 '23 at 17:46
@Barmar no they didn't. ASCII is a character set. Windows has the concept of the [ANSI code page](https://learn.microsoft.com/en-us/windows/win32/intl/code-page-identifiers). The reason they recommend using `wchar_t` instead of `char` is that the old "A" or "ansi" functions convert the strings internally and may have odd behavior — Mgetz, Apr 20 '23 at 17:53
[related](https://stackoverflow.com/q/26567820/332733) information about the different types — Mgetz, Apr 20 '23 at 17:55
@DavidGrayson There is no *"ASCII"* version of the Win32 API. You must be thinking of a different library. — IInspectable, Apr 20 '23 at 18:25
@Barmar, Re "*There's no such thing as an "ANSI string".*", Yes there is. It means a string encoded according to the process's ANSI Code Page aka Active Code Page. This is typically 1252 on US machines. — ikegami, Apr 20 '23 at 18:33
@ikegami As Mgetz already corrected me. I didn't know that Windows-specific concept. — Barmar, Apr 20 '23 at 18:34
Huh @IInspectable ? I'm not trying to make some complicated statement, just saying "CreateWindowExA" is a version of the "CreateWindowEx" function that takes ASCII input, or "ASCII version" for short. — David Grayson, Apr 20 '23 at 20:14
@DavidGrayson that is incorrect. The `A` functions take **ANSI** strings, not **ASCII** strings. Two very different things. ASCII is a *subset* of ANSI (or more accurately, most ANSI encodings include the ASCII characters), but ANSI encodings have other characters that are not found in ASCII at all. Different machines can be configured to run with different ANSI codepages as their native encodings. — Remy Lebeau, Apr 20 '23 at 20:40

Remy Lebeau · Accepted Answer · 2023-04-20T20:46:00.850

3

CreateWindowEx() is a preprocessor macro. When UNICODE is defined, it is an alias for CreateWindowExW(), which takes wchar_t*¹ strings. When UNICODE is not defined, it is an alias for CreateWindowExA() instead, which takes char* strings.

    // in winuser.h...
    #ifdef UNICODE
    #define CreateWindowEx  CreateWindowExW
    #else
    #define CreateWindowEx  CreateWindowExA
    #endif // !UNICODE

¹ In Visual Studio, wchar_t is a typedef for unsigned short when compiling C, or when compiling C++ with the /Zc:wchar_t compiler option turned off.
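
The same macro switch can be tried outside of Windows with a small stand-in. This is only an illustrative sketch: `title_length_a`, `title_length_w`, `title_length`, and `MY_TEXT` are hypothetical names made up for this example and are not part of the Win32 API. It mimics the A/W pattern by selecting the `wchar_t` function and an L-prefixed literal when UNICODE is defined, and the `char` function with a plain literal otherwise.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for an "A"/"W" function pair. */
size_t title_length_a(const char *s)    { return strlen(s); }
size_t title_length_w(const wchar_t *s) { return wcslen(s); }

#ifdef UNICODE
/* Wide build: pick the wchar_t function and paste an L prefix onto literals. */
#define title_length title_length_w
#define MY_TEXT(s)   L##s
#else
/* Narrow build: pick the char function and leave literals untouched. */
#define title_length title_length_a
#define MY_TEXT(s)   s
#endif
```

With this in place, `title_length(MY_TEXT("My Window"))` compiles with or without -DUNICODE, because the literal's type always matches the function the macro selects.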

So, you can either:

- Turn off UNICODE and call CreateWindowEx(), or call CreateWindowExA() directly, either way passing in your char* strings.
- If you are passing string literals into CreateWindowExW() (whether called directly, or when calling CreateWindowEx() with UNICODE enabled), prepend the literals with the special L prefix to make them wchar_t* strings, e.g. L"...". When passing string literals into the CreateWindowEx() macro, you should wrap them in the TEXT() macro instead, e.g. TEXT("..."), which will prepend them with the L prefix when UNICODE is defined.
- Convert your char* strings to wchar_t* strings at runtime with the MultiByteToWideChar() (or equivalent) function.
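
For the runtime route, a rough sketch of a conversion helper might look like the following. `ansi_to_wide` is a hypothetical name, and CP_ACP (the system's ANSI code page) is assumed to be the encoding of the input. On non-Windows builds the sketch falls back to the standard C mbstowcs(), which converts according to the current C locale and not a Windows code page, so treat that branch as a portability stand-in and not an exact equivalent.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: returns a newly allocated wide copy of src,
   or NULL on failure. The caller releases the result with free(). */
wchar_t *ansi_to_wide(const char *src)
{
    wchar_t *dst;
#ifdef _WIN32
    /* First call: ask how many wchar_t units are needed, NUL included. */
    int len = MultiByteToWideChar(CP_ACP, 0, src, -1, NULL, 0);
    if (len <= 0)
        return NULL;
    dst = malloc((size_t)len * sizeof *dst);
    if (dst == NULL)
        return NULL;
    /* Second call: perform the actual conversion into the buffer. */
    if (MultiByteToWideChar(CP_ACP, 0, src, -1, dst, len) == 0) {
        free(dst);
        return NULL;
    }
#else
    /* Each wide character consumes at least one byte of input, so
       strlen(src) + 1 elements are always enough, terminator included. */
    size_t cap = strlen(src) + 1;
    dst = malloc(cap * sizeof *dst);
    if (dst == NULL)
        return NULL;
    if (mbstowcs(dst, src, cap) == (size_t)-1) {
        free(dst);
        return NULL;
    }
#endif
    return dst;
}
```

The returned string can be passed as the window title to CreateWindowExW() (or to CreateWindowEx() with UNICODE defined) and released with free() when it is no longer needed.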

edited Apr 20 '23 at 20:46

answered Apr 20 '23 at 17:57

Remy Lebeau

What on earth would be the point of deciding to code for Unicode and then calling CreateWindowExA? I never understand why people would suggest that. – David Heffernan Apr 20 '23 at 19:55
@DavidHeffernan just because the Win32 API gets used in Unicode mode doesn't guarantee the rest of the user's code is Unicode-enabled. Sometimes `char` strings have to be mixed with `wchar_t` APIs, and vice versa, hence the suggestion that users should call the `A` or `W` API that matches the encoding of the data being used for each call. Obviously, when a user starts a new project, they should pick an encoding and be consistent with it. But when dealing with legacy code, that is not always possible. – Remy Lebeau Apr 20 '23 at 20:47
did you read the question title? I don't understand your stance here at all. Unless of course the idea is to use UTF8. – David Heffernan Apr 20 '23 at 21:07
