3

Most WinAPI functions come in both a Unicode and an ANSI variant.

For example:

function MessageBoxA(hWnd: HWND; lpText, lpCaption: LPCSTR; uType: UINT): Integer; stdcall; external user32;

function MessageBoxW(hWnd: HWND; lpText, lpCaption: LPCWSTR; uType: UINT): Integer; stdcall; external user32;

When should I use the ANSI function rather than calling the Unicode function?

Daniel LB
RepeatUntil
  • 1
    That's a slight exaggeration. If you are dealing with a narrow string encoded in ASCII - for example, because you've read it from a network socket or plain text file - there is no point in manually converting it to UTF-16 just so you can call MessageBoxW instead of MessageBoxA. (Of course you have to know what encoding you're using. If the narrow string is actually in UTF-8 then you should indeed convert it to UTF-16 and call MessageBoxW.) – Harry Johnston Nov 15 '15 at 23:29

2 Answers

11

Just as (rare) exceptions to the posted comments/answers...

One may choose to use the ANSI calls in cases where UTF-8 is expected and supported. For example, calling WriteConsoleA with UTF-8 strings works in a console that uses a TrueType font and runs under chcp 65001 (the UTF-8 code page).

Another oddball exception is functions that are primarily implemented as ANSI, where the Unicode "W" variant simply converts to a narrow string in the active codepage and calls the "A" counterpart. For such a function, and when a narrow string is available, calling the "A" variant directly saves a redundant double conversion. Case in point is OutputDebugString, which fell into this category until Windows 10 (I just noticed https://msdn.microsoft.com/en-us/library/windows/desktop/aa363362.aspx which mentions that a call to WaitForDebugEventEx - only available since Windows 10 - enables true Unicode output for OutputDebugStringW).

Then there are APIs which, even though they deal with strings, are natively ANSI. For example, GetProcAddress exists only in an ANSI variant that takes an LPCSTR argument, since names in the export tables are narrow strings.

That said, by and large most string-related APIs are natively Unicode, and one is encouraged to use the "W" variants. Not all the newer APIs even have an "A" variant any longer (e.g. CommandLineToArgvW). From the horse's mouth https://msdn.microsoft.com/en-us/library/windows/desktop/ff381407.aspx:

Windows natively supports Unicode strings for UI elements, file names, and so forth. Unicode is the preferred character encoding, because it supports all character sets and languages. Windows represents Unicode characters using UTF-16 encoding, in which each character is encoded as a 16-bit value. UTF-16 characters are called wide characters, to distinguish them from 8-bit ANSI characters.

[...] When Microsoft introduced Unicode support to Windows, it eased the transition by providing two parallel sets of APIs, one for ANSI strings and the other for Unicode strings.

[...] Internally, the ANSI version translates the string to Unicode. The Windows headers also define a macro that resolves to the Unicode version when the preprocessor symbol UNICODE is defined or the ANSI version otherwise.

[...] Most newer APIs in Windows have just a Unicode version, with no corresponding ANSI version.
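
To make the quoted mechanism concrete, here is a minimal, portable C sketch of the A/W pattern. TextLengthA, TextLengthW and the TextLength macro are hypothetical stand-ins invented for illustration, not real Windows functions. The narrow-to-wide step assumes plain ASCII input, whereas Windows converts from the active code page to UTF-16.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical example names; none of these exist in the Windows headers. */
enum { MAX_TEXT = 256 };

/* "W" variant: the native implementation, operating on wide strings. */
size_t TextLengthW(const wchar_t *text)
{
    assert(text != NULL);
    return wcslen(text);
}

/* "A" variant: a thin wrapper that widens the narrow string and then
   forwards to the "W" variant. Assumption: input is plain ASCII, so each
   byte maps to one wide character. Input longer than MAX_TEXT - 1
   characters is truncated in this sketch. */
size_t TextLengthA(const char *text)
{
    wchar_t wide[MAX_TEXT];
    size_t i = 0;

    assert(text != NULL);
    while (i < MAX_TEXT - 1 && text[i] != '\0') {
        wide[i] = (wchar_t)(unsigned char)text[i];
        i++;
    }
    wide[i] = L'\0';
    return TextLengthW(wide);
}

/* Generic name: resolves to the W variant when UNICODE is defined,
   and to the A variant otherwise. */
#ifdef UNICODE
#define TextLength TextLengthW
#else
#define TextLength TextLengthA
#endif
```

Built with UNICODE defined, TextLength(L"hello") calls TextLengthW directly. Built without it, TextLength("hello") goes through the narrow wrapper first, which is the extra conversion step the quoted text describes for the ANSI versions.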

[ NOTE ]  The post was edited to add the last two paragraphs.

IInspectable
dxiv
  • 2
    This is actually a better answer than the accepted answer. While both answers are true, this one does address the question better. You may want to add, that all supported versions of Windows (i.e. Windows NT based systems) use UTF-16LE internally, and convert from/to ANSI encoding, when using the ANSI versions of the API calls. That way your answer is self-contained, and remains useful, even when other answers/comments do get deleted. – IInspectable Nov 15 '15 at 14:32
  • @IInspectable my intention was to point some exceptions since the rule was fairly obvious. But you are right and I just edited my answer to that effect. – dxiv Nov 15 '15 at 17:51
  • @IInspectable thanks for the edit. CommandLineToArgvW is a good example of a W-only API, and in fact dates back to at least NT 3.51. – dxiv Nov 15 '15 at 20:21
5

The simplest rule to follow is this: only use the ANSI variants on systems that do not have the Unicode variant. That is, on Windows 95, 98 and ME, which are the versions of Windows that do not support Unicode.

These days, it is exceptionally unlikely that you will be targeting such versions, and so in all probability you should always just use the Unicode variants.

Remy Lebeau
David Heffernan
  • 2
    Should be noted that Win9x had in fact very limited support for Unicode (https://support.microsoft.com/en-us/kb/210341). Also that there was a "Microsoft Layer for Unicode on Windows 95/98/ME" released later (https://msdn.microsoft.com/en-us/goglobal/bb688166.aspx) which expanded that support somewhat. – dxiv Nov 15 '15 at 17:38