5

This may be a basic question, but for some reason I don't quite get it.

The question is: what actually is the "Character Set" property in Visual Studio? (That is, the one you can set in the project properties to Use Unicode Character Set, Use Multi-Byte Character Set, or Not Set.)

I know more or less what Unicode is, but why do we need to set this property?

For example, if I don't set it and use L"hello"-style strings in my project, will that not make sense?

Daniel Daranas

3 Answers

8

Setting the Character Set option in Visual Studio defines a few preprocessor symbols for you:

  • Use Unicode Character Set defines _UNICODE and UNICODE
  • Use Multi-Byte Character Set defines _MBCS
  • Not Set defines none of these.
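As a rough illustration (not part of the original answer), code can check these symbols itself to see which setting a build was compiled with. The function name ActiveCharacterSet is hypothetical, and the sketch assumes only the _UNICODE and _MBCS symbols named above:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reports which Character Set option this translation
// unit was compiled with, based only on the _UNICODE and _MBCS symbols.
std::string ActiveCharacterSet() {
#if defined(_UNICODE)
    return "Unicode";
#elif defined(_MBCS)
    return "Multi-Byte";
#else
    return "Not Set";
#endif
}
```

Outside Visual Studio, the same symbols can be supplied by hand on the compiler command line, which is all the project setting does in this respect.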

Now, if you look into a header file from the SDK, you will see a bunch of these:

#ifdef _UNICODE
#define GetDeltaInfo                        GetDeltaInfoW
#else
#define GetDeltaInfo                        GetDeltaInfoA
#endif /* _UNICODE */

where the A and W functions are declared as:

BOOL
WINAPI
GetDeltaInfoA(
    __in LPCSTR lpDeltaName,
    __out LPDELTA_HEADER_INFO lpHeaderInfo
    );

/**
 * Gets header information for a delta accessed by Unicode file name.
 * @param lpDeltaName   Delta file name, Unicode.
 * @param lpHeaderInfo  Header information for given Delta.
 * @return              TRUE if success, FALSE otherwise.
 */
BOOL
WINAPI
GetDeltaInfoW(
    __in LPCWSTR lpDeltaName,
    __out LPDELTA_HEADER_INFO lpHeaderInfo
    );

So, by choosing Unicode or Multi-Byte, you select the matching set of functions.
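The same selection mechanism can be sketched without any Windows headers. Everything named here (NameLengthA, NameLengthW, NameLength, DEMO_TEXT, DemoLength) is hypothetical and only mimics the A/W pattern shown above; it is not SDK code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical "A" variant: takes a narrow (char) string.
std::size_t NameLengthA(const char* name) {
    assert(name != nullptr);
    return std::strlen(name);
}

// Hypothetical "W" variant: takes a wide (wchar_t) string.
std::size_t NameLengthW(const wchar_t* name) {
    assert(name != nullptr);
    return std::wcslen(name);
}

// One generic name, resolved by the preprocessor, as in the SDK header above.
#ifdef _UNICODE
#define NameLength NameLengthW
#define DEMO_TEXT(s) L##s
#else
#define NameLength NameLengthA
#define DEMO_TEXT(s) s
#endif

// Compiles in either configuration: the literal and the callee switch together.
std::size_t DemoLength() {
    return NameLength(DEMO_TEXT("delta.bin"));
}
```

Calling NameLengthW directly with an L"..." literal works in every configuration; the generic name only matters for code that is meant to build both ways.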

Nemanja Boric
  • Ah, this sounds interesting. But is that all it does? And I could have called the right function directly myself too, right (e.g., `GetDeltaInfoW` in the Unicode case)? So it helps select the right function if my code calls it by the generic name `GetDeltaInfo`? –  Sep 23 '13 at 12:55
  • Exactly, there is no need to call the specific version of the function, so you can just call `GetDeltaInfo`. (Note that there is no need to call the ANSI version at all these days, as the comments say.) – Nemanja Boric Sep 23 '13 at 12:56
  • 1
    @dmcr_code: all the A/W and TCHAR stuff comes from the times when Windows NT supported Unicode, but Windows 9x didn't. So, with TCHARs and the like, you could compile two binaries from a single set of sources: one for Windows 9x (using the old ANSI functions) and one for Windows NT. Since the current CRT doesn't even support anything before Windows XP, the whole thing is mostly useless nowadays; currently, you should just enable Unicode to avoid all the ugly "W"s at the end of function names. – Matteo Italia Sep 23 '13 at 13:00
  • One more question: is this *all* that the character set setting does? That is, when I call `GetDeltaInfo`, the correct version (either GetDeltaInfoA or GetDeltaInfoW) gets called. Is that all? –  Oct 15 '13 at 13:31
4

When you write e.g. L"Hello", you create a wide-character string. To store it, you have to use std::wstring, or wchar_t for single characters.
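A minimal sketch of this, using hypothetical function names WideGreeting and FirstCharacter. The L prefix is part of the C++ language, so the literal has the same type whatever the Character Set property says; what the property changes is which API variant a generic name resolves to, and so whether that API accepts a wide string:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// The L prefix always produces wchar_t data, regardless of project settings.
static_assert(std::is_same<decltype(L'H'), wchar_t>::value,
              "L'x' is a wchar_t");

// A wide literal is stored in std::wstring, not std::string.
std::wstring WideGreeting() {
    const wchar_t* raw = L"Hello";
    return std::wstring(raw);
}

// Single wide characters are held in wchar_t.
wchar_t FirstCharacter(const std::wstring& text) {
    assert(!text.empty());
    return text[0];
}
```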

In Visual Studio, don't use e.g. L"Hello" directly; instead use the _T macro from <tchar.h>, as in _T("Hello"), which does the right thing depending on your "Character Set" setting. You should also use TCHAR instead of char or wchar_t. There is no standard C++ string type for TCHAR, though, since it's a Microsoft-specific extension, but you can define one yourself, e.g.

typedef std::basic_string<TCHAR> tstring;

When you select the Unicode character set, the macro _UNICODE is defined and TCHAR becomes wchar_t, so generic strings and characters are wide. When the multi-byte character set is selected, _UNICODE is not defined; _MBCS is defined instead, and TCHAR is a plain char. If you select neither, none of these macros is defined, and plain char is used.
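To make the switch concrete, here is a sketch that builds with any compiler. DemoTChar and DEMO_T are hypothetical stand-ins for TCHAR and the generic-text literal macro, and MakeTitle is an illustrative helper; real Visual Studio code would take the type and macro from <tchar.h> instead:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-ins for TCHAR and the literal macro from <tchar.h>.
#ifdef _UNICODE
typedef wchar_t DemoTChar;
#define DEMO_T(s) L##s
#else
typedef char DemoTChar;
#define DEMO_T(s) s
#endif

// The tstring typedef from the answer, built on the stand-in type.
typedef std::basic_string<DemoTChar> tstring;

// tstring is std::wstring in a Unicode build and std::string otherwise.
#ifdef _UNICODE
static_assert(std::is_same<tstring, std::wstring>::value, "wide build");
#else
static_assert(std::is_same<tstring, std::string>::value, "narrow build");
#endif

// Illustrative helper: the same source compiles in both configurations.
tstring MakeTitle(const tstring& name) {
    return tstring(DEMO_T("Hello, ")) + name;
}
```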

See e.g. this link for more information.

Some programmer dude
  • 6
    I don't agree with your advice to use `TCHAR`. Unless you need to support Win 98, you should forget all about the ANSI Win32 API functions. That's what the character set is all about, interacting with Win32. These days, unless you still target Win 98, then you can simply define `_UNICODE`, use `wchar_t` and prefix your literals with `L`. Writing new code that is character set agnostic results in more complex code for no gain. – David Heffernan Sep 23 '13 at 12:54
  • ok, so it means the `Character Set` property makes sense when used together with TCHARs? –  Sep 23 '13 at 12:57
1

The Microsoft C runtime and the Windows API headers define several macros and typedefs that evaluate to either the multi-byte or the wide-character variant, depending on this setting. For example, in

int _tmain(int argc, _TCHAR* argv[]);

the _TCHAR is defined like this (simplified):

#ifdef _UNICODE
typedef wchar_t _TCHAR;
#else
typedef char _TCHAR;
#endif

This way, the same code can be used for multi-byte and Unicode builds. _UNICODE is defined when you choose "Use Unicode Character Set"; it is not defined if you choose "Use Multi-Byte Character Set".
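As a hedged sketch of what "the same code" can look like, the function below processes an argument vector of the kind _tmain receives, without knowing whether the characters are narrow or wide. ArgChar, ARG_T and TotalArgLength are hypothetical names that play the role of _TCHAR and friends, so the example does not depend on the Microsoft headers:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for _TCHAR, switched the same way as in the answer.
#ifdef _UNICODE
typedef wchar_t ArgChar;
#define ARG_T(s) L##s
#else
typedef char ArgChar;
#define ARG_T(s) s
#endif

// Sums the lengths of all arguments. std::char_traits supplies a length
// routine for whichever character type ArgChar turns out to be.
std::size_t TotalArgLength(int argc, ArgChar* argv[]) {
    std::size_t total = 0;
    for (int i = 0; i < argc; ++i) {
        assert(argv[i] != nullptr);
        total += std::char_traits<ArgChar>::length(argv[i]);
    }
    return total;
}
```

Going through std::char_traits rather than calling a narrow or wide length function by name keeps the function body free of #ifdef blocks; only the typedef at the top changes between builds.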

cdoubleplusgood