
Mapping format specifier %s to %ls when _tprintf() is mapped to wprintf()?

I am using the _T() macro for mapping strings to either ASCII or Unicode, depending on whether _UNICODE is defined.

However, a call like _tprintf("%s", _T("text string")) is causing me trouble, because of inconsistent types when _UNICODE is defined.

I see that %ls should be used for Unicode strings.

How can %s be mapped directly to %ls when _UNICODE is defined? Is there some fancy function like _T()?

Mr.C64
Shuzheng
  • http://stackoverflow.com/q/5669173/166389 says that `%s` should match the behaviour of `_T()` already, and gives the formula to override that behaviour. – TBBle Feb 02 '16 at 14:49

2 Answers


However, a call like _tprintf("%s", _T("text string")) is causing me trouble, because of inconsistent types when _UNICODE is defined.

You should use the _T() decorator also for the first string literal (i.e. the format specifier string) of _tprintf():

// NOTE: _T("%s"), not just "%s"
//
_tprintf(_T("%s"), _T("text string"));

This is expanded in ANSI builds to:

printf("%s", "text string"); // %s maps to char* ANSI string

and in Unicode builds to:

wprintf(L"%s", L"text string"); // %s maps to wchar_t* Unicode string
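If you would rather spell out the wide specifier explicitly, the same switch that drives _T() can also select the format string. The following is only a minimal sketch: DEMO_UNICODE, DEMO_T, DEMO_FMT_STR, demo_tchar, demo_sntprintf and format_text are hypothetical names invented for illustration (they are not part of <tchar.h>), chosen so that the example also builds outside MSVC. It formats into a buffer rather than printing, so the result can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cwchar>
#include <string>

// Hypothetical stand-ins for the <tchar.h> machinery.
// DEMO_UNICODE plays the role of _UNICODE.
#ifdef DEMO_UNICODE
  typedef wchar_t demo_tchar;
  #define DEMO_T(x)      L##x
  #define DEMO_FMT_STR   L"%ls"          // %ls always expects wchar_t*
  #define demo_sntprintf std::swprintf
#else
  typedef char demo_tchar;
  #define DEMO_T(x)      x
  #define DEMO_FMT_STR   "%s"            // %s expects char* in the narrow family
  #define demo_sntprintf std::snprintf
#endif

// Hypothetical helper: formats `text` with the build-dependent specifier
// and returns the result; returns an empty string on error or truncation.
std::basic_string<demo_tchar> format_text(const demo_tchar* text) {
    assert(text != nullptr);
    demo_tchar buffer[64];
    const std::size_t capacity = sizeof buffer / sizeof buffer[0];
    const int written = demo_sntprintf(buffer, capacity, DEMO_FMT_STR, text);
    if (written < 0 || static_cast<std::size_t>(written) >= capacity) {
        return std::basic_string<demo_tchar>();
    }
    return std::basic_string<demo_tchar>(buffer);
}
```

Calling format_text(DEMO_T("text string")) should give back the same text in either build. With the real <tchar.h> on MSVC, _T("%s") as shown above is already enough; a macro like this is only useful if you want the wide case to read %ls.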
Mr.C64
  • What about the %ls specifier then? Is %s enough for Unicode? – Shuzheng Feb 02 '16 at 16:48
  • @NicolasLykkeIversen: In Unicode builds, `%s` maps to `wchar_t*` Unicode strings. In ANSI builds, `%s` maps to `char*` ANSI strings. It's as simple as that. Or am I missing something from your question? – Mr.C64 Feb 02 '16 at 16:49
  • Then what are the specifiers %ls and %S used for? Is the %s specifier just interpreted as Unicode in the context of wprintf(), while it is interpreted as ASCII in the context of printf()? Or is the compiler doing something special? – Shuzheng Feb 02 '16 at 16:58
  • Also, is _T() and _tprintf() Windows specific? – Shuzheng Feb 02 '16 at 16:59
  • @NicolasLykkeIversen: Yes, `_T()` and `_tprintf()` are Windows specific (as is the so-called "TCHAR model" of which they are a part). In modern C++/Win32 code, I'd suggest just using Unicode and avoiding the whole complexity of the TCHAR model (which exists for backward compatibility with obsolete ANSI builds). – Mr.C64 Feb 02 '16 at 17:02
  • @NicolasLykkeIversen: In Windows (MSVC): **`%s`** is mapped to the "natural" string type, i.e. to `char*` strings in printf and to `wchar_t*` strings in wprintf. **`%S`** is mapped to the "opposite" string type, i.e. `wchar_t*` in printf and `char*` in wprintf. **`%ls`** always maps to `wchar_t*` strings in both printf and wprintf. – Mr.C64 Feb 02 '16 at 17:16
  • @NicolasLykkeIversen: You're welcome. MSDN and several years of Win32/C++ experience :) – Mr.C64 Feb 02 '16 at 17:24
  • I know this is off-topic, but why is it called Win32 and not Win64? Almost every OS is 64-bit, and we don't write 32-bit software? – Shuzheng Feb 02 '16 at 19:47
  • @NicolasLykkeIversen: Historical reasons? :) – Mr.C64 Feb 02 '16 at 19:48
  • So it is actually 64-bit functionality in Win32, or is it both 64 and 32-bit? Thanks for your help! – Shuzheng Feb 02 '16 at 20:05
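The specifier rules from the comments above can be tried out with the part that is not MSVC specific: in the narrow printf family, `%s` takes a `char*` and `%ls` takes a `wchar_t*`. The sketch below is an illustration only; describe is a hypothetical helper name, and `%S` is left out on purpose because the "opposite type" behaviour is the MSVC one described above. It assumes the wide string contains only characters that the current locale can convert (plain ASCII here).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats one narrow and one wide string with the
// narrow printf family. "%s" consumes the char*, "%ls" the wchar_t*.
// Returns an empty string on conversion error or truncation.
std::string describe(const char* narrow, const wchar_t* wide) {
    assert(narrow != nullptr && wide != nullptr);
    char buffer[128];
    const int written =
        std::snprintf(buffer, sizeof buffer, "%s / %ls", narrow, wide);
    if (written < 0 || static_cast<std::size_t>(written) >= sizeof buffer) {
        return std::string();
    }
    return std::string(buffer);
}
```

For example, describe("narrow", L"wide") mixes both string types in one call without any TCHAR macros.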

The solution is to not use _tprintf but to use std::wcout instead.

  1. wcout accepts both narrow (char) and wide (wchar_t) strings.
  2. wcout is safer than the printf family because it "knows" the type of each argument it prints (avoiding fiascos like printf("%s",'a');).
  3. It's portable, while _tprintf is not.
  4. It's polymorphic and can work with other streams (like fstream and such); _tprintf cannot.

The only downsides of wcout are that it tends to bloat the executable a bit and is a bit slower than the printf family, but I really doubt either will be a real problem in your app.

Ditch the printf-like functions in favor of C++ streams.
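As a rough illustration of points 1 and 4, here is a small sketch that writes a narrow and a wide literal to the same wide stream. The function names write_greeting and greeting_as_string are made up for this example; the code takes a std::wostream& so the same function works with std::wcout, a wide file stream, or (as here, to make the result checkable) a std::wostringstream.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes a narrow and a wide literal to any wide
// output stream. The narrow characters are widened by operator<<.
void write_greeting(std::wostream& out) {
    assert(out.good());
    out << "hello " << L"world";
}

// Hypothetical helper: captures the greeting in a std::wstring by
// passing a string stream where std::wcout could be passed instead.
std::wstring greeting_as_string() {
    std::wostringstream buffer;
    write_greeting(buffer);
    return buffer.str();
}
```

Passing std::wcout to write_greeting prints the same text to the console; no format specifiers are involved, so there is nothing to mismatch.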

David Haim
  • So wcout prints both ASCII and Unicode without problems? Why use cout then? Does wcout gets mapped to something like _tprintf() if _UNICODE is defined? – Shuzheng Feb 02 '16 at 14:54
  • @NicolasLykkeIversen: "*Why use cout then?*" Because much of the non-Microsoft world uses UTF-8, and their `wchar_t` is a full 4 bytes. – Nicol Bolas Feb 02 '16 at 14:55
  • `does wcout gets mapped to something like _tprintf() if _UNICODE is defined?` No, it uses overloading, which is one of several C++ mechanisms for writing generic code – David Haim Feb 02 '16 at 14:55
  • So if I have an ASCII string and a Unicode string in the same C++ program, then applying wcout to those strings with wcout << produces the right output? I just saw a Stack Overflow answer where a new variable tcout was defined as cout or wcout, depending on whether _UNICODE was defined. I see that this is not necessary, because of wcout? – Shuzheng Feb 02 '16 at 14:58
  • exactly , this : `std::wcout<< "hello " << L"world"; ` works fine without any problems – David Haim Feb 02 '16 at 14:59
  • Also, are all the _T() and _tprintf() functionality only in place for those programming pure C? – Shuzheng Feb 02 '16 at 14:59
  • Yes! C++ is too awesome for this sh*t – David Haim Feb 02 '16 at 15:00