34

I have tried the following code:

wprintf(L"1 %s\n","some string"); //Good
wprintf(L"2 %s\n",L"some string"); //Not good -> print only first character of the string
printf("3 %s\n","some string"); //Good
//printf("4 %s\n",L"some string"); //Doesn't compile
printf("\n");
wprintf(L"1 %S\n","some string"); //Not good -> print some funny stuff
wprintf(L"2 %S\n",L"some string"); //Good
//printf("3 %S\n","some string"); //Doesn't compile
printf("4 %S\n",L"some string");  //Good

And I get the following output:

1 some string
2 s
3 some string

1 g1 %s

2 some string
4 some string

So it seems that both wprintf and printf can correctly print both a char* and a wchar_t*, but only if the exact specifier is used. If the wrong specifier is used, you might not get a compile error (or even a warning!) and end up with the wrong behavior. Do you see the same behavior?

Note: This was tested under Windows, compiled with MinGW and g++ 4.7.2 (I will check gcc later)

Edit: I also tried %ls (result is in the comments)

printf("\n");
wprintf(L"1 %ls\n","some string"); //Not good -> print funny stuff
wprintf(L"2 %ls\n",L"some string"); //Good
// printf("3 %ls\n","some string"); //Doesn't compile
printf("4 %ls\n",L"some string");  //Good
Antonio

7 Answers

27

I suspect GCC (mingw) has custom code to disable the checks for the wide printf functions on Windows. This is because Microsoft's own implementation (MSVCRT) is badly wrong and has %s and %ls backwards for the wide printf functions; since GCC can't be sure whether you will be linking with MS's broken implementation or some corrected one, the least-obtrusive thing it can do is just shut off the warning.

R.. GitHub STOP HELPING ICE
  • 3
    it's not badly wrong...that's like saying what you did is wrong because you followed your own spec and not someone else's (ANSI) – jbu May 15 '14 at 18:57
  • 31
    @jbu: It means the language they support is not C, but some random C-like language Microsoft came up with. So if you say you support C, that's not just being different, it's *wrong*. – R.. GitHub STOP HELPING ICE Sep 10 '14 at 15:29
  • 3
    @R.. could you point to a reference stating that Microsoft itself claimed to fully comply with an ANSI C standard? The C language by itself without referring to a standard has no meaning at all. – Matthias Jan 29 '17 at 12:49
  • @Matthias Colin Robertson from Microsoft confirmed on GitHub that MSVC contains a C89/90 compliant compiler, with some features from newer versions. But regardless, I'm surprised you're suggesting `%ls` (literally long/wide string) and `%s` should have their meanings interchanged in _any_ compiler. – Benjamin Crawford Ctrl-Alt-Tut Jul 08 '20 at 16:55
  • @BenjaminCrawfordCtrl-Alt-Tut: I'm not suggesting that they *should* have their meanings swapped, but that MSVCRT *does* have their meanings swapped. With the MSVCRT `wprintf`, `%s` is for wide strings and `%ls` is for byte strings. This is completely contrary to the standard and common sense. – R.. GitHub STOP HELPING ICE Jul 09 '20 at 00:44
  • 1
    @R..GitHubSTOPHELPINGICE Check the mention, I was responding to Matthias. – Benjamin Crawford Ctrl-Alt-Tut Jul 09 '20 at 13:44
21

The format specifiers matter: %s says that the next string is a narrow string ("ascii" and typically 8 bits per character). %S means wide char string. Mixing the two will give "undefined behaviour", which includes printing garbage, just one character or nothing.

One character is printed because wide chars are, for example, 16 bits wide: the first byte is non-zero and is followed by a zero byte, which marks the end of the string for narrow strings. This depends on byte order: on a "big endian" machine you'd get no output at all, because the first byte is zero and the next byte contains the non-zero value.

endolith
Mats Petersson
  • 1
    Why doesn't it give a warning at compile time? – Antonio Jul 17 '13 at 13:28
  • Depends on the compiler - gcc with suitable warning level should do. – Mats Petersson Jul 17 '13 at 13:29
  • I do have all warnings activated, but I am compiling with g++ (it's a complicated story), I will check later if it matters... – Antonio Jul 17 '13 at 13:42
  • Works for me: "text.cpp:5:24: warning: format ‘%s’ expects argument of type ‘char*’, but argument 2 has type ‘const wchar_t*’ [-Wformat]" – Mats Petersson Jul 17 '13 at 13:44
  • 1
    However, if you are compiling on a Windows system, it may be that the "wprintf" function isn't tagged as "this is a printf type function" in the header file, in which case gcc doesn't know that it should interpret the input as a format string (you wouldn't want to get a warning for `strcpy(blah, "%s");` for example, so it needs to know what is a `printf` style function from the header file). – Mats Petersson Jul 17 '13 at 13:46
  • Where is the specification that defines `%S`? – endolith Oct 20 '22 at 18:48
  • https://devblogs.microsoft.com/oldnewthing/20190830-00/?p=102823 – endolith Oct 20 '22 at 19:00
6

For s: When used with printf functions, specifies a single-byte or multi-byte character string; when used with wprintf functions, specifies a wide-character string. Characters are displayed up to the first null character or until the precision value is reached.

For S: When used with printf functions, specifies a wide-character string; when used with wprintf functions, specifies a single-byte or multi-byte character string. Characters are displayed up to the first null character or until the precision value is reached.

On Unix-like platforms, s and S have the same meaning as on the Windows platform.

Reference: https://msdn.microsoft.com/en-us/library/hf4y5e3w.aspx

user3581075
  • 1
    `%S` is not to be found in the C standard (!): http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf and is only an extension that compilers implement... It should be discouraged to use it. – 71GA Feb 06 '20 at 08:59
  • In the second case the OP has posted, wprintf is being used with %s and a wchar string, isn't it? But only the first character is being printed. – The19thFighter Dec 03 '21 at 21:33
  • @71GA Then what should be used to print wide strings? – endolith Oct 20 '22 at 18:49
5

At least in Visual C++:

printf (and other ASCII functions):
%s represents an ASCII string
%S is a Unicode string

wprintf (and other Unicode functions):
%s is a Unicode string
%S is an ASCII string

As for the lack of compiler warnings: printf takes a variable argument list, so only the first argument can be type-checked by the language itself. The compiler is not required to parse the format string and type-check the arguments against it. For functions like printf, that is up to the programmer.

Steve R
1

%S seems to conform to The Single Unix Specification v2 and is also part of the current (2008) POSIX specification.

Equivalent C99 conforming format specifiers would be %s and %ls.

David Foerster
  • Can you provide more details, some link for example? – Antonio Jul 17 '13 at 13:27
  • 1
    10 seconds on [search engine]: http://pubs.opengroup.org/onlinepubs/007908799/xsh/fprintf.html – David Foerster Jul 17 '13 at 13:31
  • Well, there's "Single Unix Specification v2" right in the answer and your question relates to the "printf" family. There's your search query. – David Foerster Jul 17 '13 at 13:44
  • 1
    I recommend linking against POSIX 2008 (Open Group Issue 7, which corresponds to Single Unix Specification **v4**), instead of the ancient SUS V2. The URL would be: http://pubs.opengroup.org/onlinepubs/9699919799/functions/fprintf.html – MestreLion Mar 12 '15 at 08:26
  • Only `%s` but not `%S` is to be found in your *"POSIX specification"* https://pubs.opengroup.org/onlinepubs/9699919799/functions/fprintf.html – 71GA Feb 06 '20 at 09:02
  • 1
    @71GA: Yet I can find it there. Search for “Equivalent to `ls`”. – David Foerster Feb 07 '20 at 12:37
1

Answer A

None of the answers above point out why some of your prints might not appear at all. This is because you are dealing with streams (I didn't know this), and a stream has something called orientation. Let me cite something from this source:

Narrow and wide orientation

A newly opened stream has no orientation. The first call to any I/O function establishes the orientation.

A wide I/O function makes the stream wide-oriented, a narrow I/O function makes the stream narrow-oriented. Once set, the orientation can only be changed with freopen.

Narrow I/O functions cannot be called on a wide-oriented stream; wide I/O functions cannot be called on a narrow-oriented stream. Wide I/O functions convert between wide and multibyte characters as if by calling mbrtowc and wcrtomb. Unlike the multibyte character strings that are valid in a program, multibyte character sequences in the file may contain embedded nulls and do not have to begin or end in the initial shift state.

So once you use printf(), the stream's orientation becomes narrow, and from that point on you can't get anything out of wprintf(), and you really don't, unless you use freopen(), which is intended to be used on files.


Answer B

As it turns out, you can use freopen() like this:

freopen(NULL, "w", stdout);             

This makes the stream's orientation "not defined" again. Try this example:

#include <stdio.h>
#include <wchar.h>
#include <locale.h>

int main(void)
{
    // We set the locale, which is the same as the environment variable "LANG=en_US.UTF-8".
    setlocale(LC_ALL, "en_US.UTF-8");

    // We define array of wide characters. We indicate this on both sides of equal sign
    // with "wchar_t" on the left and "L" on the right.
    wchar_t y[100] = L"€ο Δικαιοπολις εν αγρω εστιν\n";

    // We print header in ASCII characters
    wprintf(L"content-type:text/html; charset:utf-8\n\n");

    // A newly opened stream has no orientation. The first call to any I/O function
    // establishes the orientation: a wide I/O function makes the stream wide-oriented,
    // a narrow I/O function makes the stream narrow-oriented. Once set, we must respect
    // this, so for the time being we are stuck with either printf() or wprintf().

    wprintf(L"%S\n", y);    // Conversion specifier %S is not standardized (!)
    wprintf(L"%ls\n", y);   // Conversion specifier s with length modifier l is
                            // standardized (!)

    // At this point the current orientation of the stream is wide, which is why the
    // following narrow functions won't print anything! Whether we should use wprintf()
    // or printf() is primarily a question of how we want the output to be encoded.

    printf("1\n");          // Print narrow string of characters with a narrow function
    printf("%s\n", "2");    // Print narrow string of characters with a narrow function
    printf("%ls\n",L"3");   // Print wide string of characters with a narrow function

    // Now we reset the stream to no orientation.
    freopen(NULL, "w", stdout);

    printf("4\n");          // Print narrow string of characters with a narrow function
    printf("%s\n", "5");    // Print narrow string of characters with a narrow function
    printf("%ls\n",L"6");   // Print wide string of characters with a narrow function

    return 0;
}
71GA
1

From man fprintf

   C      (Not in C99 or C11, but in SUSv2, SUSv3, and SUSv4.)  Synonym for lc.  Don't use.

   S      (Not in C99 or C11, but in SUSv2, SUSv3, and SUSv4.)  Synonym for ls.  Don't use.

Thus, don't use %C or %S; always use %lc or %ls instead.

Hypoano