3
#include <stdio.h>

int main()
{
    printf("%c\n", 'abcd');
    printf("%p\n", 'abcd');
    printf("%c\n", 0x61626364);
    printf("%c\n", 0x61626363);
    printf("%c\n", 0x61626365);
    return 0;
}

My question is about this line: printf("%c\n", 'abcd');
The output of this line is 'd', but I can't understand why 'd' comes out.
I tried looking at other memory locations, and there I found all of the letters.
Please explain why the result is 'd' and why the other memory locations contain all of the letters.
Thank you.

M.M
Tenny
  • 1
    @MattMcNabb That doesn't make sense. The value of `'abcd'` is implementation-defined, and an implementation may make its value `0x61626364`. If `printf("%c\n", 0x61626364)` is undefined, and `printf("%c\n", 'abcd')` is allowed to have the same result as `printf("%c\n", 0x61626364)`, then `printf("%c\n", 'abcd')` must also be undefined. (Or, to be more precise, it's implementation-defined whether the behaviour is undefined.) –  Oct 31 '14 at 08:26
  • Why do you ask? Did you enable all warnings at compile time? On which compiler, operating system, and runtime are you testing it? – Basile Starynkevitch Oct 31 '14 at 08:29
  • @DevSolar `%c` does not take character literals. That wouldn't be remotely useful. If you have a character literal, just put it in the format string. `%c` is designed for `char` values, and `'abcd'` is (on many implementations) a value that does not fit into a `char`. –  Oct 31 '14 at 08:35
  • 1
    @DevSolar And actually checking the standard, `%c` takes `int` values and converts them to `unsigned char`. The conversion of `0x61626364` to `unsigned char` is well-defined (and depends on the range of `unsigned char`, but assuming 8-bit characters, is required to give `0x64`), so `printf("%c\n", 0x61626364)` is perfectly valid. –  Oct 31 '14 at 08:38
  • http://stackoverflow.com/questions/3960954/c-multicharacter-literal http://stackoverflow.com/questions/7755202/multi-character-constant-warnings – phuclv Oct 31 '14 at 08:41
  • http://stackoverflow.com/questions/7459939/what-do-single-quotes-do-in-c-when-used-on-multiple-characters – phuclv Oct 31 '14 at 08:41
  • @hvd that's what Keith calls "implementation-undefined" – M.M Oct 31 '14 at 08:49
  • @MattMcNabb I like that name, thanks, I'll try to remember it. But as I now checked and posted in my answer, actually, all the `%c` ones are valid. –  Oct 31 '14 at 08:51
  • Thanks to everyone. I am aware that %c is not meant for character literals like this, but it was an example in a textbook. I failed to understand that code, so I asked here. Now I have a keyword: "implementation-undefined". – Tenny Nov 02 '14 at 18:07

4 Answers

4

'abcd' is a multi-character constant; its value is implementation-defined.

C11 §6.4.4.4 Character constants, paragraph 10

An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined. If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.

A common implementation gives 'abcd' the value 'a' * 256 * 256 * 256 + 'b' * 256 * 256 + 'c' * 256 + 'd' (1633837924 with ASCII). You can check its value in your implementation by printing it using "%d". Although multi-character constants are legal C, they are rarely used in practice.

Yu Hao
  • Thanks for writing this. Could I ask which book this is from? – Tenny Nov 02 '14 at 18:24
  • @Tenny The quote is from the C standard. You might want to read [Where do I find the current C or C++ standard documents?](http://stackoverflow.com/questions/81656/where-do-i-find-the-current-c-or-c-standard-documents). – Yu Hao Nov 03 '14 at 01:28
1

Your code is wrong. When you compile it with a recent GCC compiler enabling warnings with

gcc -Wall -Wextra u.c

you get

 u.c: In function 'main':
 u.c:5:20: warning: multi-character character constant [-Wmultichar]
      printf("%c\n", 'abcd');
                     ^
 u.c:6:20: warning: multi-character character constant [-Wmultichar]
      printf("%p\n", 'abcd');
                     ^
 u.c:6:5: warning: format '%p' expects argument of type 'void *', but argument 2 has type 'int' [-Wformat=]
      printf("%p\n", 'abcd');
      ^

Technically, passing an int where %p expects a void * puts you in the awful undefined behavior case (and the value of the multi-character constants is implementation-defined), so anything could happen with a standard-compliant implementation.

I have never seen a useful case for multi-character constants like 'abcd'. I believe they are useless and mostly a historical artefact.

What really happens is implementation-specific (it depends upon the compiler, the processor, the optimization flags, the ABI, the runtime environment, ...). To explain it you need to dive into gory details: first look at the generated assembler code with gcc -fverbose-asm -S, then into your libc's particular printf implementation.

As a rule of thumb, you should improve your code to get rid of every warning your compiler is able to give you (your compiler is being helpful in warning you). There are a few subtle exceptions (but then you should comment your code about them).

Basile Starynkevitch
1
printf("%c\n", 'abcd');

As noted already, the value of 'abcd' is implementation-defined. On your implementation, its value is 0x61626364, so it behaves the same as your third printf call. See below.

printf("%p\n", 'abcd');

As noted already, %p is used to print pointers. 'abcd' is not a pointer, so this call is simply invalid.

printf("%c\n", 0x61626364);
printf("%c\n", 0x61626363);
printf("%c\n", 0x61626365);

The specification for %c reads:

If no l length modifier is present, the int argument is converted to an unsigned char, and the resulting character is written.

Conversions of int to unsigned char are well-defined and reduce the value modulo UCHAR_MAX+1. On most implementations, this means it takes the lowest 8 bits of the number.

The lowest 8 bits of 0x61626364, 0x61626363 and 0x61626365 are 0x64, 0x63 and 0x65, which in ASCII correspond to 'd', 'c' and 'e', so ASCII implementations will print those characters.

  • Which version of "the specification" ? – M.M Oct 31 '14 at 08:54
  • @MattMcNabb C99, but the same text is present in [a draft of C90](http://flash-gordon.me.uk/ansi.c.txt), except for the bit about `l`. –  Oct 31 '14 at 08:54
  • @hvd I thought 'abcd' was stored somewhere in memory and that I could use %p to print that location. If a variable is declared as a pointer, is using %p correct then? – Tenny Nov 02 '14 at 18:22
  • @Tenny `'abcd'` is always an integer constant value, not a pointer, and if it were a pointer then it wouldn't work to print it with `%c`. You could store that integer constant value in a variable and print the address of that variable with `%p`, but you're not doing that. –  Nov 02 '14 at 21:13
0

Your code

printf("%c\n", 'abcd');

results in output

d

due to the "%c" specifying a single character. Because multiple characters were provided instead of a single character, the multi-character constant was 'converted' to a single character by taking the last character of the string.

The result of providing a string where a single character is expected is "implementation-defined" behavior. This means different compilers can handle it differently. See stackoverflow.com/multiple-characters-in-a-character-constant.

Forgen
  • `'abcd'` isn't a string; it's a character. Strings are indicated by double quotes. – M.M Oct 31 '14 at 08:51
  • Ahh thank you Matt. I'm used to C# & JavaScript where 'character-literals' can be used (more or less) interchangeably with "strings"; and single character strings are automatically cast as characters, and vice versa. – Forgen Oct 31 '14 at 10:18
  • I need to find out what "implementation-defined" means. Thank you for your answer. – Tenny Nov 02 '14 at 18:16
  • Hi @Tenny. "Implementation-defined" simply means you cannot be sure _how_ the result will come out if you run your code on different machines or compile with different compilers. Therefore, it is advised *not* to write code like this, and when you use %c to pass it only a single character, i.e. 'd' instead of 'abcd'. This way you will have a predictable result on any implementation. – Forgen Nov 03 '14 at 02:43