I have the following code in my program:

char ch='abcd';
printf("%c",ch);

The output is d.

I fail to understand why a char variable is allowed to take 4 characters in its initializer without a compile-time error.

Note: Using more than 4 characters gives an error.

Yu Hao
user3152736
  • see http://stackoverflow.com/questions/7755202/multi-character-constant-warnings – tristan Jan 02 '14 at 07:30
  • And [here](http://ideone.com/VHaSY0) I see that even more `char`s can be appended and always the last one is being printed. – Sufian Latif Jan 02 '14 at 07:31
  • I'm using Visual Studio and it's allowing only 4 characters. Anyway, I did read about the acceptability of multi-char constants and I think different IDEs have their own limits... – user3152736 Jan 02 '14 at 07:48
  • @tristan How does this help OP? The question and answers are about `int` and "bigger" types. Are you implying that `char` gets expanded into an `int` or something? It cannot pack a multi-char into `char` type otherwise without memory overflow. I think it's just compiler behaviour, skipping everything but the last character from the multi-char. Can you elaborate how the linked question relates to this one, besides asking about the multi-char constants? – luk32 Jan 02 '14 at 07:49
  • @luk32 i thought that answer helps with the explanation of multi character constants. – tristan Jan 02 '14 at 09:07
  • @tristan Yea, I thought that it was apparent that OP knew he is using a multi-char constant. Well I guess, it wasn't. Still I think that it is very peculiar and interesting why a 4-char multibyte constant is allowed to be packed into a single-byte `char`. I thought that this is what the question was really about. – luk32 Jan 02 '14 at 09:10

4 Answers

'abcd' is called a multicharacter constant, and it has an implementation-defined value. Here your compiler gives you 'd'.

If you use gcc and compile your code with -Wmultichar or -Wall, gcc will warn you about this.

Lee Duhem
  • Correct. A single character is normally expected for '-enclosed text. Anything beyond that (e.g., a 32 bit register/word that happens to hold 4 8-bit characters) is nonstandard and nonportable. If you _really_ want to handle 4 characters (single or multibyte) at a time, you should be using a string ("-enclosed). – Phil Perry Jan 02 '14 at 16:53
  • @Phil Perry `'abcd'`, as an integer character constant, _is_ standard per C11 6.4.4.4 10. Portability problems ensue because its value is implementation-defined. – chux - Reinstate Monica Jan 03 '14 at 00:19

I fail to understand why a char variable is allowed to take 4 characters in its initializer without a compile-time error.

It's not packing 4 characters into one char. The multi-character constant 'abcd' is of type int, and the compiler then converts that int to char for the initialization (the value is out of range for char in this case, so it is not preserved).

tristan

I'm assuming you know that you are using a multi-character constant, and what one is.

I don't use VS these days, but my take is that the 4-character constant is packed into an int, then narrowed to a char. That is why it is allowed. Since the order in which a multi-char constant is packed into an integer is implementation-defined, it can behave the way you observe.

Because multi-character constants are meant to fill integer types, you could try an 8-character multi-char constant. I am not sure whether the VS compiler supports it, but there is a good chance it does, because that would fit into a 64-bit long type.

It probably should give a warning about trying to fit a literal value too big for the type. It's kind of like unsigned char leet = 1337;. I am not sure, however, how this works in VS (whether it fires a warning or an error).

luk32

4 characters are not being put into a char variable, but into an int character constant which is then assigned to a char.

3 parts of the C standard (C11dr §6.4.4.4) may help:

  1. "An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'."

  2. "An integer character constant has type int."

  3. "The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined."

OP's code, char ch='abcd';, is the assignment of an int to a char, because 'abcd' is an int. Just as with char ch='Z';, ch is assigned the int value of the constant. With 'Z' there is no surprise, as its value fits nicely in a char. In the 'abcd' case, the value does not fit in a char, so some information is lost. Various outcomes are possible. Typically on one endian platform, ch will have a value of 'a' and on another, the value of 'd'.


The 'abcd' is an int value, much like 12345 in int x = 12345;.

When sizeof(int) == 4, an int may be assigned a character constant such as 'abcd'.

When sizeof(int) != 4, the limit changes. So with an 8-byte int, int x = 'abcdefgh'; is possible, and so on.

Given that an int is only guaranteed to have a minimum range of -32767 to 32767, anything beyond 2 characters is non-portable.

Even int x = 'ab'; raises endianness concerns.


Character constants like 'abcd' are typically used incorrectly, so many compilers offer a warning for this uncommon C construct that is worth enabling.

chux - Reinstate Monica