What is happening here?
#include <stdio.h>

int main(void)
{
    int x = 'HELL';
    printf("%d\n", x);
    return 0;
}
It prints 1212501068.
I expected a compile error.
Explanations are welcome =)
1212501068 in hex is 0x48454c4c.
0x48 is the ASCII code for H.
0x45 is the ASCII code for E.
0x4c is the ASCII code for L.
0x4c is the ASCII code for L.
Note that this behaviour is implementation-defined and therefore not portable. A good compiler would issue a warning:
$ gcc test.c
test.c: In function 'main':
test.c:4:11: warning: multi-character character constant [-Wmultichar]
In C, single quotes are used to denote characters, which are represented in memory by numbers. When you place multiple characters in single quotes, the compiler combines them into a single value however it wants, as long as it documents the process.
Looking at your number, 1212501068 is 0x48454C4C. If you decompose this number into bytes, you get 48 or 'H', 45 or 'E', and twice 4C or 'L'.
The output 1212501068 in hex is 0x48 0x45 0x4C 0x4C.
Look it up in an ASCII table, and you'll see those are the codes for HELL.
BTW: the value of a multi-character constant in single quotes is not fixed by the standard.
The exact interpretation of single quotes around multiple characters is implementation-defined, but it very commonly comes out as either a big-endian or a little-endian integer. (Technically, the implementation could interpret it any way it chooses, including a random value.)
In other words, depending on the platform, I would not be surprised to see it come out as 0x4C 0x4C 0x45 0x48, or 1280066888.
And over on this question, and also on this site you can see practical uses of this behavior.
Others have explained what happened. As for why it is allowed, I quote from the C99 draft standard (N1256):
6.4.4.4 Character constants
[...]
An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined. If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.
The relevant sentence is the one saying that the value of a constant containing more than one character is implementation-defined.
The line
int x = 'HELL';
stores the character codes of 'HELL' in x as one integer, which with this compiler is 0x48454c4c == 1212501068.
The value is just 'HELL' interpreted as an int (usually 4 bytes).
If you try this:
#include <stdio.h>

int main(void)
{
    union {
        int x;
        char c[4];
    } u;
    int i;

    u.x = 'HELL';
    printf("%d\n", u.x);
    for (i = 0; i < 4; i++) {
        printf("'%c' %x\n", u.c[i], u.c[i]);
    }
    return 0;
}
On a little-endian machine, where the least significant byte is stored first, you'll get:
1212501068
'L' 4c
'L' 4c
'E' 45
'H' 48