
I'm a university student. Our teacher just asked us what the output of this program is, and why:

#include <stdio.h>

int main(){
    int x = 1023;
    char *p = (char*)&x;

    printf("%d %d %d %d\n", p[0], p[1], p[2], p[3]);
}

The output is -1 3 0 0 but I don't know why. I've done some research and found out integers in C/C++ are stored as HEX and are put in 4 bytes of memory, for example, 1023 is stored as 00 00 03 FF. What I don't understand is why FF becomes -1, and why the order is reversed; I think it should be 0 0 3 -1. I also don't know what happens when you cast an int address to a char pointer (or char array?):

char *p = (char*)&x;
NathanOliver
  • You can read about [endianness](https://en.wikipedia.org/wiki/Endianness) and about [two's complement](https://en.wikipedia.org/wiki/Two%27s_complement). The first link describes the order of the bytes and the second link describes why `FF` is converted to `-1` – Thomas Sablik Apr 30 '20 at 13:02
  • Try this: `printf("%02X %02X %02X %02X\n", p[0] & 0xFF, p[1] & 0xFF, p[2] & 0xFF, p[3] & 0xFF);` – Eljay Apr 30 '20 at 13:07
  • `p[x]` is syntactical sugar for `*(p + x)`. It's equivalent. `p[0] == *p`, `p[1] == *(p + 1)`, ... – Thomas Sablik Apr 30 '20 at 13:07
  • The output depends on the machine that executes the program. `-1 3 0 0`, `255 3 0 0`, `0 0 3 255`, `0 0 3 -1` are all possible outputs. – Support Ukraine Apr 30 '20 at 13:16
  • A big-endian machine would store the 4 bytes of the `int` as 00 00 03 FF, but a little-endian machine stores the 4 bytes of the `int` as FF 03 00 00. The byte FF can be interpreted as a signed, 2's complement, 8-bit integer with value -1, so we can conclude that your machine's `char` type is a signed, 2's complement, 8-bit integer and that your machine uses little-endian byte order. – Ian Abbott Apr 30 '20 at 13:16
  • Does this answer your question? [Types of endianness](https://stackoverflow.com/questions/21449/types-of-endianness) – phuclv Feb 06 '21 at 05:48
  • [byte order when casting int to byte array](https://stackoverflow.com/q/27997319/995714), [why are the bytes in byte array reversed](https://stackoverflow.com/q/32251746/995714) – phuclv Feb 06 '21 at 05:51

3 Answers


There's lots of reliance on poorly-defined behavior in your example.

  • The output depends on CPU endianness, so you will get the raw values 00 00 03 FF on a big-endian machine, or FF 03 00 00 on a little-endian machine.

  • The signedness of char is implementation-defined, so you cannot portably tell if the raw value FF will result in a positive or negative number when stored in a char. Therefore, you should never use char for the purpose of displaying raw data. Use uint8_t instead.

  • And finally, if char does happen to be signed, FF will get converted into -1 on a 2's complement system, but in theory C allows other forms of signedness as well. (Formally, the program may also refuse the conversion and raise a signal if it deems the signed value out of bounds.)


What happens in your case is that you run this on a little-endian, 2's complement machine with a compiler that treats char as signed. The raw data is stored as FF 03 00 00 in little-endian order, and when interpreted as a signed char, FF comes out as -1 on 2's complement computers.

Arguments of type char passed to a variadic function such as printf are implicitly promoted to (signed) int, and the %d tells the function to treat them as int as well. When this happens, the negative number -1 gets silently "sign extended" from FF into FF FF FF FF (assuming a 32-bit int) to preserve the decimal value -1.

So you get -1 3 0 0 when you print the data as integers.

Lundin

I've done some research and found out integers in C/C++ are stored as HEX

Wrong. Hexadecimal is only a notation for writing values; the values themselves are stored in binary.

and are put in 4 bytes of memory, for example, 1023 is stored as 00 00 03 FF.

This is (partially) right.

Let's assume we have 32-bit integers. Then the value 1023, which is 512 + 256 + 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1, is represented as 0b00000000000000000000001111111111 in binary and, consequently, 0x000003FF in hex.

Now we have to choose: on a little-endian machine it is stored as FF 03 00 00; on a big-endian machine, as 00 00 03 FF. (Note that there are other possible ways to arrange a multi-byte value, but these are the most common ones.)

These bytes (represented as char in most common implementations of C) can be either signed or unsigned. In many implementations, plain char is signed unless it is explicitly written as unsigned char. In that case, a set highest bit denotes a negative number (if we limit ourselves to two's complement implementations), and the range 80 to FF is mapped to -128 to -1. Thus, FF is shown as -1.

glglgl
  • Hex 80 to hex FF is not decimal −127 to −1. (Also, there are other possibilities for byte order than just big endian and little endian, and C does not limit the `char` type to eight bits.) – Eric Postpischil Apr 30 '20 at 14:31
  • @EricPostpischil A byte with hex 80 is indeed represented as -127 and vice versa if we assume an 8 bit signed char with two's complement representation of negative values. As well, a byte with hex FF is represented as -1 and vice versa. So I don't see where I am wrong. About the rest of your comment: I added some clarifying words, but I don't see their relevance here as the OP obviously has an implementation with the assumed properties (LE, 2s complement etc.) – glglgl Apr 30 '20 at 14:37
  • Hex 80 in eight-bit two’s complement represents −128, not −127. – Eric Postpischil Apr 30 '20 at 14:38
  • @EricPostpischil Ouch! Thanks for that. There you are indeed right. – glglgl Apr 30 '20 at 14:39

Values are stored as bits. Hexadecimal format is one way of displaying the value stored in those bits, as is decimal, octal, and binary.

Assuming a 32-bit int type, the value 1023₁₀ is stored as the sequence of bits 00000000 00000000 00000011 11111111, the hexadecimal representation of which is 0x000003FF.

A value like this requires multiple bytes for storage. Many systems, such as x86, store multi-byte values such that the least significant byte comes first, known as "little-endian" order. Other systems store multi-byte values such that the most significant byte comes first, known as "big-endian" order. Assuming our integer 1023 starts at address p, its bytes will be addressed as shown below in each system:

   big-endian:  p[0] p[1] p[2] p[3]
               +----+----+----+----+
               | 00 | 00 | 03 | FF |
               +----+----+----+----+
little-endian:  p[3] p[2] p[1] p[0]

That's why on your system the output is -1 3 0 0 rather than 0 0 3 -1.

As for why FF displays as -1...

There are several different ways to represent signed integer values, but one thing they all have in common is that the leftmost bit is used to indicate sign. If the leftmost bit is 0, the value is non-negative. If the leftmost bit is 1, the value is negative (or negative zero). Assuming a 3-bit type, they work out like this:

Bits    Two's Complement    Ones' Complement    Sign-Magnitude    Unsigned
----    ----------------    ----------------    --------------    --------
 000                   0                   0                 0           0
 001                   1                   1                 1           1
 010                   2                   2                 2           2
 011                   3                   3                 3           3
 100                  -4                  -3                -0           4
 101                  -3                  -2                -1           5
 110                  -2                  -1                -2           6
 111                  -1                  -0                -3           7

x86 (along with the vast majority of other systems) uses two's complement to represent signed integer values, so an integer with all bits set is interpreted as -1.

When you use %d in the printf call, you're telling printf to treat the corresponding value as a signed int and to format it as a sequence of decimal digits. Hence, the byte containing FF is formatted as -1 on a two's complement system.[1]

Note that the value stored in x is 1023₁₀ (3FF₁₆) regardless of how the bytes are ordered or how signed integers are represented. If you print out the hex representation of the value of x using

printf( "%08X\n", x ); // format output as hexadecimal

it will be displayed as 000003FF, not FF030000.


  1. It's actually a little more complicated than that - the value in p[0] is first promoted from char to int, and in order to preserve the value -1, the promoted int has the bit pattern 0xFFFFFFFF (assuming a 32-bit int), which is what actually gets passed to printf.

John Bode
  • Re “Values are stored as bits. Hexadecimal format is one way of displaying…”: Values are stored as voltages, magnetic fields, or other physical phenomena. We arbitrarily label some of those as 0 and some as 1. It is no more unreasonable to label groups of four of them as 0-F and call them hexadecimal than it is to label individual ones 0-1 and call them binary. They are neither but are easily managed and conceptualized as either. We do not need to hammer on students so much on this. – Eric Postpischil Apr 30 '20 at 14:35