I have a small snippet:

#include <stdio.h>

int main(void)
{
        unsigned long long x = 0x0000000000008342;
        unsigned long long * x_p = &x;
        /* Reinterpret the first bytes of x as an unsigned int */
        unsigned int * y_p = (unsigned int *)x_p;
        unsigned int y = *y_p;
        printf("y = %#.8x\n", y);
}

On a little-endian machine, this is how the number is stored in memory:

0x00: 42 83 00 00
0x04: 00 00 00 00

Why does it print 0x00008342 instead of 0x00004283?

Curio
Milan
  • The only value stored in memory is the value that `a` is initialized with. The compiler will not store integer literals like `0x7788aabb`. And as you truncate the number of bits in the initialization, only `0x0000aabb` will be stored. – Some programmer dude Oct 19 '21 at 07:15
  • Also note that there's no endianness involved here; the cast will truncate to the 16 lowest bits no matter the endianness. And those 16 lowest bits will always be `0xaabb`. – Some programmer dude Oct 19 '21 at 07:17
  • The value of `a` could differ on machines with 16-bit int – M.M Oct 19 '21 at 07:18
  • By the way, technically you have *undefined behavior* in your code. The `%x` format expects an *unsigned* integer argument (i.e. `unsigned int`). – Some programmer dude Oct 19 '21 at 07:19
  • The notation you use for integers in C does not depend on the endianness of the hardware you use. Truncating a long to a short in C does not depend on the endianness of your hardware either. If you take the address of an integer and access the bytes it consists of, then it matters. – GoWiser Oct 19 '21 at 07:20
  • Why would you think the entire number will be stored given that it doesn't fit into a `uint16_t`? – David Schwartz Oct 19 '21 at 07:26
  • Did you try this on a machine with big endianness, too? Did it produce another result? – the busybee Oct 19 '21 at 07:30
  • Endianness is only related to the way data is stored in memory. Once data is read into the CPU there is no such thing as endianness. In other words - `printf` knows nothing about endianness. Your code will give the same result on both little and big endian systems. – Support Ukraine Oct 19 '21 at 07:32
  • @4386427, in the case of little endian, the least significant byte is stored first, before the most significant byte, i.e. bbaa8877. By fetching, I meant reading the first two bytes. – Milan Oct 19 '21 at 07:42
  • @4386427, as suggested above, I have updated the snippet to take the address of the integer value, but I still have doubts about how the print statement works here. – Milan Oct 19 '21 at 07:53
  • OK, it looks like it is printing only 42 as expected, but due to the format specifier, 42 is shifted to the last position. – Milan Oct 19 '21 at 07:57
  • If the value of `y` is `0x00000042` then that's what `printf` will print. The issue isn't really with `printf`, it's how you extract the data into the variables you then happen to pass to `printf`. – Some programmer dude Oct 19 '21 at 07:59
  • And now you also break strict aliasing through the `y_p` pointer. – Some programmer dude Oct 19 '21 at 07:59
  • I'm not sure what is confusing you. You seem to be well aware that during a write on a little-endian machine the least significant byte goes to the lowest address. Fine. So why are you surprised that during a read the least significant byte is read from the lowest address? I mean, it is a symmetric operation. – Support Ukraine Oct 19 '21 at 08:00
  • Sorry, I am kind of updating the snippet on the fly (as I am just trying to understand the printf statement). For instance, I was expecting it to print 0x00004283, but it prints 0x00008342. – Milan Oct 19 '21 at 08:11
  • @Milan Did you read my comment just above? The least significant byte is at the lowest address, so 42 is the least significant byte. Then comes 83, so the print will be 8342. – Support Ukraine Oct 19 '21 at 08:14

2 Answers

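Endianness applies to all integer types larger than one byte. I don't understand why you think endianness only applies to your 64-bit integer type but not to unsigned int. This seems to be the source of your confusion: the 32-bit read is also little endian, so the bytes 42 83 00 00 are reassembled with 42 as the least significant byte, giving 0x00008342.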

A major problem with your code is that unsigned int y = *y_p; is undefined behavior, because it reads an unsigned long long object through an unsigned int pointer (see "What is the strict aliasing rule?"). So your program can't be assumed to have deterministic behavior.

If you want to actually print the bytes as stored in memory, you need to do so byte by byte:

for(size_t i=0; i<sizeof x; i++)
  printf("%.2X ", ((unsigned char*)&x)[i] );

This conversion is fine since character types are a special exception to strict aliasing. It prints 42 83 00 00 00 00 00 00 as expected.

Also, you have a minor cosmetic issue here: 0x0000000000008342. Don't write leading zeroes like that thinking you've written a 64-bit integer constant. The type of this one is actually int (32-bit system) or unsigned int (8/16-bit system). The leading zeroes don't do a thing; you need to append a ull suffix. This isn't an issue in this particular snippet, but it could become one if you bring this style into real programs.

Lundin

printf("%x\n", unsigned_number); will print the value of unsigned_number in hexadecimal just like printf("%u\n", unsigned_number); will print it in decimal.

Neither is concerned with the internal representation of unsigned_number. printf("%x",...) doesn't do a hexdump (unless of course you call it in a loop on every char of your data).

Petr Skocik