
I am trying to understand the following behavior when using memcpy():

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main()
{
    uint8_t arr[4] = {0x11, 0x22, 0x33, 0x44};
    
    uint32_t cpy;
    memcpy(&cpy, arr, 4);
    printf("%x\n", cpy);
    uint8_t *temp = (uint8_t *)&cpy;
    printf("&cpy[0]=%p, val=%x\n", temp, *temp);
    printf("&cpy[1]=%p, val=%x\n", temp+1, *(temp+1));
    printf("&cpy[2]=%p, val=%x\n", temp+2, *(temp+2));
    printf("&cpy[3]=%p, val=%x\n", temp+3, *(temp+3));
    
    uint8_t cpy2[4];
    memcpy(cpy2, arr, 4);
    for(int i=0; i<4; i++)
    {
        printf("%x ", cpy2[i]);   
    }
    printf("\n");
    printf("&cpy2[0]=%p\n", &cpy2[0]);
    printf("&cpy2[1]=%p\n", &cpy2[1]);
    printf("&cpy2[2]=%p\n", &cpy2[2]);
    printf("&cpy2[3]=%p\n", &cpy2[3]);
    
    return 0;
}

This is the output I got:

44332211
&cpy[0]=0x7ffdc763c020, val=11
&cpy[1]=0x7ffdc763c021, val=22
&cpy[2]=0x7ffdc763c022, val=33
&cpy[3]=0x7ffdc763c023, val=44
11 22 33 44
&cpy2[0]=0x7ffdc763c034
&cpy2[1]=0x7ffdc763c035
&cpy2[2]=0x7ffdc763c036
&cpy2[3]=0x7ffdc763c037

Why is the output of the printf statement for the cpy variable reversed? Isn't the MSB (0x44) at a higher address than the LSB (0x11), so, shouldn't the output be 11223344?

  • On most machines, the LSB is at a lower address; this is called "little endian". But in the number `0x44332211`, `44` is the MSB or most-significant byte, and `11` is the LSB. See also the Wikipedia article on [Endianness](https://en.wikipedia.org/wiki/Endianness). – Steve Summit Sep 19 '22 at 10:22
  • Welcome to the wonderful world of [endianness](https://en.wikipedia.org/wiki/Endianness)! The story behind the term is funny, too. – the busybee Sep 19 '22 at 10:26
  • @SteveSummit Certainly _many_ are _little endian_. "most machines" today are little embedded ones - billions per year. I wonder if _most_ really applies. – chux - Reinstate Monica Sep 19 '22 at 10:26
  • Seems you are contradicting yourself here: `Isn't the MSB (0x44) at a higher address`... `so, shouldn't the output be 11223344`. If `0x44` is the MSB, then surely you should expect `44332211`, as you get. – 500 - Internal Server Error Sep 19 '22 at 10:26
  • @500-InternalServerError I see. I meant that the MSB in the output is 0x44 but it is at a higher address. I assumed that printf() would print from lower address to higher address and hence was expecting 0x44 to be the LSB. – Kaustubh Shankar Sep 19 '22 at 10:49
  • In this case `printf()` is printing the value of a `uint32_t` variable - the variable's internal representation is irrelevant in that context. Naturally, `printf()` must logically work the same on little endian and big endian systems. – 500 - Internal Server Error Sep 19 '22 at 10:51
  • Save yourself lots of typing. `typedef union { uint32_t i; uint8_t c[4]; } v_t;` You can define one of these, then `v_t x; x.i = intVal;` and then work with `x.c[0]`, etc. to examine each byte. – Fe2O3 Sep 19 '22 at 11:37

1 Answer

On a little-endian machine such as the one you're using, multi-byte numbers do indeed seem to be reversed. When we write a number like 0x44332211 in conventional left-to-right order, we write the most-significant byte (44) first, on the left, then proceed to the less-significant bytes: 33, 22, and finally the least-significant 11.

But on a little-endian machine, the least-significant byte is stored first. Stated another way, if we write

uint32_t num = 0x44332211;
uint8_t *p = (uint8_t *)&num;

the pointer p points to the byte 0x11, and pointers past p (p+1, p+2, and p+3) point at 22, 33, and 44.

But if we think of memory addresses as increasing from left to right, that means that if we write the bytes in memory order (or if we print them out as an array of unsigned char, or do the obvious pointer arithmetic) we see

11 22 33 44

which is, yes, backwards from 0x44332211.

Despite its ubiquity, this pattern is easy to forget and often surprising. It's just something you have to get used to.

Steve Summit