I'm running this bit of code to understand pointers a little better.

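#include <stdio.h>

void foo(void)
{
    int a[4] = {0, 1, 2, 3};

    printf("a[0]:%d, a[1]:%d, a[2]:%d, a[3]:%d\n", a[0], a[1], a[2], a[3]);

    int *c;

    c = a + 1;                     /* c points to a[1] */
    c = (int *)((char*) c + 1);    /* move c forward by one byte */
    *c = 10;                       /* write through the shifted pointer */

    /* %p expects a void *, so cast the pointers before printing them */
    printf("c:%p, c+1:%p\n", (void *)c, (void *)(c+1));
    printf("a:%p, a1:%p, a2:%p, a3:%p\n", (void *)a, (void *)(a+1), (void *)(a+2), (void *)(a+3));

    printf("a[0]:%d, a[1]:%d, a[2]:%d, a[3]:%d\n", a[0], a[1], a[2], a[3]);

    printf("c[0]:%d, c[1]:%d\n", *c, *(c+1));

}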

The output I get is:

a[0]:0, a[1]:1, a[2]:2, a[3]:3
c:0xbfca1515, c+1:0xbfca1519
a:0xbfca1510, a1:0xbfca1514, a2:0xbfca1518, a3:0xbfca151c
a[0]:0, a[1]:2561, a[2]:0, a[3]:3
c[0]:10, c[1]:50331648

Could someone please explain how a[1] is now 2561?

I understand that when we do this:

c = (int *) ((char *) c + 1);

c is now pointing to the 4 bytes following the first byte of a[1].

But how did a[1] end up with 2561?

I'm guessing this has to do with endianness?

Jean-François Fabre

1 Answer

c = a + 1;

Now c points to a[1], the second element of a (value 1).

c = (int *)((char*) c + 1);

You "cheated" with pointer arithmetic by adding 1 to the byte address, regardless of the size of int. Note that the resulting misaligned access is illegal on old machines like the 68000, which don't tolerate multi-byte accesses at odd addresses. Other machines do the job, albeit a lot slower, which is arguably worse because you don't notice it; for instance, it works on a 68020, only slower.
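To see whether a given address would be a misaligned int access, one option is to compare it against the alignment the compiler reports for int. This is only a sketch: `is_aligned_for_int` is a hypothetical helper name, `_Alignof` requires C11, and the mapping from a pointer to `uintptr_t` is implementation-defined, although it is the plain numeric address on common platforms.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is suitably aligned for an int,
   0 otherwise. Assumes the pointer-to-integer conversion yields the
   plain address, which holds on common platforms. */
int is_aligned_for_int(const void *p)
{
    return (uintptr_t)p % _Alignof(int) == 0;
}
```

With the array from the question, `is_aligned_for_int(a + 1)` is expected to return 1, while `is_aligned_for_int((char *)(a + 1) + 1)` should return 0 on any platform where int needs more than 1-byte alignment.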

Now c points to the last 3 bytes of a[1] and extends into the first byte of a[2], so when you do:

*c = 10;

Since your machine is little-endian, the lowest-addressed byte of a[1] (holding the value 1) is left alone, 10 is written into the next byte, and zeroes are written after that, clobbering the first byte of a[2], which held the value 2.

So now:

 a[1] = 1 + (10<<8) = 2561
 a[2] = 0
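The arithmetic above can be replayed without any misaligned access by treating the array as 16 raw bytes. The following is a sketch under stated assumptions: a 32-bit int, a little-endian layout that the code simulates explicitly (so it behaves the same on any host), and hypothetical helper names `store_le32`, `load_le32` and `simulate_le`.

```c
#include <stdint.h>

/* Store a 32-bit value at p, least significant byte first. */
static void store_le32(unsigned char *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

/* Read back a 32-bit little-endian value starting at p. */
static uint32_t load_le32(const unsigned char *p)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v |= (uint32_t)p[i] << (8 * i);
    return v;
}

/* Replays the question's steps on a byte buffer standing in for a[4]
   and writes the resulting four element values to out. */
void simulate_le(uint32_t out[4])
{
    unsigned char mem[16];

    for (uint32_t i = 0; i < 4; i++)
        store_le32(mem + 4 * i, i);   /* a = {0, 1, 2, 3} */

    store_le32(mem + 5, 10);          /* *c = 10, one byte into a[1] */

    for (int i = 0; i < 4; i++)
        out[i] = load_le32(mem + 4 * i);
}
```

After the write, bytes 4 to 7 hold 01 0A 00 00, which reads back as 1 + (10<<8), and byte 8, the first byte of a[2], holds 00.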

The result is different on a big-endian machine:

PowerPC, big-endian (assuming a 32-bit int; otherwise the result differs):

a[1] = 0           // its last byte, which held the 1, is overwritten with zero
a[2] = 167772162   // 0x0A000002: its first byte is overwritten with 10
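The big-endian case can be checked without access to such a machine by laying the bytes out by hand. This is a sketch, assuming a 32-bit int and a CPU that tolerates the misaligned store; `store_be32`, `load_be32` and `simulate_be` are hypothetical helper names, and the byte order is simulated rather than taken from the host.

```c
#include <stdint.h>

/* Store a 32-bit value at p, most significant byte first. */
static void store_be32(unsigned char *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char)(v >> (8 * (3 - i)));
}

/* Read back a 32-bit big-endian value starting at p. */
static uint32_t load_be32(const unsigned char *p)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Replays the question's steps on a byte buffer standing in for a[4]
   and writes the resulting four element values to out. */
void simulate_be(uint32_t out[4])
{
    unsigned char mem[16];

    for (uint32_t i = 0; i < 4; i++)
        store_be32(mem + 4 * i, i);   /* a = {0, 1, 2, 3} */

    store_be32(mem + 5, 10);          /* *c = 10, one byte into a[1] */

    for (int i = 0; i < 4; i++)
        out[i] = load_be32(mem + 4 * i);
}
```

Here the three zero bytes of the 10 land on bytes 5 to 7, the tail of a[1], and the 0A byte lands on byte 8, the most significant byte of a[2].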

68000/68010:

bus error (coredump) / guru meditation

To sum it up: don't access an int through a misaligned pointer, and don't violate the strict aliasing rule.
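If the goal really is to read or write an int at an arbitrary byte offset, a common well-defined alternative is to copy the bytes with memcpy instead of dereferencing a cast pointer. A minimal sketch, with hypothetical helper names `store_int_unaligned` and `load_int_unaligned`:

```c
#include <string.h>

/* Write the bytes of value to dst, which may have any alignment. */
void store_int_unaligned(void *dst, int value)
{
    memcpy(dst, &value, sizeof value);
}

/* Read an int from src, which may have any alignment. */
int load_int_unaligned(const void *src)
{
    int value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

Calling `store_int_unaligned((char *)(a + 1) + 1, 10)` changes the same bytes as the question's `*c = 10`, so a[1] and a[2] still end up with endianness-dependent values, but no misaligned int access takes place.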

Jean-François Fabre
  • And many RISC machines don't like misaligned access. DEC Alpha converted a misaligned access into a system trap that either aborted the program or processed the read in several parts and mangled the data to assemble the answer. That was not fast; a misaligned memory access was something to avoid at all costs. – Jonathan Leffler Feb 13 '18 at 20:48
  • Yes, some machines tolerate misaligned access, but it's more costly, especially if it's emulated in software! (sorry for the 680x0 love BTW :)) – Jean-François Fabre Feb 13 '18 at 20:49
  • I still have a soft spot for the 680x0 chip set. PPC is an heir and successor via the RS6000 chips, I believe. The DEC Alpha natively objected to the misaligned access; it was a software 'fix' that allowed it to continue. Personally, I think that the crash was appropriate. – Jonathan Leffler Feb 13 '18 at 20:54
  • They "fixed" the misaligned access on the 68020 (the 68000/68010 crash on odd accesses), but that means that code written for the 68020 using only 68000 instructions may only run on ... 68020... how sick is that :) – Jean-François Fabre Feb 13 '18 at 20:55