Not understanding the behavior of compiler

Question

I want to understand why the code below actually works and does not give a segfault. One of my colleagues showed me this and I was just surprised.

Can someone explain this and point me to some good links to fill the gaps in my understanding?

#include <iostream>

struct Test {
    int __in;
    int __in1;
};

int main()
{
    struct Test* t = NULL;
    int i = &(t->__in1) + 4;
    std::cout << i << std::endl;
}

arun@arun-desktop:~/Code$ g++ -fpermissive -g test8.cc 
test8.cc: In function ‘int main()’:
test8.cc:11:24: warning: invalid conversion from ‘int*’ to ‘int’ [-fpermissive]
arun@arun-desktop:~/Code$ ./a.out   
20
arun@arun-desktop:~/Code$

score 8 · Answer 1 · edited May 23 '17 at 10:31

You'll only get a segmentation fault if you attempt to access invalid memory. Your code just performs pointer arithmetic, adjusting a pointer to Test to get a pointer to one of its members, and doesn't read or write to the pointer's target.

It's still undefined behaviour. Don't do this at home, kids.

(Also, don't use reserved names like __in1. And don't use -fpermissive to allow nonsensical conversions like this: the type system is there to help you.)

score 2 · Answer 2 · answered Mar 03 '14 at 15:32

#include <stddef.h> /* for NULL */

struct Test {
    int __in;
    int __in1;
};

unsigned int fun ( void )
{
    struct Test* t=NULL;
    unsigned int i = (unsigned int)(&(t->__in1)) + 4;
    return(i);
}

unsigned int fun2 ( void )
{
    struct Test t;
    unsigned int i = (unsigned int)(&(t.__in1)) + 4;
    return(i);
}

I modified your code a little, in part to get rid of the warning/error. In the first case you have a pointer with no memory behind it, so there are no elements to access: you have pointed it at NULL, so the math is NULL (zero) plus the member offset of 4, plus 4. You would need to point it at something other than NULL, but that alone won't fix the problem.

In the second case there is memory behind it: an actual structure allocated by the compiler on the stack. So I get this:

00000000 <fun>:
   0:   e3a00008    mov r0, #8
   4:   e12fff1e    bx  lr

00000008 <fun2>:
   8:   e24dd008    sub sp, sp, #8
   c:   e28d0008    add r0, sp, #8
  10:   e28dd008    add sp, sp, #8
  14:   e12fff1e    bx  lr

So there is a little hope of code like that giving you an address you could use.

When you build your program with the appropriate type casts and command-line options to get past the errors, it will output an 8 as well.

I don't see anything wrong with doing address math like this. Not for this struct, perhaps, but I can see use cases for such a thing. You should be able to get the address of an item in a struct and do address math with that address; there is nothing illegal about that kind of math.

Here try this for example (just took a little googling and finding another stackoverflow question):

//g++ -std=c++11 ptr.c -o ptr
#include <iostream>
#include <cstdint>
struct Test {
    int __in;
    int __in1;
};

int main()
{
    struct Test t;
    intptr_t i = (intptr_t)(&(t.__in1))-(intptr_t)(&t) + 4;
    std::cout << i << std::endl;
}

The result comes out as 8 on my machine. Understand that there is no reason why it should be the same on your machine: you should never, ever rely on how the compiler lays out structs or on their sizes.
