
I came across a piece of code where NULL is cast to a structure pointer type, `(foo *)0`. That pointer is then used to access a member, `((foo *)0)->m`, the address of that member is taken, `&(((foo *)0)->m)`, and the result is cast to an integer to get the offset of the member within the structure: `((unsigned int)(&(((foo *)0)->m)))`.

To my knowledge, dereferencing a NULL pointer should always result in a segmentation fault in C. I don't understand how a NULL pointer can be dereferenced like this without causing one.

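```c
#include <stdio.h>

/* Computes the byte offset of member m by taking its address relative to address 0. */
#define MACRO(m) ((unsigned int)(&(((foo *)0)->m)))

typedef struct
{
    int a;
    int b;
    int c;
    int d;
} foo;

int main(void) {
    /* %u matches the unsigned int produced by MACRO. */
    printf("\n  %u  ", MACRO(c));
    return 0;
}
```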
nwellnhof
Adi
  • This snippet generates output 8. – Adi Sep 09 '16 at 10:30
  • Dereferencing a NULL pointer is undefined behavior, undefined behavior is undefined. – πάντα ῥεῖ Sep 09 '16 at 10:31
  • This is basically how people implemented [offsetof](http://en.cppreference.com/w/cpp/types/offsetof) by hand. It uses undefined behavior, but most compilers just swallowed it. – PeterT Sep 09 '16 at 10:33
  • @Adi the value is never read, only the pointer to the value is ever read. – PeterT Sep 09 '16 at 10:35
  • @πάνταῥεῖ: As the macro is provided by the compiler, it can safely be assumed they know what they do (yes, I'm an optimist ;-) and that it should work. But it should not be written in user code. Actually, the `0` is used as address `0x0`, not a null pointer. It is one of the nastiest aspects of C that it cannot distinguish between them in source code. – too honest for this site Sep 09 '16 at 10:38
  • `(((foo *)0)->m)` represents an `int` at a non-accessible address. But this int is not read, but its address is taken. No 'bad' pointer is dereferenced. That's why it 'works', but it is UB. – alain Sep 09 '16 at 10:38
  • Thanks Alain. What is UB? – Adi Sep 09 '16 at 10:41
  • @alain: Strictly speaking, it is indeed dereferenced. – too honest for this site Sep 09 '16 at 10:41
  • I do not think this is a duplicate. That other question was about what undefined behavior means. This one is about a specific case, and how the compiler can handle the code. Saying "undefined behavior is undefined" simply does not resolve the misunderstanding this poster has, and neither does the "duplicate". – Shachar Shemesh Sep 09 '16 at 10:41
  • @Adi UB is short for 'undefined behavior' – alain Sep 09 '16 at 10:45
  • @Olaf yes, in a language-lawyer sense it is dereferenced I guess. – alain Sep 09 '16 at 10:47
  • Note that things like `((foo *)0)->m` could be used as a hack to get the size of a struct member. `sizeof(((foo *)0)->m)` is actually well-defined, since the expression isn't evaluated for side effects (with the exception of VLAs). Maybe that's how it is used in the original code? – Lundin Sep 09 '16 at 12:47
  • Also please note that the macro such as it is used in this example is nonsense. To find a struct member offset, use available standard C features instead, namely the `offsetof` macro in stddef.h. Usage: `printf("\n %zu ", offsetof(foo, c));` – Lundin Sep 09 '16 at 12:53

1 Answer


The C11 standard says in 6.5.3.2 Address and indirection operators:

The unary & operator yields the address of its operand. If the operand has type ''type'', the result has type ''pointer to type''. If the operand is the result of a unary * operator, neither that operator nor the & operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue. Similarly, if the operand is the result of a [] operator, neither the & operator nor the unary * that is implied by the [] is evaluated and the result is as if the & operator were removed and the [] operator were changed to a + operator. Otherwise, the result is a pointer to the object or function designated by its operand.

(Emphasis mine.) And the footnote says:

Thus, &*E is equivalent to E (even if E is a null pointer)

Interestingly, there's no exception for the `->` operator. I don't know whether this was done deliberately or whether it is an oversight. So in a strict interpretation of the standard, I'd say that `&(((foo *)0)->m)` is undefined behavior. This doesn't mean that a program has to crash or that the compiler has to complain, though.

That said, it's completely reasonable to make the same exception when taking the address of the result of a `->` operator, and that's what most compilers do.

nwellnhof
  • `p->m` is equivalent to `(*p).m`, but in the special exception the standard talks about, the operand of the `&`-operator is the result of the `*`-operator, whereas in `&(*p).m`, the operand of the `&` is the result of the `.`-operator. So `p->m` is undefined behavior for `p == NULL`. – EOF Sep 09 '16 at 12:02