18

I read about this offsetof macro on the Internet, but it doesn't explain what it is used for.

#define offsetof(a,b) ((int)(&(((a*)(0))->b)))

What is it trying to do and what is the advantage of using it?

Community
Samuel Liew
  • 2
    That `offsetof` macro is incorrect. They should cast to `size_t`, not `int`, and they should probably subtract `(char*)0` from the result before casting even though it's a null pointer constant. – Chris Lutz Oct 26 '11 at 02:27

4 Answers

46

R.. is correct in his answer to the second part of your question: this code is not advised when using a modern C compiler.

But to answer the first part of your question, what this is actually doing is:

(
  (int)(         // 4.
    &( (         // 3.
      (a*)(0)    // 1.
     )->b )      // 2.
  )
)

Working from the inside out, this is ...

  1. Casting the value zero to the struct pointer type a*
  2. Getting the struct field b of this (illegally placed) struct object
  3. Getting the address of this b field
  4. Casting the address to an int

Conceptually this places a struct object at memory address zero and then finds out what the address of a particular field is. Because the struct starts at address zero, the field's address is numerically equal to its offset from the start of the struct. This lets you figure out the offset in memory of each field in a struct, so you could write your own serializers and deserializers to convert structs to and from byte arrays.
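As a rough illustration of the idea above, the same offset can be measured on a real object instead of an imaginary one at address zero: subtract the struct's address from the member's address. The sketch below is a minimal example, assuming a hypothetical `struct point3` and helper names that are not part of the original answer; it compares the measured value with the standard `offsetof` from `stddef.h`.

```c
#include <stddef.h>  /* offsetof, size_t */

/* Hypothetical example type; any struct would do. */
struct point3 {
    char   tag;
    double x;
    int    id;
};

/* Offset of `x` measured on a real object: the distance in bytes
 * between the member's address and the start of the struct. */
size_t measured_offset_x(void)
{
    struct point3 p;
    return (size_t)((char *)&p.x - (char *)&p);
}

/* Offset of `x` as reported by the standard macro. */
size_t standard_offset_x(void)
{
    return offsetof(struct point3, x);
}
```

Calling both functions from `main` should give the same number; the exact value depends on the platform's alignment rules.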

Of course, if you actually dereferenced a null pointer at runtime your program would typically crash. In practice, though, compilers usually fold this whole expression into a constant at compile time, so no memory at address zero is read at runtime.

On many 32-bit systems an int was 32 bits wide, the same size as a pointer, so the cast to int lost nothing and this worked in practice. That assumption fails wherever pointers are wider than int, as on typical 64-bit platforms.

Eamonn O'Brien-Strain
  • 4
    Excellent! Thank you. The key to me was `placing a struct object at memory address zero and then finding out at what the address of a particular field is`. – Timur Fayzrakhmanov Oct 17 '18 at 10:43
18

It has no advantages and should not be used, since it invokes undefined behavior (and uses the wrong type - int instead of size_t).

The C standard defines an offsetof macro in stddef.h which actually works, for cases where you need the offset of an element in a structure, such as:

#include <stddef.h>

/* Type tags used by the descriptor table. */
enum { INT, CHARPTR };

struct foo {
    int a;
    int b;
    char *c;
};

/* Describes one member: its name, its type tag, and its byte offset. */
struct struct_desc {
    const char *name;
    int type;
    size_t off;
};

static const struct struct_desc foo_desc[] = {
    { "a", INT, offsetof(struct foo, a) },
    { "b", INT, offsetof(struct foo, b) },
    { "c", CHARPTR, offsetof(struct foo, c) },
};

which would let you programmatically fill the fields of a struct foo by name, e.g. when reading a JSON file.
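To make the "fill the fields by name" idea concrete, here is a minimal sketch of a setter built on such a table. The `enum` values and the function `foo_set_int` are assumptions introduced for illustration and are not part of the answer above; the sketch only handles `int` members.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <string.h>  /* strcmp */

enum { INT, CHARPTR };  /* assumed type tags */

struct foo {
    int a;
    int b;
    char *c;
};

struct struct_desc {
    const char *name;
    int type;
    size_t off;
};

static const struct struct_desc foo_desc[] = {
    { "a", INT, offsetof(struct foo, a) },
    { "b", INT, offsetof(struct foo, b) },
    { "c", CHARPTR, offsetof(struct foo, c) },
};

/* Hypothetical helper: store `value` in the int member called `name`.
 * Returns 0 on success, -1 if no int member has that name. */
int foo_set_int(struct foo *obj, const char *name, int value)
{
    size_t n = sizeof foo_desc / sizeof foo_desc[0];
    for (size_t i = 0; i < n; i++) {
        if (foo_desc[i].type == INT && strcmp(foo_desc[i].name, name) == 0) {
            /* Step `off` bytes into the object to reach the member. */
            *(int *)((char *)obj + foo_desc[i].off) = value;
            return 0;
        }
    }
    return -1;
}
```

A JSON reader could call `foo_set_int(&f, key, number)` for each key it encounters, with a similar helper for the `CHARPTR` case.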

Nisse Engström
R.. GitHub STOP HELPING ICE
  • I am sorry - how does the offsetof macro cause undefined behavior, especially since it was defined in the C standard? – Adrian Cornish Oct 26 '11 at 02:21
  • 5
    The standard `offsetof` macro from `stddef.h` does not invoke UB. Defining your own hack to compute offsets this way does invoke UB. – R.. GitHub STOP HELPING ICE Oct 26 '11 at 02:24
  • Please quote me the standard reference that says defining your own version of the macro causes undefined behaviour – Adrian Cornish Oct 26 '11 at 02:27
  • 2
    @Adrian: He didn't say, "defining your own version of the macro causes undefined behaviour." He specifically said, "Defining your own hack to compute offsets **this way** does invoke UB." In the code, at this point: `((a*)(0))->` you've invoked undefined behavior by dereferencing null. – GManNickG Oct 26 '11 at 02:29
  • 1
    @GMan - where the hell have you referenced null? It's casting null to an `a*` pointer. And is "Defining your own hack" a technical term for some code that I don't know after 20 years of C programming? Let's see how Linux defines it: `#ifndef offsetof` / `# define offsetof(T,F) ((unsigned int)((char *)&((T *)0L)->F - (char *)0L))` / `#endif`. Hmm, looks very similar to the OP's. – Adrian Cornish Oct 26 '11 at 02:36
  • @Adrian: `x->` is defined to be `(*x).`. In our case `x` is `(a*)0`, and `*x` dereferences null. And congrats: after twenty years of C you still don't know what implementation-defined behavior is. Quoting a specific definition on a specific implementation at a specific time has nothing to do with the language definition of the macro. The language states the *effects* of the macro and that's it, it doesn't define an implementation. I mean hell, you quoted the standard yourself; where in there does it state the definition of the macro? – GManNickG Oct 26 '11 at 02:38
  • 6
    @AdrianCornish: **The implementation** is allowed to define `offsetof` however it likes as long as it implements the correct behavior. **Your application** does not have this privilege because it can't define the behavior of anything; it can only use already-defined language constructs. That's how C works. – R.. GitHub STOP HELPING ICE Oct 26 '11 at 02:42
  • 5
    6.5.2.3 does not use the word "dereference", but specifies it as "the named member of the object to which the first expression points". Since `(a*)(0)` does not point to an object of type `a`, the behavior is undefined (by virtue of not being defined). – R.. GitHub STOP HELPING ICE Oct 26 '11 at 02:49
  • [I love language standard debates] Which subclause (are you using c99) I cannot find one. I would argue: 6.3.2.3 Pointers 1 A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer. – Adrian Cornish Oct 26 '11 at 02:54
  • 6
    The text you cited is irrelevant. No pointer to incomplete or object type is converted to a pointer to void in the bogus macro. – R.. GitHub STOP HELPING ICE Oct 26 '11 at 02:57
  • 1
    @R.. what is the type of `foo_desc` here? Did you mean `food_desc[]` instead? – ajay Mar 15 '14 at 10:03
  • I added the `include` directive because it just felt *right* after [this edit suggestion](http://stackoverflow.com/review/suggested-edits/14623034) got rejected. – Nisse Engström Dec 17 '16 at 10:17
6

It's finding the byte offset of a particular member of a struct. For example, if you had the following structure:

struct MyStruct
{
    double d;
    int i;
    void *p;
};

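Then, on a typical 32-bit platform where double is 8 bytes and int and pointers are 4 bytes each, you'd have offsetof(struct MyStruct, d) == 0, offsetof(struct MyStruct, i) == 8, and offsetof(struct MyStruct, p) == 12 (that is, the member named d is 0 bytes from the start of the structure, etc.). The exact values depend on the platform's type sizes and alignment rules.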
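Since the numbers depend on the platform, a quick way to check them is to print them. The following is a small sketch using the standard `offsetof` from `stddef.h`; the function name `print_mystruct_layout` is made up for this example. On a typical 64-bit platform the third offset is usually 16 rather than 12, because pointers there are 8 bytes wide and 8-byte aligned.

```c
#include <stddef.h>  /* offsetof */
#include <stdio.h>   /* printf */

struct MyStruct
{
    double d;
    int i;
    void *p;
};

/* Print each member's byte offset and the total size of the struct
 * for whatever platform this is compiled on. */
void print_mystruct_layout(void)
{
    printf("d      at offset %zu\n", offsetof(struct MyStruct, d));
    printf("i      at offset %zu\n", offsetof(struct MyStruct, i));
    printf("p      at offset %zu\n", offsetof(struct MyStruct, p));
    printf("sizeof(struct MyStruct) = %zu\n", sizeof(struct MyStruct));
}
```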

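The way that it works is it pretends that an instance of your structure exists at address 0 (the ((a*)(0)) part), and then it takes the address of the intended structure member and casts it to an integer. Although dereferencing an object at address 0 would ordinarily be an error, in practice compilers compute the member's address without ever reading memory. Note that the C standard only guarantees that & cancels out a directly applied * or [] operator, not the member access ->, so the hand-written form is not strictly guaranteed to work; the offsetof macro from stddef.h is the portable way to get the same result.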

It's typically used for generalized serialization frameworks. If you have code for converting between some kind of wire data (e.g. bytes in a file or from the network) and in-memory data structures, it's often convenient to create a mapping from member name to member offset, so that you can serialize or deserialize values in a generic manner.
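A minimal sketch of that kind of generic code is shown below, assuming a hypothetical `struct sample` and helper names `pack_ints` and `unpack_ints` invented for this example. It copies the `int` members listed in an offset table to and from a byte buffer in native byte order; a real wire format would also need to fix the byte order and integer width.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <string.h>  /* memcpy */

/* Hypothetical in-memory structure. */
struct sample {
    int    id;
    double weight;  /* not serialized in this sketch */
    int    count;
};

/* Offsets of the int members to serialize, in wire order. */
static const size_t sample_int_offsets[] = {
    offsetof(struct sample, id),
    offsetof(struct sample, count),
};
#define SAMPLE_INT_FIELDS \
    (sizeof sample_int_offsets / sizeof sample_int_offsets[0])

/* Copy n int members of *obj into out; returns the bytes written. */
size_t pack_ints(const void *obj, const size_t *offs, size_t n,
                 unsigned char *out)
{
    for (size_t k = 0; k < n; k++)
        memcpy(out + k * sizeof(int), (const char *)obj + offs[k],
               sizeof(int));
    return n * sizeof(int);
}

/* Copy n ints from in back into the members of *obj. */
void unpack_ints(void *obj, const size_t *offs, size_t n,
                 const unsigned char *in)
{
    for (size_t k = 0; k < n; k++)
        memcpy((char *)obj + offs[k], in + k * sizeof(int), sizeof(int));
}
```

Because the functions only see offsets, the same two routines would work for any struct that supplies its own offset table.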

Adam Rosenfield
-3

The implementation of the offsetof macro is really irrelevant.

The actual C standard defines it as in 7.17.3:

offsetof(type, member-designator)

which expands to an integer constant expression that has type size_t, the value of which is the offset in bytes, to the structure member (designated by member-designator), from the beginning of its structure (designated by type). The type and member designator shall be such that given static type t; then the expression &(t.member-designator) evaluates to an address constant. (If the specified member is a bit-field, the behavior is undefined.)

Trust Adam Rosenfield's answer.

R is completely wrong, and it has many uses - especially being able to tell when code is non-portable among platforms.

(OK, it's C++, but we use it in static template compile time assertions to make sure our data structures do not change size between platforms/versions.)
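For comparison, a rough C equivalent of such compile-time layout checks can be written with C11's `_Static_assert` and the standard `offsetof`. The `struct wire_header` below and the expected offsets are assumptions chosen for illustration (they hold on common platforms, but that is exactly what the assertions are there to verify); this is a sketch, not the code referred to above.

```c
#include <stddef.h>  /* offsetof */
#include <stdint.h>  /* fixed-width integer types */

/* Hypothetical structure whose layout must stay stable. */
struct wire_header {
    uint32_t magic;
    uint16_t version;
    uint16_t flags;
    uint64_t length;
};

/* Compilation fails if the layout differs from what is expected. */
_Static_assert(offsetof(struct wire_header, magic) == 0,
               "magic must be the first member");
_Static_assert(offsetof(struct wire_header, version) == 4,
               "version moved");
_Static_assert(offsetof(struct wire_header, flags) == 6,
               "flags moved");
_Static_assert(offsetof(struct wire_header, length) == 8,
               "length moved");
_Static_assert(sizeof(struct wire_header) == 16,
               "wire_header changed size");
```

If a platform or a later edit changes the layout, the build breaks at the assertion instead of failing silently at runtime.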

Peter Mortensen
Adrian Cornish