4

I'm writing a C library that uses some simple object-oriented inheritance much like this:

struct Base {
    int x;
};

struct Derived {
    struct Base base;
    int y;
};

And now I want to pass a Derived* to a function that takes a Base* much like this:

int getx(struct Base *arg) {
    return arg->x;
};

int main() {
    struct Derived d;
    return getx(&d);
};

This works, and is typesafe of course, but the compiler doesn't know this. Is there a way to tell the compiler that this is typesafe? I'm focusing just on GCC and clang here, so compiler-specific answers are welcome. I have vague memories of seeing some code that did this using __attribute__((inherits(Base))) or something of the sort, but my memory could be lying.

Shum
  • That may be a stupid question, but why are you not using C++ if you actually want C++? Instead of trying to hack C into doing something similar, that is. It's not like the very same compiler couldn't do both languages with a switch flip. – Damon Jan 17 '14 at 11:37
  • @Damon Maybe he doesn't have a C++ compiler available. And if he is working with embedded devices, changing the compiler might be far from trivial. – user694733 Jan 17 '14 at 11:40
  • But both GCC and Clang _are_ C/C++ compilers. They will do either thing, depending on what language standard you give on the commandline (or simply by the source file's name, if you don't tell the compiler anything else). – Damon Jan 17 '14 at 11:42
  • @Damon There still might be other reasons. Company rules or compatibility with other tools. – user694733 Jan 17 '14 at 11:44
  • @Damon: I'm adding to a larger, preexisting, C codebase. I'd prefer to keep it all C. – Shum Jan 17 '14 at 12:44

3 Answers

5

This is safe in C except that you should cast the argument to Base *. The rule that prohibits aliasing (or, more precisely, that excludes it from being supported in standard C) is in C 2011 6.5, where paragraph 7 states:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

— a type compatible with the effective type of the object,

This rule prevents us from taking a pointer to float, converting it to pointer to int, and dereferencing the pointer to int to access the float as an int. (More precisely, it does not prevent us from trying, but it makes the behavior undefined.)

It might seem that your code violates this, since it accesses a Derived object using a Base lvalue. However, converting a pointer to Derived to a pointer to Base is supported by C 2011 6.7.2.1, where paragraph 15 states:

… A pointer to a structure object, suitably converted, points to its initial member…

So, when we convert the pointer to Derived to a pointer to Base, what we actually have is not a pointer to the Derived object using a different type than it is (which is prohibited) but a pointer to the first member of the Derived object using its actual type, Base, which is perfectly fine.

About the edit: Originally I stated function arguments would be converted to the parameter types. However, C 2011 6.5.2.2 paragraph 2 requires that each argument have a type that may be assigned to an object with the type of its corresponding parameter (with any qualifications like const removed), and 6.5.16.1 requires that, when assigning one pointer to another, they have compatible types (or meet other conditions not applicable here). Thus, passing a pointer to Derived to a function that takes a pointer to Base violates standard C constraints. However, if you perform the conversion yourself, it is legal. If desired, the conversion could be built into a preprocessor macro that calls the function, so that the code still looks like a simple function call.
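
A minimal sketch of the macro approach described above, assuming the Base and Derived definitions from the question. The names GETX and derived_x are hypothetical, chosen only for illustration. Note that the cast accepts any pointer type, so the macro is only correct when its argument really points to a struct whose first member is a struct Base.

```c
struct Base {
    int x;
};

struct Derived {
    struct Base base;   /* must be the first member for the cast to be valid */
    int y;
};

int getx(struct Base *arg) {
    return arg->x;
}

/* Hypothetical wrapper: performs the explicit conversion so that call
   sites look like an ordinary function call. */
#define GETX(p) getx((struct Base *)(p))

/* Example call site: passes a struct Derived * without a visible cast. */
int derived_x(struct Derived *d) {
    return GETX(d);
}
```

The trade-off is that the macro hides the cast, so the compiler will no longer warn if a pointer to an unrelated struct is passed.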

Eric Postpischil
  • But "The *effective type* of an object ... is the declared type of the object, if any" and the declared type is `Derived`. Also GCC fails to compile this, because `Base*` and `Derived*` are not compatible pointer types. Same for Clang. – Fred Foo Jan 17 '14 at 13:31
  • And I interpret "suitably converted" as "converted to `char*`, `unsigned char*` or `void*`" -- but maybe I'm wrong. – Fred Foo Jan 17 '14 at 13:32
  • @larsmans: No, the declared type is not `Derived`, because the converted pointer is not a pointer to the `Derived` struct, it is a pointer to the first member, and the type of the first member is `Base`. The converted pointer has the type “pointer to `Base`”, and it points to a `Base` object. – Eric Postpischil Jan 17 '14 at 13:41
  • @larsmans: Why do you claim that GCC does not compile this? I took the source for the question, removed the improper semicolons following the function definitions, and compiled it with Apple GNU C 4.2.1 using `cc -Wmost -pedantic -O3 -g -std=c99`, and it compiled. There was a warning about the type of the argument to `getx` but no error. – Eric Postpischil Jan 17 '14 at 13:43
  • @larsmans: “Suitably converted” essentially means converted to a pointer to the type of the member. When the C standard says something “points to an *x*”, it means it is a pointer to the type of *x* and points to an *x* object. When it wants to say that a pointer to a character type points to the first **byte** of an object, it says that the pointer points to the byte, not to the object. – Eric Postpischil Jan 17 '14 at 13:46
  • @larsmans: Using `-Werror` alters the behavior of the compiler to prohibit some conforming programs; it is not a valid test of whether a program conforms to the C standard. – Eric Postpischil Jan 17 '14 at 13:46
  • Actually the `memcpy` spec speaks of "the object pointed to by `s2`" and similarly for `s1`, even though both are `void` pointers. My reading of the Standard on this seems to be the same as that of GCC hacker [Ian Lance Taylor](http://gcc.gnu.org/ml/gcc-help/2005-10/msg00053.html). – Fred Foo Jan 17 '14 at 13:59
  • @larsmans: Thinking about it, your interpretation of “suitably converted” is completely nonsensical. It asserts that if we convert a pointer to `Derived` to the type “pointer to `Base`”, this is not suitably converted to be a pointer to `Base`. Obviously, the most suitable type for a pointer to `Base` is “pointer to `Base`”. – Eric Postpischil Jan 17 '14 at 14:04
  • I've read up some more and I believe you're right. It would have been nice if the standard had an example of this. +1. – Fred Foo Jan 17 '14 at 14:30
1

Pass the address of the base member (the truly type-safe option):

getx(&d.base);

Or use void pointer:

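int getx(void *arg) {
    struct Base *temp = arg;   /* void * converts implicitly to any object pointer type */
    return temp->x;
}

int main(void) {
    struct Derived d = {0};    /* initialized so that x has a defined value */
    return getx(&d);
}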

It works because C requires that there is never any padding before the first struct member. This won't increase type safety, but it removes the need for casting.
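
The layout guarantee can also be written down as a compile-time check. This is a small sketch, assuming a C11 compiler (for _Static_assert) and the struct definitions from the question; getx_void is a hypothetical name for the void * variant above.

```c
#include <stddef.h>   /* offsetof */

struct Base { int x; };
struct Derived { struct Base base; int y; };

/* C guarantees no padding before the first member, so this always holds
   for the layout above. The assertion documents the assumption and fails
   the build if someone later moves base away from the first position. */
_Static_assert(offsetof(struct Derived, base) == 0,
               "base must be the first member of struct Derived");

/* void * variant: accepts any object pointer, so the compiler cannot
   catch an argument of the wrong type. */
int getx_void(void *arg) {
    struct Base *temp = arg;
    return temp->x;
}
```

Both getx_void(&d) and getx_void(&d.base) read the same x here, since the two pointers refer to the same address.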

user694733
1

As noted above by user694733, you are probably best off conforming to the standard and keeping type safety by passing the address of the base field, as in (repeating for future reference)

struct Base{
   int x;
};
struct Derived{
   int y;
   struct Base b; /* look mam, not the first field! */
};

struct Derived d = {0}, *pd = &d;
void getx (struct Base* b);

and now despite the base not being the first field you can still do

getx (&d.b); 

or if you are dealing with a pointer

getx(&pd->b);

This is a very common idiom. You have to be careful if the pointer is NULL, however, because in practice &pd->b is typically computed as

(struct Base*)((char*)pd + offsetof(struct Derived, b)) 

so &((struct Derived*)NULL)->b becomes

((struct Base*)offsetof(struct Derived, b)) != NULL.
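
A minimal sketch of one way to handle the NULL case, assuming the struct layout above: test the pointer before taking the member's address. The helper name base_of is hypothetical.

```c
#include <stddef.h>   /* NULL */

struct Base { int x; };
struct Derived {
    int y;
    struct Base b;   /* not the first member */
};

/* Returns NULL for a NULL input instead of computing an address
   from a null pointer. */
struct Base *base_of(struct Derived *pd) {
    return pd ? &pd->b : NULL;
}
```

Callers can then write getx(base_of(pd)) and check for NULL inside getx, or check the result of base_of before the call.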

IMO it is a missed opportunity that C has adopted anonymous structs but not the Plan 9 anonymous struct model, which is

struct Derived{
    int y;
    struct Base; /* look mam, no fieldname */
} d;

It allows you to just write getx(&d) and the compiler will adjust the Derived pointer to a base pointer i.e. it means exactly the same as getx(&d.b) in the example above. In other words it effectively gives you inheritance but with a very concrete memory layout model. In particular, if you insist on not embedding (== inheriting) the base struct at the top, you have to deal with NULL yourself. As you expect from inheritance it works recursively so for

struct TwiceDerived{
    struct Derived;
    int z;
} td;

you can still write getx(&td). Moreover, you may not need the getx as you can write d.x (or td.x or pd->x).

Finally using the typeof gcc extension you can write a little macro for downcasting (i.e. casting to a more derived struct)

#define TO(T,p) \
({ \
    typeof(p) nil = (T*)0; \
    (T*)((char*)(p) - ((char*)nil - (char*)0)); \
})

so you can do things like

struct Derived d = {0};
struct Base *pb = &d;   /* Plan 9 extension: pb points at the Base embedded in d */
struct Derived* pd = TO(struct Derived, pb);

which is useful if you try to do virtual functions with function pointers.
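
For comparison, a similar downcast can be sketched in standard C with offsetof, without typeof or the Plan 9 extensions. This is a different technique from the TO macro above, because the caller has to name the embedded member explicitly. The names DOWNCAST, get, and derived_get are hypothetical, and the function pointer in Base is an assumption added to illustrate the virtual-function use case.

```c
#include <stddef.h>   /* offsetof */

struct Base {
    int x;
    int (*get)(struct Base *self);   /* "virtual" function slot */
};

struct Derived {
    int y;
    struct Base b;   /* embedded base, not the first member */
};

/* Hypothetical macro: recover the enclosing T from a pointer to its
   member m. Only valid if p really points at the m member of a T. */
#define DOWNCAST(T, m, p) ((T *)((char *)(p) - offsetof(T, m)))

/* Override: receives a Base * but needs the Derived fields as well. */
int derived_get(struct Base *self) {
    struct Derived *d = DOWNCAST(struct Derived, b, self);
    return self->x + d->y;
}
```

A caller holding only a struct Base *pb invokes pb->get(pb), and the override recovers its own Derived object from that pointer.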

On gcc you can experiment with the Plan 9 extensions by compiling with -fplan9-extensions. Unfortunately, they do not seem to have been implemented in clang.