Use a base struct to extract values from void*

Question

I have many types of structs in my project, and another struct that holds a pointer to one of these structs. Such as,

struct one{int num; /* holds 1 */};
struct two{int num; /* holds 2 */};
struct three{int num; /* holds 3 */};

// These structs hold many other values as well, but the first value is always `int num`.

And I have another struct that holds references to these structs. I had to use void* because I do not know which of these structs is going to be referenced.

struct Holder{void* any_struct;};

I need the values inside these structs, but all I have is a void pointer. Could I declare a base struct whose first member is an int, cast the pointer to it, and use that to read the num member from any of these structs? For example:

struct Base{int num;};
((struct Base*) Holder->any_struct)->num
// Gives 1, 2 or 3

The short answer is, "Yes". This is sometimes known as "Poor man's inheritance". You're simulating the base class / derived class relationship that's a formal part of OOP languages like C++ — Steve Summit, Oct 31 '19 at 10:55
Also, you could make your intentions clearer by using `struct Base *` all the time, instead of `void *`. — Steve Summit, Oct 31 '19 at 10:56
Seems like you really want `struct Holder{void* any_struct; enum type {ONE, TWO, THREE} t; };` — William Pursell, Oct 31 '19 at 11:00
@WilliamPursell Actually the value is not an integer, but another pointer. But I need it to have the information about how to parse that value inside the pointer. I just used the integer to simulate my problem. However, thanks for the advice. – Max Paython, Oct 31 '19 at 11:03
Take a look at BSD sockets with its `struct sockaddr`. Although it's possible and every Linux system today uses sockets, in the end I believe it's a bad design and composition+containerof is better. – KamilCuk, Oct 31 '19 at 11:05
The fact that your underlying struct contains an int is not relevant to my comment. I'm merely adding an element to struct Holder that is used as metadata to know what type `any_struct` references. — William Pursell, Oct 31 '19 at 11:06
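
The tagged-holder idea from William Pursell's comment might look roughly like the sketch below. The enum name, the HOLDER_ constants, the extra struct members, and holder_num are all hypothetical, introduced only for illustration. Because the tag records the real type, the void* is cast back to exactly the type it came from, which sidesteps the aliasing questions.

```c
/* Hypothetical stand-ins for the question's structs. */
struct one { int num; int a; };
struct two { int num; double b; };

/* Tag describing what any_struct really points at (names are assumptions). */
enum holder_type { HOLDER_ONE, HOLDER_TWO };

struct Holder {
    void *any_struct;
    enum holder_type t;
};

/* Returns num from the referenced struct, or -1 for an unknown tag. */
static int holder_num(const struct Holder *h)
{
    switch (h->t) {
    case HOLDER_ONE: return ((struct one *)h->any_struct)->num;
    case HOLDER_TWO: return ((struct two *)h->any_struct)->num;
    }
    return -1;
}
```

The cost is that whoever fills in the Holder has to keep the tag and the pointer in sync.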

EylM · Answer 1 · 2019-10-31T11:04:30.250

4

If the only thing you need is to extract num, you can use memcpy, assuming it is always an int, always the first member, and always present.

int num = 0;
memcpy(&num, Holder->any_struct, sizeof(int));
// Gives 1, 2 or 3 in num.

C99 standard section 6.7.2.1 bullet point 13:

A pointer to a structure object, suitably converted, points to its initial member. There may be unnamed padding within a structure object, but not at its beginning.

More info about the standard in this answer.
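
For completeness, here is a minimal sketch of the memcpy approach wrapped in a helper. The extra members of the structs and the name holder_num are assumptions made up for the example, and it presumes any_struct is non-null and really points at a struct whose first member is an int.

```c
#include <string.h>

/* Hypothetical stand-ins for the question's structs. */
struct one { int num; int a; };
struct two { int num; double b; };

struct Holder { void *any_struct; };

/* Copy the leading int out of whatever struct the holder points at.
   No padding may precede the first member, so its bytes sit at offset 0. */
static int holder_num(const struct Holder *h)
{
    int num = 0;
    memcpy(&num, h->any_struct, sizeof num);
    return num;
}
```

Since memcpy works on bytes, this never accesses the object through a pointer to a different struct type.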

edited Oct 31 '19 at 11:04

answered Oct 31 '19 at 10:54


I didn't want to use memcpy as a first idea, in case C gave the struct some metadata that put the first variable at a nonzero offset. But I guess this is not the case. – Max Paython Oct 31 '19 at 10:58

Darren Smith · Answer 2 · 2019-10-31T11:00:32.050

3

I think this is acceptable, and I've seen this pattern in other C projects. E.g., in libuv. They define a type uv_handle_t and refer to it as a "Base handle" ... here's info from their page (http://docs.libuv.org/en/v1.x/handle.html)

uv_handle_t is the base type for all libuv handle types.

Structures are aligned so that any libuv handle can be cast to uv_handle_t. All API functions defined here work with any handle type.

And how they implement it is a pattern you could adopt. They define a macro for the common fields:

#define UV_HANDLE_FIELDS                                                      \
  /* public */                                                                \
  void* data;                                                                 \
  /* read-only */                                                             \
  uv_loop_t* loop;                                                            \
  uv_handle_type type;                                                        \
  /* private */                                                               \
  uv_close_cb close_cb;                                                       \
  void* handle_queue[2];                                                      \
  union {                                                                     \
    int fd;                                                                   \
    void* reserved[4];                                                        \
  } u;                                                                        \
  UV_HANDLE_PRIVATE_FIELDS                                                    \

/* The abstract base class of all handles. */
struct uv_handle_s {
  UV_HANDLE_FIELDS
};

... and then they use this macro to define "derived" types:

/*
 * uv_stream_t is a subclass of uv_handle_t.
 *
 * uv_stream is an abstract class.
 *
 * uv_stream_t is the parent class of uv_tcp_t, uv_pipe_t and uv_tty_t.
 */
struct uv_stream_s {
  UV_HANDLE_FIELDS
  UV_STREAM_FIELDS
};

The advantage of this approach is that you can add fields to the "base" class by updating this macro, and then be sure that all "derived" classes get the new fields.
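
Stripped down to a toy case, the same pattern could look like the sketch below. SHAPE_FIELDS, struct shape, struct circle and circle_kind are hypothetical names invented for illustration, not libuv code.

```c
#include <stddef.h>

/* Fields shared by the "base" and every "derived" struct (hypothetical). */
#define SHAPE_FIELDS   \
    int kind;          \
    const char *name;

struct shape  { SHAPE_FIELDS };
struct circle { SHAPE_FIELDS double radius; };

/* Shared fields are reached directly on the derived type:
   c->kind rather than c->base.kind. */
static int circle_kind(const struct circle *c)
{
    return c->kind;
}
```

Adding a field to SHAPE_FIELDS adds it to both structs at once. Whether a struct circle may then be read through a struct shape * is a separate question that the macro does not settle.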

edited Oct 31 '19 at 11:00

answered Oct 31 '19 at 10:54


Sorry, but no, this is nowhere near acceptable practice and not something you should be teaching. The correct way to implement this would be with opaque types and an incomplete type struct declaration in a header. – Lundin Oct 31 '19 at 11:37
@Lundin Using the preprocessor is pretty ugly (and, Darren, I think your examples are missing some backslashes or something), but the underlying "poor man's inheritance" technique is sound, if somewhat old-school. – Steve Summit Oct 31 '19 at 11:54
@SteveSummit It's not old-school, it's old-garage. Incomplete type has been around at least since C90. There was never a reason to write macros like these. How to do it right: https://stackoverflow.com/a/13032531/584518 – Lundin Oct 31 '19 at 12:04
Steve: I copied the code directly from libuv source code, so I expect it to be all okay. @Lundin. Why do you think this approach is so unacceptable? – Darren Smith Oct 31 '19 at 12:04
@DarrenSmith Because it just adds needless complexity plus all the problems with macros. Just because you find it in some lib doesn't make it good. I posted a link to a saner alternative in my comment above. – Lundin Oct 31 '19 at 12:06
@Lundin I think we're talking about two different things. It looks like you're talking about encapsulation and data hiding. I'm talking about (and the aspect of the original question I'm responding to concerns) inheritance. The OP claims his structs `one`, `two`, and `three` all contain various additional members. – Steve Summit Oct 31 '19 at 12:08
@SteveSummit Yes, encapsulation isn't very relevant to the question. But it has to be handled when you implement inheritance. – Lundin Oct 31 '19 at 12:10
@Lundin agree with your concerns with macro, _can_ be abused and become quickly out of control. But this macro approach does have advantages over the struct approach. The code contains less types (specifically these somewhat artificial sub-struct types), and usage of the aggregates is more clear, e.g. access is written like `handle->type` instead of `handle->base.type`. – Darren Smith Oct 31 '19 at 13:05
More types = stricter compiler checks = less bugs. Though if you are concerned about writing an extra member, then you can put the `base` object in an anonymous union together with an `int num`. – Lundin Oct 31 '19 at 13:28
@DarrenSmith *I copied the code directly from libuv source code, so I expect it to be all okay* [Yeah, no](https://github.com/libuv/libuv/issues/708). The authors seem to have problems just cleanly writing 64-bit code - the comments are problematic. – Andrew Henle Oct 31 '19 at 14:16

score 2 · Answer 3 · answered Oct 31 '19 at 11:56

First of all, the various rules of type conversions between different struct types in C are complex and not something one should meddle with unless one knows the rules of what makes two structs compatible, the strict aliasing rule, alignment issues and so on.

That being said, the simplest kind of base class interface is similar to what you have:

typedef struct
{
  int num;
} base_t;

typedef struct 
{
  base_t base;
  /* struct-specific stuff here */
} one_t;

one_t one = ...;
...
base_t* ref = (base_t*)&one;
ref->num = 0; // this is well-defined

In this code, the base_t* doesn't point directly at num but at the first object in the struct which is of base_t. It is fine to de-reference it because of that.

However, your original code with the int num spread over 3 structs doesn't necessarily allow you to cast from one struct type to another, even if you only access the initial member num. There are various details regarding strict aliasing and compatible types that may cause problems.
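
Tied back to the Holder from the question, a minimal sketch could look like this. The types two_t and struct Holder as written here, the extra members, and the helper holder_num are assumptions added for illustration. It only works if every struct stored in the holder has a base_t as its first member.

```c
typedef struct { int num; } base_t;

/* Hypothetical "derived" types: base_t must come first in each. */
typedef struct { base_t base; int a; } one_t;
typedef struct { base_t base; double b; } two_t;

struct Holder { void *any_struct; };

/* Reads num through the embedded base_t object at the start of the struct. */
static int holder_num(const struct Holder *h)
{
    const base_t *ref = h->any_struct;  /* void* converts implicitly in C */
    return ref->num;
}
```

The access goes through an actual base_t object living at offset 0, which is why the dereference is fine here.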

score 0 · Answer 4 · answered Oct 31 '19 at 12:34

The construct you describe of using a pointer to a "base" structure as an alias to several "derived" structures, while often used with things like struct sockaddr, is not guaranteed to work by the C standard.

While there is some language to suggest it might be supported, particularly 6.7.2.1p15:

Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

Other parts suggest it is not, particularly 6.3.2.3 which discusses pointer conversions that are allowed:

1 A pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

2 For any qualifier q, a pointer to a non-q-qualified type may be converted to a pointer to the q-qualified version of the type; the values stored in the original and converted pointers shall compare equal.

3 An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

4 Conversion of a null pointer to another pointer type yields a null pointer of that type. Any two null pointers shall compare equal.

5 An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

6 Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

7 A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

8 A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined.

Nothing in the above states that casting from one struct to another is allowed just because the type of the first member is the same.

What is allowed however is making use of a union to do essentially the same thing. Section 6.5.2.3p6 states:

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

So what you can do is define a union that contains all the possible types as well as the base type:

union various {
    struct base { int num; } b;
    struct one { int num; int a; } s1;
    struct two { int num; double b; } s2;
    struct three { int num; char *c; } s3;
};

Then you use this union anyplace you need one of the three subtypes, and you can freely inspect the base member to determine the type. For example:

void foo(union various *u)
{
    switch (u->b.num) {
    case 1:
        printf("s1.a=%d\n", u->s1.a);
        break;
    case 2:
        printf("s2.b=%f\n", u->s2.b);
        break;
    case 3:
        printf("s3.c=%s\n", u->s3.c);
        break;
    }
}

...

union various u;
u.s1.num = 1;
u.s1.a = 4;
foo(&u);
u.s2.num = 2;
u.s2.b = 2.5;
foo(&u);
u.s3.num = 3;
u.s3.c = "hello";
foo(&u);

I purposely avoided recommending "common initial sequence" trick because it's apparently a quality of implementation feature. [See this](https://stackoverflow.com/questions/34616086/union-punning-structs-w-common-initial-sequence-why-does-c-99-but-not). Not sure of its current status, though? — Lundin, Oct 31 '19 at 13:35
@Lundin This seems to only come into play where you're passing around a pointer to one of the structs in a place where the union definition isn't visible (see 6.5.2.3p9). As long as the union is globally visible and it's used anyplace one of the structs might be used there's no issue. — dbush, Oct 31 '19 at 13:39
