
I'm designing an application and came across an implementation issue. I have the following struct definition:

app.h:

struct application_t{
    void (*run_application)(struct application_t*);
    void (*stop_application)(struct application_t*);
};

struct application_t* create();

The problem came when I tried to "implement" this application_t. I tend to define another struct:

app.c:

struct tcp_application_impl_t{
    void (*run_application)(struct application_t*);
    void (*stop_application)(struct application_t*);
    int client_fd;
    int socket_fd;
};

struct application_t* create(){
     struct tcp_application_impl_t * app_ptr = malloc(sizeof(struct tcp_application_impl_t));
     //do init
     return (struct application_t*) app_ptr;
}

So if I use this as follows:

#include "app.h"

int main(){
    struct application_t *app_ptr = create();
    (app_ptr -> run_application)(app_ptr);    //Is this behavior well-defined?
    (app_ptr -> stop_application)(app_ptr);   //Is this behavior well-defined?
}

What confuses me is whether the call (app_ptr -> run_application)(app_ptr); yields UB.

The "static type" of app_ptr is struct application_t*, but the "dynamic type" is struct tcp_application_impl_t*. The types struct application_t and struct tcp_application_impl_t are not compatible by N1570 6.2.7(p1):

there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types

which obviously is not true in this case.

Can you please provide a reference to the Standard explaining the behavior?

Some Name
  • looks like an "inheritance in C" hack... maybe defining the first field of your struct as an `application_t` type would be better – Jean-François Fabre Feb 07 '19 at 09:36
  • you must use `struct tcp_application_impl_t { application_t base; int client_fd, socket_fd; }`; then, given `tcp_application_impl_t *p`, `&p->base == p` at the binary level – RbMm Feb 07 '19 at 09:36
  • @Jean-FrançoisFabre Sort of this, yes. I'm actually not sure if it is a "common C way" to do things like this... – Some Name Feb 07 '19 at 09:37
  • this can help: https://stackoverflow.com/a/1237302/6451573 – Jean-François Fabre Feb 07 '19 at 09:41
  • @RbMm So I guess in the example I provided the behavior is undefined. – Some Name Feb 07 '19 at 09:43
  • i think you need `&app_ptr->base;` in `create()` and use `void run_application(application_t* p) { tcp_application_impl_t*q = CONTAINING_RECORD(p, tcp_application_impl_t, base); ...}` where `#define CONTAINING_RECORD(address, type, field) ((type *)( (PCHAR)(address) - (ULONG_PTR)(&((type *)0)->field)))` – RbMm Feb 07 '19 at 09:48
  • The behaviour, as you have correctly noted, is undefined. The standard doesn't explain undefined behaviour. – n. m. could be an AI Feb 07 '19 at 10:21
  • @n.m. Yes, the Standard explicitly states that if some behavior is not explicitly specified in the Standard, it is undefined. Essentially I was trying to find possible ways to access struct members. – Some Name Feb 07 '19 at 15:47

2 Answers

Your two structs aren't compatible since they are different types. You have already found the chapter "compatible types" that defines what makes two structs compatible. The UB comes later, when you access these structs through a pointer to the wrong type: that is a strict aliasing violation as per 6.5/7.

The obvious way to solve this would have been this:

struct tcp_application_impl_t{
    struct application_t app;
    int client_fd;
    int socket_fd;
};

Now the types may alias, since tcp_application_impl_t is an aggregate containing an application_t among its members.
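To make the embedded-base approach concrete, here is a minimal sketch of how app.c could look with it. The callbacks tcp_run and tcp_stop and the helper tcp_client_fd are hypothetical names invented for illustration; they only toggle client_fd so the effect can be observed. The casts back to the implementation type rely on 6.7.2.1p15 (a pointer to a struct, suitably converted, points to its first member and vice versa).

```c
#include <assert.h>
#include <stdlib.h>

struct application_t {
    void (*run_application)(struct application_t *);
    void (*stop_application)(struct application_t *);
};

struct tcp_application_impl_t {
    struct application_t app;   /* must stay the first member */
    int client_fd;
    int socket_fd;
};

/* Hypothetical callback: only records state so the call is observable. */
static void tcp_run(struct application_t *base)
{
    /* base really points at the first member of a tcp_application_impl_t,
       so converting it back to the enclosing struct is allowed. */
    struct tcp_application_impl_t *impl = (struct tcp_application_impl_t *)base;
    impl->client_fd = 1;
}

/* Hypothetical callback: resets the recorded state. */
static void tcp_stop(struct application_t *base)
{
    struct tcp_application_impl_t *impl = (struct tcp_application_impl_t *)base;
    impl->client_fd = -1;
}

struct application_t *create(void)
{
    struct tcp_application_impl_t *impl = malloc(sizeof *impl);
    if (impl == NULL)
        return NULL;
    impl->app.run_application = tcp_run;
    impl->app.stop_application = tcp_stop;
    impl->client_fd = -1;
    impl->socket_fd = -1;
    /* The first member lives at the same address as the whole struct. */
    assert((void *)&impl->app == (void *)impl);
    return &impl->app;          /* no cast needed on the way out */
}

/* Hypothetical helper for inspecting the implementation state. */
int tcp_client_fd(const struct application_t *base)
{
    return ((const struct tcp_application_impl_t *)base)->client_fd;
}
```

The caller in main can stay as in the question: app_ptr->run_application(app_ptr) now accesses a genuine struct application_t object, and the pointer returned by create() can later be passed to free(), since it has the same address as the allocated block.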

An alternative way to make this well-defined is to use a sneaky special rule, the "union common initial sequence", found hidden in C17 6.5.2.3/6:

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

This would allow you to use your original types as you declared them. But somewhere in the same translation unit, you will have to add a dummy union typedef to utilize the above rule:

typedef union
{
  struct application_t app;
  struct tcp_application_impl_t impl;
} initial_sequence_t;

You don't need to actually use any instance of this union, it just needs to sit there visible. This tells the compiler that these two types are allowed to alias, as far as their common initial sequence goes. In your case, it means the function pointers but not the trailing variables in tcp_application_impl_t.
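For comparison, the least disputed use of the same rule is to perform the access through an actual union object. The sketch below assumes the two struct definitions from the question; the union name any_application_t, the callbacks, and the counter are hypothetical and exist only to make the example observable.

```c
#include <assert.h>

struct application_t {
    void (*run_application)(struct application_t *);
    void (*stop_application)(struct application_t *);
};

struct tcp_application_impl_t {
    void (*run_application)(struct application_t *);
    void (*stop_application)(struct application_t *);
    int client_fd;
    int socket_fd;
};

/* Hypothetical union type; both members share the two function pointers
   as their common initial sequence. */
union any_application_t {
    struct application_t app;
    struct tcp_application_impl_t impl;
};

static int run_count;           /* hypothetical counter for illustration */

static void count_run(struct application_t *unused) { (void)unused; ++run_count; }
static void no_op(struct application_t *unused) { (void)unused; }

int run_through_union(void)
{
    union any_application_t u;

    /* The union object currently contains the impl member. */
    u.impl.run_application = count_run;
    u.impl.stop_application = no_op;
    u.impl.client_fd = -1;
    u.impl.socket_fd = -1;

    /* Inspecting the common initial part through the other member,
       via the union object itself, is what 6.5.2.3/6 permits. */
    assert(u.app.run_application == u.impl.run_application);
    u.app.run_application(&u.app);
    u.app.stop_application(&u.app);
    return run_count;
}
```

Here the member access expressions name the union directly, so the compiler can see the relationship between the two struct types at the point of use. Whether the mere visibility of the union declaration is enough for accesses through plain struct pointers is the part that remains disputed.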

Edit:

Disclaimer: The common initial sequence trick is apparently a bit controversial, with compilers doing other things with it than the committee intended. It possibly also works differently in C and C++. See union 'punning' structs w/ "common initial sequence": Why does C (99+), but not C++, stipulate a 'visible declaration of the union type'?

Lundin
  • Cool! Are you sure that the "inspection" the spec talks about doesn't have to go through an instance of the `union` type? – unwind Feb 07 '19 at 10:21
  • @unwind Fairly sure since this trick is used in production code here and there. I prefer the former version myself, where an inherited struct contains an instance of its base class. – Lundin Feb 07 '19 at 10:23
  • @unwind Actually it seems like a controversial "hot potato" yielding various compiler defect reports. See https://stackoverflow.com/questions/34616086/union-punning-structs-w-common-initial-sequence-why-does-c-99-but-not. Hmm. It was never changed in C17 even though there are apparently some DR. – Lundin Feb 07 '19 at 10:30
  • @Lundin Applying that `6.2.5(p2)`: `All pointers to structure types shall have the same representation and alignment requirements as each other` implies that all pointer types to all structures are compatible with each other. Applying `6.7.2.1(p15)`: `A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa.` in case the `tcp_application_impl_t` contains `application_t` as its first member we can simply cast `tcp_application_impl_t` to `application_t` and back. Is it correct? – Some Name Feb 08 '19 at 09:17
  • @SomeName You can go from `tcp_application_impl_t` to `application_t` safely. It is supported by 6.7.2.1 and 6.5/7 (strict aliasing) both. What you can't do, and this probably goes without saying, is to allocate an `application_t`, cast a pointer to `tcp_application_impl_t`, then access the additional members that aren't actually there. – Lundin Feb 08 '19 at 10:19
  • @Lundin: For whatever reason, nobody involved with the gcc, clang, or the Standard seems able to recognize that the type-aliasing rules are only meant to apply in cases that involve aliasing (*overlapped* creation and use of references to objects), despite the fact that Footnote 88 of the C11 draft says *exactly that*. Given `struct foo *p; struct bar *q;` it would be reasonable to assume that a write to `p->x` won't interfere with an access to `q->x` *if, in the context where the accesses occur*, there's no visible means by which `p` could have been formed from a `struct bar*`, nor... – supercat Feb 11 '19 at 15:24
  • ...`q` from a `struct foo*`, and `p` and `q` are not used at times when they are freshly derived from some common parent object (deriving `p` would cause `q` to become stale, and vice versa). Most code that requires `-fno-strict-aliasing` does not, in fact, actually involve aliasing of objects of different types, but merely the formation of localized temporary relationships that would be visible to any compiler that made any effort to look for them rather than using the rules as an excuse to ignore them. – supercat Feb 11 '19 at 15:30
  • @supercat The common initial sequence trick originates from a need to document to the compiler that two objects may alias. But apparently various compilers treat this in unexpected ways. I have never used it myself, but I've seen others using it in production code, as a "standardized -fno-strict-aliasing". – Lundin Feb 11 '19 at 15:54
  • @Lundin: Neither gcc nor clang will reliably process `member1Type *p1 = &unionArray[i].member1; p1->x = 0; member2Type *p2 = &unionArray[j].member2; p2->x = 1;member1Type *p3 = &unionArray[i].member2; return p3->x;` even though the compiler can see the use of the union object to form the pointers to the member types. If the Standard is going to accept the clang/gcc's interpretation as legitimate, it should make support for `&someUnion.member` optional, and at minimum recommend that implementations that can't handle the semantics reject the syntax. – supercat Feb 11 '19 at 15:59
  • Will disabling strict aliasing in the compiler make the (original) code work? – Poscat Nov 29 '22 at 16:50

If the "strict aliasing rule" (N1570 6.5p7) is interpreted merely as specifying the circumstances under which things may alias (which would seem to be what the authors intended, given Footnote 88, which says "The intent of this list is to specify those circumstances in which an object may or may not be aliased") code like yours should pose no problem provided that in all contexts where an object is accessed using lvalues of two different types, one of the involved lvalues is visibly freshly derived from the other.

The only way 6.5p7 can make any sense is if operations involving objects that are freshly visibly derived from other objects are recognized as operations on the originals. The authors of the Standard, however, left the question of when to recognize such derivation as a quality-of-implementation issue, and thought the marketplace would be better able to judge than the Committee what was necessary for something to be a "quality" implementation suitable for some particular purpose.

If the goal is to write code that will work on implementations which are configured to honor the clear intention of footnote 88, one should be safe provided that objects don't alias. Upholding this requirement may require that one ensure that the compiler can see either that pointers are related to each other, or that they are each freshly derived from a common object at point of use. Given, e.g.

thing1 *p1 = &unionArray[i].member1;
int v1 = p1->x;
thing2 *p2 = &unionArray[j].member2;
p2->x = 31;
thing1 *p3 = &unionArray[i].member1;
int v2 = p3->x;

each pointer would be used in a context where it was freshly derived from unionArray, and thus there would be no aliasing even if i==j. A compiler like "icc" will have no problems with such code, even with -fstrict-aliasing enabled, but because both gcc and clang impose the requirements of 6.5p7 upon programmers even in cases not involving aliasing, they will not process it correctly.

Note that if the code had been:

thing1 *p1 = &unionArray[i].member1;
int v1 = p1->x;
thing2 *p2 = &unionArray[j].member2;
p2->x = 31;
int v2 = p1->x;

then the second use of p1 would alias p2 in cases where i==j because p2 would access the storage associated with p1, via means not involving p1, between the time p1 is formed and the last time it is used (thus aliasing p1).
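For anyone who wants to try the first fragment on their own compiler, here is a compilable sketch of it. The definitions of thing1, thing2, the union type, the array size, and the function name fresh_derivation are assumptions made for illustration, since the fragments above do not show them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical definitions: two distinct struct types with the same layout. */
typedef struct { int x; } thing1;
typedef struct { int x; } thing2;
typedef union { thing1 member1; thing2 member2; } thing_union;

static thing_union unionArray[4];

/* Each pointer is formed from unionArray immediately before it is used. */
int fresh_derivation(size_t i, size_t j)
{
    assert(i < 4 && j < 4);

    thing1 *p1 = &unionArray[i].member1;
    int v1 = p1->x;
    thing2 *p2 = &unionArray[j].member2;
    p2->x = 31;
    thing1 *p3 = &unionArray[i].member1;
    int v2 = p3->x;
    return v1 + v2;
}
```

When i and j differ, the two elements are distinct objects and the result does not depend on how a compiler treats aliasing. The interesting case is i == j, where comparing the output at different optimization levels and with and without -fno-strict-aliasing shows how a given compiler handles the construct.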

According to the authors of the Standard, the Spirit of C includes the principles "Trust the programmer" and "Don't prevent the programmer from doing what needs to be done". Unless there is a particular need to cope with the limitations of an implementation that is not particularly well suited to what one is doing, one should target implementations that uphold the Spirit of C in a fashion appropriate to one's purposes. The -fstrict-aliasing dialect processed by icc, or the -fno-strict-aliasing dialects processed by icc, gcc, and clang, should be suitable for your purposes. The -fstrict-aliasing dialects of gcc and clang should be recognized as simply unsuitable for your purposes, and not worth targeting.

supercat
  • If I understand you correctly `-fstrict-aliasing` should be unsuitable. But in case of declaring the `struct application_t` to be the first member we can cast the two pointers to each other safely since pointer to a struct can be converted to the pointer of its first element if aligned correctly (struct pointers have the same alignment requirement so they are compatible). – Some Name Feb 10 '19 at 21:13
  • @SomeName: The cast may be safe, but gcc and clang interpret the Strict Aliasing rule as forbidding almost any construct which would require taking the address of a union object, or using pointer casts to achieve similar semantics. – supercat Feb 10 '19 at 21:43