
The following code snippet is an example from the C11 standard §6.5.2.3:

struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 *p1, struct t2 *p2)
{
    if (p1->m < 0)
        p2->m = -p2->m;
    return p1->m;
}
int g()
{
    union {
        struct t1 s1;
        struct t2 s2;
    } u;
    /* ... */
    return f(&u.s1, &u.s2);
}

As per C11, the last line inside g() is invalid. Why so?

Jonathan Leffler
Vikas Yadav
  • What's the actual error? – MrJLP Jul 26 '17 at 23:08
  • …and, wholly tangentially, I note that this fragment defines `g` as `int g()` rather than `int g(void)`. This should lay to rest the canard that `int main()` is not approved by the standard — there is no prototype for `g()`, but the function definition is valid. – Jonathan Leffler Jul 26 '17 at 23:30
  • I read the problem as function `g` passing both members of the union to `f`. `f` treats each parameter as an independent structure, but they actually come from the same union and thus alias. – Brian Jul 26 '17 at 23:38

2 Answers


The example comes from Example 3 in §6.5.2.3 Structure and union members of ISO/IEC 9899:2011. One of the prior paragraphs is (emphasis added):

¶6 One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them **anywhere that a declaration of the completed type of the union is visible**. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

The code quoted in the question is preceded by the comment:

The following is not a valid fragment (because the union type is not visible within function f).

This now makes sense in light of the highlighted statement. The code in g() relies on the common initial sequence guarantee, but that guarantee only applies where a declaration of the completed union type is visible, and it isn't visible in f().
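For contrast, here is a minimal sketch of a variant in which the completed union type is visible at the point of access. The names `union u12`, `f_visible` and `g_visible` are hypothetical and do not appear in the standard's example; the idea is only that every access goes through the union object, so the compiler can see that the two members may overlap.

```c
struct t1 { int m; };
struct t2 { int m; };

/* Hypothetical named union type, declared at file scope so that its
   completed type is visible inside f_visible(). */
union u12 {
    struct t1 s1;
    struct t2 s2;
};

/* Every access is made through the union object itself, so the
   compiler can see that s1.m and s2.m may occupy the same storage. */
int f_visible(union u12 *pu)
{
    if (pu->s1.m < 0)
        pu->s2.m = -pu->s2.m;
    return pu->s1.m;
}

/* Stores a starting value in the union and lets f_visible() work on it. */
int g_visible(int start)
{
    union u12 u;
    u.s1.m = start;
    return f_visible(&u);
}
```

Under these assumptions, `g_visible(-5)` should yield 5 and `g_visible(7)` should yield 7, because the store through `s2` and the later read through `s1` are both made via the visible union.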

The issue is also one of strict aliasing. That's a complex topic. See What is the strict aliasing rule? for the details.

For whatever it is worth, GCC 7.1.0 doesn't report the problem even under stringent warning options. Neither does Clang, even with the -Weverything option:

clang -O3 -g -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes \
    -Wstrict-prototypes -Weverything -pedantic …
Jonathan Leffler
  • I am still trying to understand it: `&u.s1` and `&u.s2` get evaluated inside g (and g has visibility of union u) before calling f, and then they get passed as a pair of input params (a pointer to struct t1 and a pointer to a struct t2) to f on the stack. Why does f need to know the details of union u? – Vikas Yadav Jul 27 '17 at 20:43
  • Good question. I don't have a definitive answer — but it's roughly "because the standard says so". Two important C compilers don't think there's a problem, even when their arms are twisted hard to complain about anything and everything. I rechecked with `-fstrict-aliasing` specified too; no complaints. In my initial comment, I noted that examples are not normative — they can be removed without changing the meaning of the standard, and errors in examples are undesirable (and rare) but do not affect the validity of the standard (see [SO 21364398](https://stackoverflow.com/questions/21364398/)). – Jonathan Leffler Jul 27 '17 at 22:19
  • The sub-text to my comment is that there's a possibility that this example is erroneous, though it is pretty unlikely. You can find rules about aliasing in the C11 standard §6.5 **Expressions**, ¶7: _An object shall have its stored value accessed only by an lvalue expression that has one of the following types:_ (followed by a bullet list) with footnote 88 which says _The intent of this list is to specify those circumstances in which an object may or may not be aliased._ The list uses the term _effective type_ specified in ¶6; see also §6.2.7 **Compatible type and composite type**. – Jonathan Leffler Jul 27 '17 at 22:23
  • This sort of dizzying spinning through different sections of the standard makes it hard to be sure what's going on at times. – Jonathan Leffler Jul 27 '17 at 22:27
  • Thanks @Jonathan-leffler for your guidance! – Vikas Yadav Jul 28 '17 at 08:15
  • Since the union is not visible to `f`, the compiler may assume when compiling `f` that `*p1` and `*p2` do not alias each other. – R.. GitHub STOP HELPING ICE Aug 14 '17 at 10:59
  • @R.. thanks for the input. So this means the compiler might miss a chance to optimize, but does that make the fragment invalid as per the note in Example 3 in §6.5.2.3 Structure and union members of ISO/IEC 9899:2011? I resolved it based on Jonathan Leffler's comment that it might be erroneous. – Vikas Yadav Aug 14 '17 at 22:42
  • @VikasYadav: You interpreted my comment backwards. I said the compiler *may assume*, (rather than *may not assume*) that they don't alias each other. Jonathan is wrong in thinking the example is erroneous. The compiler can clearly reorder (or otherwise optimize) the accesses to `*p1` and `*p2`, assuming they don't refer to the same object, thereby yielding a different result from what would necessarily be obtained if the union were visible to `f`. – R.. GitHub STOP HELPING ICE Aug 15 '17 at 00:03
  • Although I mention that examples in the standard are not normative, they are supposed to be illustrative and do not contain errors in the ordinary course of events. Despite any comments made previously (by me), I don't think there's likely to be an error in the standard (nor in this example in the standard). – Jonathan Leffler Jan 07 '18 at 16:45
  • @JonathanLeffler: The fact that some compilers "interpret" the Standard in a way which is incompatible with a large amount of code that would rely upon Common Initial Sequence guarantees in ways that were defined in C89 would seem prima facie evidence that the Standard is defective. The real problem, though, is that the Standard fails to attach any significance to the act of deriving a pointer of one type from a pointer or lvalue of another. If a function receives a `struct t1*` and a `struct t2*` and there is no evidence within the function that the types might alias, it might be... – supercat Jan 25 '18 at 21:01
  • ...reasonable for a compiler to assume they won't even if they share a CIS in a visible declared union type. If, however, one function receives two pointers of type `struct t1*`, does some stuff with the first, casts the second one of them to `struct t2*` and passes it to another function, and then does some more stuff with the first, the fact that the cast and use of the second pointer appeared between the two groups of actions with s1 should be recognized as giving adequate notice that the second function might affect something of type s1. – supercat Jan 25 '18 at 21:05
  • BTW, unless things have changed very recently, gcc doesn't care about whether the complete union type is visible but ignores the CIS guarantee even when a complete union type is visible. – supercat Jan 29 '18 at 22:30

This is because of the "effective type" rule. If you look at f in isolation, the two arguments have different types, and the compiler is allowed to perform certain optimizations.

Here, p1->m is read twice. If p1 and p2 are assumed to point to different objects, the compiler need not reload p1->m for the return, since the store through p2 cannot have changed it.

f is valid code, and the optimization is valid.

Calling it with the same object, as g does, is not valid: without seeing that both pointers may come from the same union, the compiler has no reason to refrain from that optimization.

This is one of the cases where the whole burden of proving that a call is valid lies with the caller of the function. In general, no compiler can warn you about this if f and g happen to be in different translation units.
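The optimization described above can be illustrated with a hand-written sketch. `f_cached` is a hypothetical name, and the function only approximates what an optimizer might emit; no particular compiler is claimed to generate exactly this code.

```c
struct t1 { int m; };
struct t2 { int m; };

/* The function from the question, unchanged. */
int f(struct t1 *p1, struct t2 *p2)
{
    if (p1->m < 0)
        p2->m = -p2->m;
    return p1->m;
}

/* Hypothetical hand-written equivalent of an optimized f():
   p1->m is loaded once and reused, on the assumption that a store
   through a struct t2 * cannot modify a struct t1 object. */
int f_cached(struct t1 *p1, struct t2 *p2)
{
    int m = p1->m;      /* single load of p1->m */
    if (m < 0)
        p2->m = -p2->m; /* assumed not to touch *p1 */
    return m;           /* no reload after the store */
}
```

When the two pointers refer to distinct objects, both versions behave identically. If they referred to the same union storage holding -5, a literal reading of `f` would return 5, whereas `f_cached` would return -5; that divergence is the reason the call in g cannot be relied upon.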

Jens Gustedt
  • The Effective Type rule offers heap-duration objects an example of objects which have no declared type, and does not so far as I can tell contain any rule for treating objects of static or automatic duration (which are required to have declared types) as though they have a declared type in some contexts but not others. Requiring that compilers treat such objects as though they might have a declared type the compiler can't see would in many cases impair optimization, but I know of nothing in the Standard that says compilers only have to follow the rules... – supercat Jan 29 '18 at 22:28
  • ...when convenient to do so. What am I missing? – supercat Jan 29 '18 at 22:29