But I got the correct return value when I run this code.
A given compiler accepting the code and the compiled program exhibiting the expected observable behavior are not reliable indicators of the code being correct with respect to the language specification. Nor, therefore, are they reliable predictors of whether a different compiler will accept the code or produce a program that exhibits the expected observable behavior.
Conforming compilers emit diagnostics about certain classes of errors, but they are not required to diagnose all errors, and there are some classes of errors that could not be detected at compile time even if the compiler wanted to. Generally speaking, compilation and/or execution of incorrect code produces undefined behavior, which often does not involve any error messages, and which sometimes is even the behavior that was expected. Bottom line: that a program produces the correct or expected result does not prove that the program is correct.
As for the main question,
what makes me confused is the following statements, because the standard said it is not a valid fragment (because the union type is not visible within function f).
[...]
So my question is: Is that valid [passing] a pointer of union's member which is a structure defined in file scope to another function where union type is not visible to?
You have misunderstood the nature of the problem. It is not inherent in passing a pointer to a member of a union. Rather, it has to do with accessing more than one member of the same union object.
The general rule arises from
The value of at most one of the members can be stored in a union object at any time
(C17 6.7.2.1/16)
and the so-called strict aliasing rule, paragraph 6.5/7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
[a type compatible with the object's effective type, plus/minus qualification, or the corresponding signed/unsigned type of one of the above, or]
- an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
- a character type.
The storage for every member of a given union object overlaps the storage of all the others (which is why the union can hold only one at a time), therefore with ...
union U{
struct t1 s1;
struct t2 s2;
} u = {-1};
... &u.s1 and &u.s2 point to the same storage. With the given initialization, its effective type is struct t1, and it is the initial part of a possibly larger block of storage whose effective type is union U.
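The overlap can be checked directly. The following is a minimal sketch, assuming that struct t1 and struct t2 each consist of a single int member m (the member types are not given above), and using a hypothetical helper named members_overlap:

```c
struct t1 { int m; };   /* assumed definition: one int member */
struct t2 { int m; };   /* assumed definition: one int member */

union U {
    struct t1 s1;
    struct t2 s2;
};

/* Hypothetical helper: returns 1 if both members, and the union itself,
   start at the same address. The pointers have different types, so they
   are converted to void * before being compared. */
int members_overlap(void)
{
    union U u = { { -1 } };   /* initializes u.s1.m */

    return (void *)&u.s1 == (void *)&u.s2
        && (void *)&u    == (void *)&u.s1;
}
```

Equal addresses only show that the storage is shared. They say nothing about which lvalue types may be used to access it, which is what the strict aliasing rule governs.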
Structure types with different tags are never compatible with each other, so the strict aliasing rule would be violated by g() accessing that initial value via the pointer &u.s2, except that the specification carves out a special case:
One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible.
(C17 6.5.2.3/6)
This is precisely what the example you're looking at is about. Because struct t1 and struct t2 have a common initial sequence consisting of their respective members m, and because the union object in function g() initially does contain a value for its member s1, of type struct t1, it is permitted in g() to access u.s2.m, including indirectly via &u.s2, even though u.s2 is not the member that currently contains a value.
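A minimal sketch of the permitted access, assuming int members m and a hypothetical function name. Here the union type is declared at file scope, so its completed type is visible at the point of access:

```c
struct t1 { int m; };   /* assumed definition */
struct t2 { int m; };   /* assumed definition */

union U {
    struct t1 s1;
    struct t2 s2;
};

/* Hypothetical function: stores a value through s1, then inspects the
   common initial sequence through s2. The completed type union U is
   visible here, so 6.5.2.3/6 covers the read. */
int read_via_other_member(void)
{
    union U u = { { -1 } };   /* u.s1 is the member that holds a value */

    return u.s2.m;            /* reads the shared leading member m */
}
```

With the assumed definitions, the function returns -1, the value that was stored through s1.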
However, 6.5.2.3/6 does not apply in function f(), because type union U is not visible there. Therefore, although it's fine for f() to access p1->m, it produces UB for it to attempt to access p2->m. This is the claim you inquired about.
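One way to stay within the rule, sketched under the assumption that f() may be changed to receive a pointer to the union itself, is to route every access through the union type. The names f_via_union and g_fixed are hypothetical, and the int members are assumed:

```c
struct t1 { int m; };   /* assumed definition */
struct t2 { int m; };   /* assumed definition */

/* Declared at file scope so the completed type is visible to both functions. */
union U {
    struct t1 s1;
    struct t2 s2;
};

/* Hypothetical replacement for f(): it takes the union, not two member
   pointers, so each access is a member access on an lvalue of union type. */
int f_via_union(union U *pu)
{
    if (pu->s1.m < 0)
        pu->s2.m = -pu->s2.m;   /* store through s2 */
    return pu->s1.m;            /* read the common initial member via s1 */
}

int g_fixed(void)
{
    union U u = { { -1 } };     /* u.s1.m starts out as -1 */

    return f_via_union(&u);
}
```

With the assumed definitions, g_fixed() returns 1: the negative value stored via s1 is negated through s2 and then read back through s1.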