Type punning is considered UB because the authors of the Standard expected that quality implementations intended for various purposes would behave "in a documented manner characteristic of the environment" in cases where the Standard imposed no requirements, but where such behavior would serve the intended purposes. As such, it was more important to avoid imposing overly strong mandates on implementations than to require that they support everything programmers would need.
To adapt and slightly extend the example from the Rationale, consider the code (assume for simplicity a commonplace 32-bit implementation):
#include <stdio.h>

unsigned x;
unsigned evil(double *p)
{
    if (x) *p = 1.0;
    return x;
}
...
unsigned y;
int main(void)
{
    if (&y == &x + 1)
    {
        unsigned res;
        x=1;
        res = evil((double*)&x);
        printf("You get to find out the first word of 1.0; it's %08X.\n", res);
    }
    else
    {
        printf("You don't get to find out the first word of 1.0; too bad.\n");
    }
    return 0;
}
In the absence of the "strict aliasing rule", a compiler processing evil would have to allow for the possibility that it might be invoked as shown in main, on an implementation which happens to place two unsigned values consecutively in such a way that a double could fit in the space they occupy. The authors of the Rationale recognized that if a compiler returned the value of x that had been seen by the if statement, the result would be "incorrect" in such a scenario, but even most advocates of type punning would admit that a compiler that did so (in cases like that) would often be more useful than one that reloaded x (and thus generated less efficient code).
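To make the trade-off concrete, here is a rough sketch of what a compiler is allowed to treat evil as once it may assume that a store through a double* cannot modify an unsigned object. The name evil_as_optimized and the local variable cached are made up for illustration; real compilers do this transformation internally rather than at the source level.

```c
unsigned x;

/* Hypothetical source-level picture of evil() after optimization. */
unsigned evil_as_optimized(double *p)
{
    unsigned cached = x;   /* x is loaded exactly once */
    if (cached)
        *p = 1.0;          /* assumed not to touch x, since the types differ */
    return cached;         /* the earlier value is reused, with no reload of x */
}
```

When p points at an actual double, this behaves exactly like the original. The two versions can only differ when p has been made to point at x, which is the scenario the rule leaves undefined.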
Note that the rules as written don't describe all cases where implementations should support type punning. Given something like:
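#include <stdint.h>

union ublob {uint16_t hh[8]; uint32_t ww[4]; } u;

/* Array-subscript access directly on the union members. */
int test1(int i, int j)
{
    if (u.hh[i])
        u.ww[j] = 1;
    return u.hh[i];
}

/* Same accesses, written as pointer arithmetic plus dereference. */
int test2(int i, int j)
{
    if (*(u.hh+i))
        *(u.ww+j) = 1;
    return *(u.hh+i);
}

/* Same accesses through short-lived pointers derived from u. */
int test3(int i, int j)
{
    uint16_t temp;
    {
        uint16_t *p1 = u.hh+i;
        temp = *p1;
    }
    if (temp)
    {
        uint32_t *p2 = u.ww+j;
        *p2 = 1;
    }
    {
        uint16_t *p3 = u.hh+i;
        temp = *p3;
    }
    return temp;
}

/* Here the callee sees no evidence that the pointers come from a union. */
static int test4a(uint16_t *p1, uint32_t *p2)
{
    if (*p1)
        *p2 = 1;
    return *p1;
}
int test4(int i, int j)
{
    return test4a(u.hh+i, u.ww+j);
}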
Nothing in the Standard, as written, would imply that any of those would have defined behavior unless they all do, but the ability to have arrays within unions would be rather useless if test1 didn't have defined behavior on platforms that support the types in question. If compiler writers recognized that support for common type punning constructs was a Quality of Implementation issue, they would recognize that there would be little excuse for an implementation failing to handle the first three, since any compiler that isn't deliberately blind would see evidence that the pointers were all related to objects of the common type union ublob, without feeling obligated to handle such possibilities in test4a, where no such evidence would exist.
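As a rough check of what "handling" test1 means in practice, the sketch below stores through u.ww[0] and then looks at the two uint16_t elements it overlays. It assumes an implementation that honors punning through direct union member access (as GCC, for one, documents), and that the two halves of a uint32_t map onto hh[0] and hh[1] in some order. The helper name test1_reloads is made up for this example.

```c
#include <stdint.h>

union ublob { uint16_t hh[8]; uint32_t ww[4]; } u;

int test1(int i, int j)
{
    if (u.hh[i])
        u.ww[j] = 1;
    return u.hh[i];
}

/* Hypothetical helper: returns nonzero if test1 observed the store
   through u.ww[0] rather than returning the stale value of u.hh[0]. */
int test1_reloads(void)
{
    u.hh[0] = 5;
    u.hh[1] = 5;
    int r = test1(0, 0);
    /* ww[0] == 1 overlays hh[0] and hh[1]: one half holds 1, the other 0,
       with the order depending on byte order. */
    return r != 5 && r == u.hh[0] && u.hh[0] + u.hh[1] == 1;
}
```

A compiler that cached u.hh[i] across the store would make test1 return 5 here, so the helper would return zero. The same harness could be pointed at test4 to see whether a given compiler and optimization level extends the same treatment to test4a.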