There are a lot of programs which would have been considered valid under C89, prior to the publication of C99, which some people insist were never valid. C89 includes a rule that requires that an object of any type may only be accessed using a pointer of that type, a related type, or a character type. Prior to the publication of C99, this rule was generally interpreted as applying only to "named" objects (variables of static or automatic duration which are accessed directly by name), and only in situations where the object in question didn't have its address taken immediately before it was used as a different pointer type. Such interpretation was motivated by a number of factors:
One of the stated goals of the Standard was to fit with what existing compilers and programs were doing, and while it would have been rare for existing programs to access discrete named variables using pointers of different types other than in cases where the variable's address was taken immediately before such use, many other usages of pointer type punning were quite common.
The rationale for the Standard includes as its sole example a function which receives a pointer of one primitive type to write a global variable of another primitive type in such a way that a compiler would have no particular reason to expect aliasing. Being able to keep global variables in registers is clearly a useful optimization, and the stated purpose of the rule is to allow such optimizations in cases where a compiler would have no reason to expect aliasing to occur. Outlawing constructs like like (int*)&foo=23;
does nothing to aid such optimizations, since the fact that code is taking foo
's address and dereferencing it should make it abundantly clear to any compiler that isn't being deliberately obtuse that the code is going to modify foo
.
There are many kinds of code which require semantically the ability to use memory bits as various types, and nothing in the Standard indicate that the rules were intended to make programmers jump through hoops (e.g. by using memcpy) to achieve semantics that could have been easily obtained in the absence of the rules, especially considering that using memcpy would prevent the compiler from keeping global variables in registers across the pointer accesses (thus defeating the purpose for which the rules were written in the first place).
If structure types V
and W
have a common initial sequence, U
is any union type containing both, and p
is a V*
which identifies the V
within a U
, then (W*)(U*)p
may be used to access those common members, and will be equivalent to (W*)p
. Unless a compiler could show that p
couldn't possibly be a pointer to a member of some union containing W
, it would be required to allow (W*)p
to access the common members; it was more helpful to simply treat such common member access as being legitimate regardless of whether or where U
might exist than to search for excuses to deny it.
Nothing in the C89 rules makes clear how the "type" of a region of allocated storage is defined, or how storage which holds things of one type that are no longer needed might be re-purposed to hold things of another.
Keeping track of registers allocated to named variables was easier than keeping track of registers allocated to other pointer exceptions, and code which was interested in minimizing the number of loads and stores via pointers would often copy things to named variables and work on them there.
C99 added "effective type" rules which are explicitly applicable to allocated storage. Some people insist those were merely "clarifications" of rules which already existed in C89, but for the above reasons I find that viewpoint untenable. It's fashionable to claim that the only reasons compilers didn't apply aliasing rules to unnamed objects are #5 and #6, but objections #1-#4 are equally significant (and continue to apply to C99 just as much as C89). Still, since C99 added the effective type rules, many constructs which would have been treated as legitimate by most common interpretations of the C89 rules are clearly forbidden.