1

Given the code:

#include <stdlib.h>
#include <stdint.h>

typedef struct { int32_t x, y; } INTPAIR;
typedef struct { int32_t w; INTPAIR xy; } INTANDPAIR;

void foo(INTPAIR * s1, INTPAIR * s2)
{
  s2->y++;
  s1->x^=1;
  s2->y--;
  s1->x^=1;
}

int hey(int x)
{
  static INTPAIR dummy;
  void *p = calloc(sizeof (INTANDPAIR),1);
  INTANDPAIR *p1 = p;
  INTPAIR *p2a = p;
  INTPAIR *p2b = &p1->xy;
  p2b->x = x;
  foo(p2b,p2a);
  int result= p2b->x;
  free(p);
  return result;
}

#include <stdio.h>

int main(void)
{
    for (int i=0; i<10; i++)
      printf("%d.",hey(i));
}

Behavior depends upon gcc optimization level, which implies that gcc thinks this code invokes Undefined Behavior (the definition of "foo" collapses to nothing, but interestingly the definition of "hey" increments the value passed in). I'm not quite sure what if anything it does that runs afoul of the Standard's rules, though.

The code very deliberately and evilly constructs two pointers such that s2a->y and s2b->x will alias, but the pointers are deliberately constructed in such a way that both identify legitimate potential objects of type INTPAIR. Because code used calloc to get the memory, all field members have legitimate initial defined values of zero. All accesses to the allocated memory are done via an int32_t member of an INTPAIR*.

I can understand why it would make sense for the Standard to forbid aliasing structure fields in this fashion, but I couldn't find anything in the Standard which actually does so. Is gcc operating in Standard-compliant fashion here, or is it violating some clause in the Standard which isn't referenced by Annex J.2 and doesn't use any of the terms I searched for?

supercat
  • 77,689
  • 9
  • 166
  • 211
  • Probably a duplicate of http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule. Strict Aliasing means the undefined behavior is triggered anytime you access the same underlying memory via two different types - INTPAIR and INTANDPAIR. The C99 standard language that says it is in the linked answers. GCC decided to let its optimizer exploit this rule to determine whether the two variables "aliased" or not. Previously, while "always" UB, the rule was widely violated and much code depended it working in spite of this. So GCC is right, but a somewhat Pyrrhic victory. – Anders Nov 27 '15 at 08:14
  • @Anders: This question is much more specific than the linked one. Does the statement `INTPAIR *p2b = &p1->xy;` have any authorization to read, modify, otherwise "access" the allocated storage, or is there anything in the Standard that would allow the type of lvalue `*p2b` to be regarded as anything other than INTPAIR, regardless of where that pointer came from? – supercat Nov 27 '15 at 14:57
  • Actually it isn't more really more specific, it is the exact same situation. But I will try to answer why - as it happens I am just going over the strict aliasing stuff myself. I was quite surprised to wake up one day and find out that something that was just the way things were done for decades was both undefined behavior AND worse it mattered and broke things... – Anders Nov 27 '15 at 17:38
  • @Anders: I didn't notice any of the linked answers talking about a situation where memory was never actually *accessed* using disparate types. Given a pointer to a potential object of structure type, taking the address of a member neither reads nor modifies anything having to do with that potential object; I thus cannot see how it could qualify as an "access". Further, if code constructed a variable of type `INTANDPAIR`, computed the offset of the `INTPAIR` within it, and added that to the address returned by "malloc", and cast to `INTPAIR*`, that would yield a legitimate pointer to a... – supercat Nov 27 '15 at 18:23
  • ...potential object of type `INTPAIR`, which had no affiliation whatsoever with type `INTANDPAIR`. If the code that did that were in a separate compilation to `foo`, but were to pass `foo` that pointer along with the base pointer from the malloc, there's no way `foo` could work correctly (since the compiler optimizes it out to nothing). – supercat Nov 27 '15 at 18:27
  • I see what you are getting at. I'm working on my attempted answer now. Short version: just about anything tricky is potentially UB in one way or another. We all know (now) that accessing things this way is going to be a problem, but why in this case? Based on a little testing, I still think that it is GCC's dependence on Type Based Alias Analysis (TBAA) that is fooling it, and that (one of) the rules being broken is 6.5. – Anders Nov 27 '15 at 18:51
  • In N1570, section 6.5 is entitled "Expressions" and spans pages 76 to 105. Do you mean some particular part of that, or section 6.5 of some other version of the Standard? – supercat Nov 27 '15 at 19:33
  • Sorry 6.5/6 and 6.5/7 mainly. See if the reading in my answer makes sense. – Anders Nov 27 '15 at 20:04
  • @supercat Does code in this question invoke the strict aliasing behavior https://stackoverflow.com/questions/34826036/dereferencing-this-pointer-gives-me-46-but-i-am-not-sure-why – Suraj Jain Dec 30 '17 at 03:26
  • @SurajJain: I don't think so. On a little-endian machine, accessing the first `char` of a longer type holding 1234 will yield the bit pattern 11010010, which when interpreted as a signed char yields -46. – supercat Dec 30 '17 at 19:13

2 Answers2

0

UPDATE: I felt this answer was OK, but not still a little imprecise, and not cut and dry as to what the UB was. After a lot of very interesting discussion and comments I have tried again with a new answer

The right part of the C99 standard is quoted in this answer. I'm copying it here for convenience. The question and several of the answers are quite thorough.

(C99; ISO/IEC 9899:1999 6.5/7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types 73) or 88):

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

73) or 88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.

What is an effective type then? (C99; ISO/IEC 9899:1999 6.5/6:

The effective type of an object for an access to its stored value is the declared type of the object, if any. 87) If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

87) Allocated objects have no declared type.

So at the line p2b->x = x the object at p+4 becomes of effective type INTPAIR. Is it aligned correctly? If it isn't then Undefined Behavior (UB). But to keep it interesting, assume it is as it must be in this case because of the layout of INTANDPAIR.

By the same analysis there are two 8 byte objects, p2a (s2) at @(p+4) and p2b @p. As your example is demonstrating the 2nd element of p2a and the first of p2b end up being aliased.

In the foo(), the object p2b @p+4 is accessed by the normal method via s1->x. But then the "stored value" of object p2b is also accessed by a side effect of modifying a different object p2a @p. Since this falls under none of the bullets of 6.5/7, it is UB. Note that 6.5/7 says only, so objects shall not be accessed in any other ways.

I think the main distinction is that the "object" in question is the whole structure p2a/s2 and p2b/s1, not the integer members. If you change the argument of the function to take the integers and alias them it works "fine" because the function can't know s1 and s2 alias. For example:

void foo2(int *s1, int *s2)
{
  (*s2)++;
  (*s1)^=1;
  (*s2)--;
  (*s1)^=1;
}
  ...
/*foo(p2b,p2a);*/
foo2((int*)p, (int*)p); /* or p+4 or whatever you want */

This more or less confirms that this is the way GCC chose to interpret things: modifying a member is modifying the whole struct object and that since side effects of modifying one object are not on the listed legal ways to indirectly modify a different object, whee! we can do whatever silly thing we feel like doing.

So whether GCC interprets the ambiguities in standard to decide that by deriving s1 and s2 pointers through different typed pointers and then accessing them constitutes indirectly accessing the memory via different original types via p1 and p or whether it interprets the standard in the way I'm suggesting that "object" s2->y modifies is not just the integer but the s2 object, it is UB either way. Or is GCC just being especially snarky and pointing out that if the standard doesn't very clearly specify the semantics of dynamically allocated yet overlapping objects, it is free to do whatever it wants because by definition it is "undefined".

I don't think at this microscopic level anyone other than the standards body can definitively answer whether this should be UB or not because at this level it requires some "interpretation". The GCC's implementers opinion's seem to favor very aggressive interpretations.

I like Linus's reaction to this whole thing. And it is true, why not just be conservative and let the programmer tell the compiler when it is safe? Very Excellent Linus Rant

Community
  • 1
  • 1
Anders
  • 2,270
  • 11
  • 19
  • And by the way, the fact the we have to drag out and dissect a 600 page standard to figure out whether the reason a 20 line program doesn't work is the compiler's fault or the code's should tell us that a) the language and standard is starting to get too fancy for its own good and b) whether or not it is "legal", relying on anything even slightly tricky or requiring interpretation is going to end badly. – Anders Nov 27 '15 at 20:07
  • Is there any way a compiler could fail to align p2b legitimately? The offset of INTANDPAIR.xy is not specified by the Standard, but it must satisfy the alignment of INTPAIR, must it not? Otherwise, while I agree with your analysis of what's aliasing (it was very deliberate, of course) I don't see where 6.5.7 says anything about what *objects* may alias. All I see it talking about is what *types* may be used for an access, and any type which would be legal to access one object would be legal for the other. – supercat Nov 27 '15 at 20:07
  • Personally, I think the Standard could be much more useful for programmers and compiler writers alike if it described the cases in which writes to objects using various lvalues were and were not sequenced relative to each other. Doing that would make it possible to assign clear meanings to directives that would enforce ordering beyond those implied by the type rules, and would allow the special rules for "character types" to be deprecated (it's very common for code to access a bunch of characters within a loop, and require that the first access in the loop be sequenced after the last... – supercat Nov 27 '15 at 20:09
  • ...access of any type preceding the loop, and require that the last character access within the loop be sequenced before the first access of any type following the loop, but not care about sequencing within the loop. Unfortunately, the rules of C don't let code *say* that, which severely limits a compiler's ability to optimize code in places which are more likely than most to be performance hot spots. In any case, I'm not clear what you're seeing in 6.5.7 which would disallow code from changing multiple objects if rules about object types are complied with. – supercat Nov 27 '15 at 20:13
  • @Anders: IMO, the reason why the 20 line program shouldn't work is obvious, due to playing fast and loose with types; dragging out the 600 page standard is only needed to confirm there isn't some technicality that implies it really should work after all. –  Nov 27 '15 at 20:13
  • @supercat Well, there could be pad between w and xy. Not so much alignment as layout. And then the aliasing failure wouldn't work right. Otherwise yes it should align p2b correctly - I was generalizing. 6.5/7 specifies the "only" ways objects can be accessed - if it isn't on the list then it isn't OK. This is worth clarifying. – Anders Nov 27 '15 at 20:15
  • @Hurkyl Agreed - for this 20 line program. In general, it is more difficult to see. Some of these mistakes are insidious and it should be very clear what is allowed and what is not. And if we say what is allowed is simply that you cannot make any assumptions about the correlation between real memory layout and higher level concepts in C, then why not just skip it and use java or something. The direction of the language worries me... – Anders Nov 27 '15 at 20:19
  • @Hurkyl: The reason the program "doesn't work" is that it was deliberately designed to be a short example that invokes a particular compiler behavior. Most real-world situations where such behavioral issues would crop up would be more complicated and might make it harder to identify the point, if any, where the code first invokes Undefined Behavior. The rule that seems to be applied is that lvalue access must not only be done through exact types, but also through a consistent sequence of type members. That would actually be a very sensible rule, but what in the Standard *says* that? – supercat Nov 27 '15 at 20:22
  • @Anders: Do you like the idea of basing rules upon sequencing relations? Multiple unsequenced accesses to an object invoke UB unless none of them are writes; in an expression like "i=i+1;", the data dependency between the read and write would create a sequencing relation. The "effective type" rules assume a fictitious execution model where memory holds the effective types of objects, but it wouldn't make any sense for compilers to use anything like such a model when executing code. Since what compilers really need is the ability to reorder reads and writes... – supercat Nov 27 '15 at 20:30
  • ...I think rules targeted toward that need could then be described in terms which fit well with what compilers were actually doing. It's too bad that the language regards array-type parameters and pointer-type parameters as synonymous, since it would make sense to say that given e.g. `void x(int a[], int b[])`,operations on `a[1]` and `b[1]` would be sequenced, but operations on `a[1]` and `b[2]` would not, but given `void x(int *a, int *b)`, all operations involving `a[anything]` and `b[anything]` would be fully sequenced. – supercat Nov 27 '15 at 20:34
  • @supercat I think I like it better. It seems to be a closer fit to the way the processors actually see things; just aliasing at the compiler level is only one half of the problem. But at the end of the day, if the compiler wants to optimize something it should be incumbent on the compiler to prove that it is OK (don't alias) and if it can't to be conservative. So in addition I'd favor rules that limit what assumptions the compiler makes, not the user. – Anders Nov 27 '15 at 20:46
  • @supercat - I'm adding one more good reference to the standard to address your question (I hope) about overlapping different objects more satisfactorily. See 6.5.16.1/3. – Anders Nov 27 '15 at 20:47
  • @Anders: Linus' rant is even better when read in context, though I think access-method-based aliasing would be reasonable if there existed "escape hatches" other than the archaic "character type" rules which may have been helpful for porting some old code, but are far more burdensome for compilers and programmers alike than aliasing barriers would be. – supercat Nov 27 '15 at 20:47
  • @Anders: I don't think 6.5.16.1/3 is applicable here; the only situation I can see where it would apply to p2a and p2b would be if code attempted to do `*p2a=*p2b`; or something similar, where overlapping objects were read and written *in the same operation*. Here, the operations on p2b->x and p2a->y are separated by sequence points. – supercat Nov 27 '15 at 20:50
  • @Anders: This is getting a bit long. Care to chat? As for being "conservative", I would suggest that a good for the language would be to define a directive that would allow a compiler to make very broad assumptions about code, but also define alias-barrier directives to block optimizations in those narrow cases where they would cause problems. Code which doesn't define its requirements could be deprecated, but remain usable using rules set via the command line or other such means. In most situations where programmers would need to reuse memory as different types, there are clear places... – supercat Nov 27 '15 at 20:54
  • Oh, !@#* right you are. I'll try to find the right reference. I'm certain that pointer-based punning is declared UB somewhere. – Anders Nov 27 '15 at 20:55
  • ...where that would need to occur. If there were a means of notifying compilers about where those places were, then compilers could be given much broader authority to reorder everything else. – supercat Nov 27 '15 at 20:55
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/96385/discussion-between-anders-and-supercat). – Anders Nov 27 '15 at 20:59
0

My previous answer was lacking, maybe not completely wrong, but the sample program is deliberately designed to sidestep each of the more obvious explicit Undefined Behaviors (UB) dictated by the C99 standard, like 6.5/7. But with both GCC (and Clang) this example demonstrates strict aliasing failure like symptoms under optimization. They appear to be assuming s1->y and s2-x can't alias. So, is the compiler wrong? Is this a loophole in the strict aliasing legalese?

Short answer: No. I wouldn't be surprised if there was a loophole of some kind in the standard, given its complexity. But in this example, creating overlapping objects on the heap is explicitly undefined behavior, and there are several other things happening that the standard does not define.

I think the point of the example is not that it fails - it is obvious that "playing fast and loose" with pointers is a bad idea and relying on corner cases and legalese to prove the compile "wrong" is of little help if the code doesn't work. The key questions are: is GCC wrong? and what in the standard says so.

First, lets look at the obvious strict aliasing rules and how this example is trying to avoid them.

C99 6.5/7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 76)

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

This is the main strict aliasing section. It means that accessing the same memory via two different type pointers is UB. This example sidesteps it by accessing both using INTPAIR pointers in foo().

The key problem with this is that it is talking about accessing the stored value via two different effective types (e.g. pointers). It doesn't talk about accessing via two different objects.

What is being accessed? is it the integer member or the entire object s1 / s2? Is accessing s2->x via s1->y access via "a type compatible with the effective type of the object". I believe an argument can be made that a) the access as a side effect of modifying a different object does not fall under the permissible methods in 6.5/7 and that b) modifying one member of the aggregate transitively modifies the aggregate (*s1 or *s2) also.

Since this is not specified, it is UB, but it is a bit hand-wavy.

How did we get pointers to two overlapping objects? Are the pointer casts leading to them OK? Section 6.3.2.3 contains the rules for casting pointers and the example carefully does not violate any of them. In particular, because p2b is a pointer to INTANDPAIR member xy the alignment is guaranteed to be right, otherwise it would definitely run afoul of 6.3.2.3/7.

Furthermore, &p1->xy is not a problem - it can't be - it is a perfectly legitimate pointer to an INTPAIR. Simply casting pointers and/or taking addresses is safely outside the definition of "access" (3.1/1).

It is clear that the problem comes about by accessing two integer members that overlay each other as different parts of overlapping objects. Any attempt to do this via pointers of different types would clearly run afoul of 6.5/7. If accessed by the same type pointer at the same address, there would be no problem whatsoever. So the only way left that they could alias this way is that if two objects at different addresses overlapped in some fashion.

Obviously this could occur as part of a union, but that is not the case for this example. Type punning through unions may not be UB in C99, but it would be a different question whether a variant of this example could be made misbehave via unions.

The example uses dynamic allocation and casts the resultant void pointer to two different types. Going from from a pointer to an object to void * and back again is valid (6.3.2.3/1). Several other ways of obtaining pointers to objects that would overlap are explicitly UB by the pointer conversion rules of 6.3.2.3, the aliasing rules of 6.5/7, and/or the compatible type rules 6.2.7.

So what else is wrong?

6.2.4 Storage durations of objects

1 An object has a storage duration that determines its lifetime. There are three storage durations: static, automatic, and allocated. Allocated storage is described in 7.20.3

The storage for each of the objects is allocated by calloc() so the duration we want is "allocated". So we check 7.20.3: (emphasis added)

7.20.3 Memory management functions

1 The order and contiguity of storage allocated by successive calls to the calloc, malloc, and realloc functions is unspecified. The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated). The lifetime of an allocated object extends from the allocation until the deallocation. Each such allocation shall yield a pointer to an object disjoint from any other object.

...

2 The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, 25) and retains its last-stored value throughout its lifetime. 26) If an object is referred to outside of its lifetime, the behavior is undefined.

To avoid UB, the accesses to the two different objects must be to a valid object within its lifetime. You can get a single valid object (or an array) with malloc()/calloc(), but these guarantee that you will receive a pointer disjoint from all other objects. So is the object returned from calloc() p or is it p1? It can't be both.

The UB is triggered by attempting to reuse the same dynamically allocated object to hold two objects that are not disjoint. While calloc() guarantees it will return a pointer to a disjoint object, there is nothing that says it will still work if you then start using parts of the buffer for a 2nd overlapping one. In fact, it even explicitly says it is UB if you access an object outside its lifetime and there is only a single allocation ergo a single lifetime.

Also note:

4. Conformance

  1. In this International Standard, ‘‘shall’’ is to be interpreted as a requirement on an implementation or on a program; conversely, ‘‘shall not’’ is to be interpreted as a prohibition.
  2. If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words ‘‘undefined behavior’’ or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe ‘‘behavior that is undefined’’.

For this to be a compiler error it must fail on a program that only uses constructs explicitly defined. Anything else is outside the safe-harbor and is still undefined, even if it the standard doesn't explicitly state that it is Undefined Behavior.

Community
  • 1
  • 1
Anders
  • 2,270
  • 11
  • 19
  • I think the "shall" in 7.20.3(1) is imposing a requirement on the implementation, rather than on the programmer, saying that e.g. a "malloc" function that always returned the same pointer wouldn't be legitimate. 7.20.3(2) is saying that memory can't be used for everything after calling "free" on it. As for the idea that side-effects on overlapping aggregates produce UB regardless of type, is there anything in the Standard that actually *says* that? I would posit that given the above code since no primitive type gets modified in a bizarre fashion the code would have two possible clear and... – supercat Nov 28 '15 at 15:31
  • ...unambiguous meanings if translated straightforwardly based upon whether INTANDPAIR includes enough padding before xy to avoid aliasing. If it does, there is no confusion. If it doesn't, then the only question is whether any of the rules that make certain behaviors UB to avoid forcing compilers to use clear and straightforward (but sub-optimal) code apply in this case. – supercat Nov 28 '15 at 15:34
  • The point is, and I am guilty of it, is that we forget that not that the standard must state somewhere that a behavior is undefined. If it doesn't define something, it is undefined - the same kind of UB that is explicitly stated. And it does say that. – Anders Nov 28 '15 at 15:58
  • As far as 7.20, yes it is a requirement on the implementation, but it also bounds what you get from it. You get a singular object. There is no other way to get an object, and you can't dereference a pointer to not-an-object. There are a lot of places this code narrowly is defensible by the standard, but if it doesn't say where overlapping objects is allow and that these circumstances qualify, UB for that if nothing else. – Anders Nov 28 '15 at 16:01
  • I would suggest that there is an implementation-defined relationship between changes made to the allocated object using other object types and the memory that would be seen using a character pointer, and there is an implementation-defined relationship between changes made to the allocated object using a character pointer and the values seen using other character types. I would suggest that in the absence of explicit rules to the contrary, those relationships would *define* a behavior. The effective-type rules relieve a compiler from the responsibility of following an "as-if" rule... – supercat Nov 28 '15 at 18:07
  • ...with objects' representations when the requirements of the rules are violated, but unless code violates a specific rule, I'd say the rules regarding representation would imply that the behavior of the code should be determined by the offset of `INTANDPAIR.xy` unless the code violates some specific requirement in the Standard. – supercat Nov 28 '15 at 18:11
  • That would not be "explicitly" defining it though. It is also a bit hand-wavy. – Anders Nov 28 '15 at 18:39
  • It's a little hand-wavy, but given that objects have well-defined bit patterns which can be observed, and that explicitly setting the backing storage of objects to those bit-patterns is a means of setting their value, I don't consider it a particular stretch. – supercat Nov 28 '15 at 20:12