
While reading another question about aliasing (What is the strict aliasing rule?) and its top answer, I realised I still wasn't entirely satisfied, even though I think I understood everything there.

(This question is now tagged as C and C++. If your answer refers to just one of these, please clarify which.)

So I want to understand how to do some development in this area, casting pointers in aggressive ways, but with a simple conservative rule that ensures I don't introduce UB. I have a proposal here for such a rule.

(Update: of course, we could just avoid all type punning. But that's not very educational. Unless of course, there are literally zero well-defined exceptions, beyond the union exception.)

Update 2: I understand now why the method proposed in this question is not correct. However, it is still interesting to know whether a simple, safe, alternative exists. As of now, there is at least one answer that proposes such a solution.

This is the original example:

int main()
{
   // Get a 32-bit buffer from the system
   uint32_t* buff = malloc(sizeof(Msg));

   // Alias that buffer through message
   Msg* msg = (Msg*)(buff);

   // Send a bunch of messages    
   for (int i =0; i < 10; ++i)
   {
      msg->a = i;
      msg->b = i+1;
      SendWord(buff[0] );
      SendWord(buff[1] );   
   }
}

The important line is:

Msg* msg = (Msg*)(buff);

which means there are now two pointers (of different types) pointing to the same data. My understanding is that any attempt to write through one of these will render the other pointer essentially invalid. (By 'invalid' I mean that we can ignore it safely, but that reading/writing through an invalid pointer is UB.)

Msg* msg = (Msg*)(buff);
msg->a = 5;           // writing to one of the two pointers
SendWord(buff[0] );   // renders the other, buffer, invalid

Therefore, my proposed rule is that, once you create the second pointer (i.e. create msg), you should immediately and permanently 'retire' the other pointer.

What better way to retire a pointer than to set it to NULL:

Msg* msg = (Msg*)(buff);
buff = NULL; // 'retire' buff. now just one pointer
msg->a = 5;

Now, the last line assigning to msg->a can't invalidate any other pointers because, of course, there are none.

Next, of course, we have to find a way to call SendWord(buff[1] );. This can't be done immediately because buff has been retired and is NULL. My proposal now is to cast back again.

Msg* msg = (Msg*)(buff);
buff = NULL; // 'retire' buff. now just one pointer
msg->a = 5;

buff = (uint32_t*)(msg);   // cast back again
msg = NULL;                // ... and now retire msg

SendWord(buff[1] );

In summary, every time you cast a pointer between two 'incompatible' types (I'm not sure how to define 'incompatible'?) then you should immediately 'retire' the old pointer. Set it to NULL explicitly if that helps you to enforce the rule.

Is this conservative enough?

Perhaps this is too conservative and has other problems, but I first want to know if this is conservative enough to avoid introducing UB via offending strict aliasing.

Finally, to recap, here is the original code, modified to use this rule:

int main()
{
   // Get a 32-bit buffer from the system
   uint32_t* buff = malloc(sizeof(Msg));

   // Send a bunch of messages    
   for (int i =0; i < 10; ++i)
   {  // here, buff is 'valid'

      Msg* msg = (Msg*)(buff);
      buff = NULL;
      // here, only msg is 'valid', as buff has been retired
      msg->a = i;
      msg->b = i+1;
      buff = (uint32_t*) msg;  // switch back to buff being 'valid'
      msg = NULL;              // ... by retiring msg
      SendWord(buff[0] );
      SendWord(buff[1] );
      // now, buff is valid again and we can loop around again
   }
}
Aaron McDaid
  • 4
    Rule o' thumb: Don't do type-punning. Some instances of it are well-defined but most aren't and unless there is a damn good reason, you can usually write a more beautiful solution that does not involve type punning. – fuz Jul 15 '15 at 11:47
  • @FUZxxl, in a sense I agree. I've never had to do this, and I don't need to do it now. But I'm curious. Some time in my life, I may have no option but to push the boundaries a little. If everybody says "don't do it", or "it's always UB", then I will have no option but to just code it up and say to my boss "well, it worked in my tests, so I'm going to have to run with it because I can't get any other helpful advice" :-) . – Aaron McDaid Jul 15 '15 at 12:26
  • possible duplicate of [What is the strict aliasing rule?](http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule) – this Jul 15 '15 at 12:31
  • ... unless, of course, there truly is *never* *any* defined-behaviour version of type-punning. I'm beginning to get the impression that this might be the case in C++ specifically, as it may be much stricter than C in this. – Aaron McDaid Jul 15 '15 at 12:31
  • I know some people are marking this as a dupe of another question. But I referenced exactly that question in the first few words of my question, precisely in order to highlight that I have a different, follow up, question. That other question kind of misses the point for me. – Aaron McDaid Jul 15 '15 at 12:34
  • 1
    @AaronMcDaid Well, you asked for the “simplest rule of thumb,” so I gave you the simplest rule of thumb. – fuz Jul 15 '15 at 12:40
  • Btw. if you want to make the code so it doesn't break strict aliasing, that is a different question, and the process is relatively easy. – Šimon Tóth Jul 15 '15 at 13:22
  • I would like to write a detailed analysis of your code, however it is crucial as to whether `uint32_t` is a typedef for `unsigned int` on your system. If not, your code is clearly UB. If so, it is murky (But IMHO, not UB). I suggest updating your question so that either clearly the same type is used for `Msg::a` as for `buff[0]` etc., or clearly different types are used. An answer covering both cases would be too lengthy. – M.M Jul 15 '15 at 13:39
  • 1
    Also, dual-tagging this question makes it doubly complicated because the strict aliasing rule is different in C than in C++. In fact, in C++ it is very underspecified as to what happens when aliasing in malloc'd space. – M.M Jul 15 '15 at 13:41
  • I didn't dual-tag it. I tagged it as C++ at first, but somebody else must have added C. In hindsight, I am more interested in C. – Aaron McDaid Jul 15 '15 at 21:13
  • @MattMcNabb, I copied that example directly from another SO answer, and I didn't realise the signed-ness was an issue. – Aaron McDaid Jul 15 '15 at 21:14
  • @AaronMcDaid signed-ness isn't an issue, the question is whether the two types are the same or not. The code you copied from has the same problem. Please fix. – M.M Jul 16 '15 at 00:28

4 Answers

7

C++ answer: that won't work. The C++ strict aliasing rule explicitly enumerates which types can be used to access an object. If you use a different type, you get UB, even if you've "retired" all access methods of a different type. As per C++14 (n4140) 3.10/10, the allowed types are:

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

  • the dynamic type of the object,
  • a cv-qualified version of the dynamic type of the object,
  • a type similar (as defined in 4.4) to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
  • a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
  • a char or unsigned char type.

"Similar types," as per 4.4, pertain to modifying cv-qualification of multi-level pointers.

So, if you've ever written into an area through a pointer (or other accessor) to one type, you cannot access it through a pointer to a different type (unless sanctioned by 3.10/10), even if you forget the old pointer.

If you've never written to an area through a particular type, casting pointers back and forth is not an issue.
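The last bullet in the list above (character types) is the exception most often used in practice. C has a corresponding character-type exception in its own aliasing rule, so a minimal C sketch can illustrate it. The helper name `byte_sum` is hypothetical and not part of the question; it only shows that reading the bytes of any object through `unsigned char *` is permitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums the bytes that make up any object.
   Reading an object's representation through unsigned char* is
   allowed regardless of the object's own type. */
unsigned byte_sum(const void *obj, size_t size)
{
    const unsigned char *bytes = (const unsigned char *)obj;
    unsigned sum = 0;
    for (size_t i = 0; i < size; ++i)
        sum += bytes[i];
    return sum;
}
```

The exception only goes one way: an object may be inspected as bytes, but an array of `unsigned char` may not be read as if it were some other type.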

Angew is no longer proud of SO
  • "If you use a different type, you get UB". That seems so broad as to write all typepunning in C++ as UB? Surely something is allowed? I guess I object to the word 'use' here as it is pretty vague. Does the initial cast count as "use". Or do you have to read/write through the casted pointer. – Aaron McDaid Jul 15 '15 at 12:01
  • A really dumb short example to clarify my previous comment: `int main() { long x = 5; long *xp = &x; int *ip = (int*) xp; *ip = 4; } ` Is this UB? – Aaron McDaid Jul 15 '15 at 12:03
  • The justification for the strict aliasing rule is usually given in terms of optimizations that assume that certain pointers/lvalues point to different data. By retiring one pointer, I am removing this issue and not allowing any inappropriate optimizations, because the compiler can no longer make incorrect optimizations. So, strict aliasing is still an issue even though I've worked around the commonly-given reason for the rule?! – Aaron McDaid Jul 15 '15 at 12:17
  • @AaronMcDaid If you have two pointers with incompatible types, they are effectively `restrict` pointers. So the compiler has no obligation to ever update the data seen from one pointer to the other. – Šimon Tóth Jul 15 '15 at 12:20
  • @Aaron: Union type punning is allowed. If you need to type pun, that's how to do it. – Kevin Jul 15 '15 at 12:24
  • 2
    @AaronMcDaid Yes, your `int`+`long` example *is* UB according to the C++ standard. I cannot speak for C. – Angew is no longer proud of SO Jul 15 '15 at 12:28
  • @Agnew: Then why is it in the list you posted? – Kevin Jul 15 '15 at 12:31
  • @Kevin OK, maybe you're right. Seems I've always misinterpreted that bullet point. – Angew is no longer proud of SO Jul 15 '15 at 12:32
  • @Let_Me_Be, I never implied that it had an " obligation to ever update the data seen from one pointer to the other". My example is strictly constructed to avoid any assumptions about whether one pointer can see what happened through another pointer. – Aaron McDaid Jul 15 '15 at 12:52
  • @AaronMcDaid In your example, this line `buff = (uint32_t*) msg;` can be safely optimized out, as neither buff nor msg have changed. – Šimon Tóth Jul 15 '15 at 13:01
  • @Let_Me_Be, that line can't be removed - it *does* introduce a change. `buff` is NULL before that line, and non-NULL after that line. This is important, because it means the dereference of buff shortly after (`buff[0]`) is better-defined because `buff` is now non-null. – Aaron McDaid Jul 15 '15 at 13:41
  • 1
    In C++ it is unclear what the type is of malloc'd space. C has a rule that writing into malloc'd space "imprints" the type of the expression doing the writing, but C++ has no such rule. Also it is unclear in C++ whether copying objects into malloc'd space causes lifetime to begin for an object in the malloc'd space. (The standard does not say that it does) – M.M Jul 15 '15 at 13:43
  • @AaronMcDaid The `buff = NULL;` will of course go as well. What you have is: set buff to null; buff not used; set buff to original value. This can be completely optimized out. – Šimon Tóth Jul 15 '15 at 13:45
1

My understanding is that any attempt to write through one of these will render the other pointer essentially invalid.

As long as you don't access the object through the type-punned pointer, the other, "official" pointer is fine. Once you do, however, the behavior is undefined: the program may appear to work, do what you described, or do something entirely unexpected, including making the other pointer invalid. Compilers are free to treat UB however they like.

According to the standard, the only way to get the contents of buff into a valid Msg is to copy them with memcpy/memmove:

memcpy( (void*)msg, (const void*) buff, sizeof (*msg));
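As a rough sketch of how the copy can replace the cast in the question's scenario, assuming C: the `Msg` layout below is an assumption (the question never shows it), and `msg_to_words` is a hypothetical helper. The message and the word buffer are separate objects, so `memcpy` applies and no object is ever accessed through an incompatible pointer type.

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout; the original Msg is not shown in the question. */
typedef struct {
    uint32_t a;
    uint32_t b;
} Msg;

/* The copy below relies on Msg being exactly two words with no padding. */
_Static_assert(sizeof(Msg) == 2 * sizeof(uint32_t),
               "Msg is assumed to have no padding");

/* Hypothetical helper: copies the bytes of *msg into a separate
   word buffer instead of aliasing the message through uint32_t*. */
void msg_to_words(const Msg *msg, uint32_t words[2])
{
    memcpy(words, msg, sizeof *msg);
}
```

Each element of `words` can then be passed to `SendWord`. Compilers commonly turn a small fixed-size `memcpy` like this into plain register moves, so the copy need not cost anything at run time.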

Also, it is not only writing that triggers UB, but also reading, or any other access to the object:

If a program attempts to access the stored value of an object through an lvalue of other than one of the following types the behavior is undefined:

Some compilers, such as GCC, clang and ICC (probably also MSVC), allow "suspending" that rule (for example with the -fno-strict-aliasing option in GCC and clang), but that cannot be considered portable or standard behavior. Further techniques, and the code they generate, are thoroughly analyzed here.

Do you really need to break the strict-aliasing rule?

Most of the time, no, you do not need to. There are perfectly legal ways to solve the problem. In the above case, simply store a plain pointer within the struct and send each member in a determined format.
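One way to read "send each member in a determined format" is sketched below in C. The `Msg` layout, the recording stub for `SendWord`, and the helper `send_msg` are all assumptions made for illustration; the real `SendWord` is not shown in the question.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout; the original Msg is not shown in the question. */
typedef struct {
    uint32_t a;
    uint32_t b;
} Msg;

/* Hypothetical stand-in for SendWord: records words in a log
   instead of transmitting them. */
uint32_t sent_words[16];
size_t sent_count = 0;

void SendWord(uint32_t word)
{
    if (sent_count < sizeof sent_words / sizeof sent_words[0])
        sent_words[sent_count++] = word;
}

/* Sends the message member by member, in a fixed order.
   No pointer casts, so strict aliasing never comes into play. */
void send_msg(const Msg *msg)
{
    SendWord(msg->a);
    SendWord(msg->b);
}
```

Because every access goes through the member's own type, there is nothing for the aliasing rule to object to, and the wire order no longer depends on the struct's layout.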

edmz
  • @Angew Yeah sure (I indeed talk about accessing the object). At least we get that. – edmz Jul 15 '15 at 11:55
  • I don't get the first line of this question. You're saying that if I write through a type-punned pointer, then I get UB, *even if I never read nor write through either pointer again*? – Aaron McDaid Jul 15 '15 at 12:05
  • @Angew With valid I mean "safe" to access, namely that the pointed address is valid. To initialize that region you have to copy the contents of the `MsgBuf` object in such a way that aliasing is not a problem; `char*`, for example, as `memcpy` likely does. Does that help? – edmz Jul 15 '15 at 12:07
  • @AaronMcDaid Yeah, fixing. It's accessing the object through the type-punned pointer that causes UB. As Angew says, simply pointing to it does not cause any problems. – edmz Jul 15 '15 at 12:11
  • Sorry to be pedantic, but you can't open with "it's definitely UB", then quote an exception (`memcpy`) that makes it OK again. If `memcpy` is OK, then can we distil *why* it's OK? If I reimplemented my own `memcpy`, then it would be fine, yes? And `memcpy` has `void*`, not `char*`, in its signature, therefore the fact that `char*` is special isn't relevant to the justification of `memcpy`. (Sorry if my tone is wrong here, but it is important.) – Aaron McDaid Jul 15 '15 at 12:21
  • Sorry, mixed up `memcpy` on the contents and the pointer, ignore my previous comments. – Angew is no longer proud of SO Jul 15 '15 at 12:27
  • @AaronMcDaid Absolutely fine. [`memcpy`](http://en.cppreference.com/w/cpp/string/byte/memcpy) shall compute a byte copy of `n` bytes from `src` to `dst`. `void*` is used to represent an address and C allows implicit conversions to `void*` from other pointer types (unlike for `char*`); internally `src` and `dst` are converted properly. – edmz Jul 15 '15 at 12:37
  • The addresses pointed to by the parameters to `memcpy` should not overlap each other. But in the example we are using they do overlap each other, because one pointer is simply a cast from the other. Do you mean a different example where the two data structures are non-overlapping? – Aaron McDaid Jul 15 '15 at 12:41
  • @AaronMcDaid It depends when you call it. After you `malloc` the `uint32_t*`, they don't overlap and `memcpy` can be called safely; OTOH, when they do overlap, you need `memmove`, as quoted in the answer. – edmz Jul 15 '15 at 12:50
  • @black, "they don't overlap". We can't even ask if they overlap at this point - because `msg` doesn't even exist! I think you're discussing a situation where `buff` and `msg` have been `malloc`-ed separately? – Aaron McDaid Jul 15 '15 at 12:56
  • In other words, your answer as currently written implies that we can simply copy and paste `memcpy( (void*)buff, (const void*) msg, sizeof (msg));` into the original (problematic) code and that will fix it? – Aaron McDaid Jul 15 '15 at 12:57
  • @AaronMcDaid Yes (look at the edit, it's the opposite). Whether `msg` will contain meaningful values depends though on the actual data into `buff`. – edmz Jul 15 '15 at 13:22
  • @black, I've now posted an answer of my own, which uses `memmove` in an aggressive manner, in order to fix the issue. Any feedback appreciated. – Aaron McDaid Jul 20 '15 at 16:14
  • And, returning to your answer, @black. I have downvoted because of the use of `memcpy`. That single line can't be inserted in my program - because the memory overlaps and `memcpy` can't be used with overlapping memory. Anyway, I guess you meant `sizeof(*msg)` instead of `sizeof(msg)`. – Aaron McDaid Jul 20 '15 at 21:27
  • @AaronMcDaid Feel free to do that; but that's not a reason for downvoting imho. If the memory overlaps use *memmove* which is just for that. – edmz Jul 20 '15 at 21:42
1

The rule is:

"Unless the pointers are of compatible types, you cannot have two pointers pointing to the same memory."

Here is a simpler example of an endless cycle:

1: int *some_buff = malloc(sizeof(whatever));
2: memset(some_buff,0,sizeof(whatever));
3: while (some_buff[0] == 0)
4: {
5:     whatever *manipulator = (whatever*)some_buff; 
6:     manipulate(manipulator);
7: }

This is essentially how the compiler will/can approach this code:

The test some_buff[0] == 0 can be optimized out, because there is no valid way for some_buff[0] to change. The memory is accessed through manipulator, but manipulator is not of a compatible type; therefore, according to the strict aliasing rule, the value of some_buff[0] cannot change.

If you want an even simpler example:

int *some_buff = malloc(sizeof(whatever));
memset(some_buff,0,sizeof(whatever));
whatever *manipulator = (whatever*)some_buff;
manipulate(manipulator);
printf("%d\n",some_buff[0]);

It is perfectly OK for this code to always print zero and it doesn't matter what manipulate does.
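The assumption the compiler relies on can be shown with a small, well-defined C sketch. The function `store_both` is hypothetical and not taken from the answer; it is only meant to show where the "cannot change" reasoning comes from.

```c
/* Hypothetical function: because int and float are incompatible
   types, the compiler may assume that i and f point to different
   objects. */
int store_both(int *i, float *f)
{
    *i = 1;
    *f = 0.0f;   /* assumed not to modify *i */
    return *i;   /* may be compiled as "return 1" with no reload */
}
```

Called with two distinct objects, the function is well defined and returns 1. Called with both pointers referring to the same memory, the behavior is undefined, and returning 1 regardless of what was stored through `f` is one plausible outcome, for the same reason the example above may always print zero.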

Šimon Tóth
  • 3
    To be clear: Two incompatible pointers pointing to the same memory is OK. It depends what you do with those pointers as to whether you cause UB. But you can use that as a rule of thumb to make sure you steer well clear of any possible trouble. – M.M Jul 15 '15 at 13:26
  • "You cannot have two pointers pointing to the same memory." Yes you can. – edmz Jul 15 '15 at 13:35
  • @black And this differs from the previous comment how? – Šimon Tóth Jul 15 '15 at 13:37
  • @Let_Me_Be Not much, but you haven't corrected that in your answer yet, which you may want to do. – edmz Jul 15 '15 at 13:41
  • @black The question is about a rule of thumb. Not a definition, or specification. This a good rule to avoid any issues, even though it will forbid valid cases. – Šimon Tóth Jul 15 '15 at 13:43
  • I think the loop is confusing things. Let's unroll the loop. The lines of code then (counting as in this answer currently) are 1,2,3,5,6,3,5,6,3,5,6,3,5,6,... . When *exactly* does this become UB? If the program was just 1,2,3, would it be defined? What about 1,2,3,5? 1,2,3,5,6? 1,2,3,5,6,3? 1,2,3,5,6,3,5? My current opinion is that 1,2,3,5,6 is DB (defined behaviour), but 1,2,3,5,6,3 is UB. If you agree, then I have an important follow-up question. Thanks! – Aaron McDaid Jul 15 '15 at 21:32
  • @AaronMcDaid If you unroll the loop you will end up with 1,2,5,6,5,6,5,6,5,6,5,6.... The test will never be evaluated. Discussing whether 1,2,3,5,6 or 1,2,3,5,6,3 is UB is meaningless, as that will not be how the code is interpreted. – Šimon Tóth Jul 15 '15 at 21:52
  • @AaronMcDaid Btw. I think the core misconception that you have trouble grasping is that when it comes to strict aliasing, the undefined behavior isn't something that happens at runtime, it manifests at compile time. – Šimon Tóth Jul 15 '15 at 22:29
  • @Let_Me_Be, I do not have that misconception. My question is: if we took lines 1,2,3,5,6 and put them in a program *on their own* and compiled *that short program*, would it be UB or not? – Aaron McDaid Jul 16 '15 at 06:57
  • @AaronMcDaid Yes, but it most likely wouldn't manifest. If the condition would be `some_buff[0] != 0` it would manifest as 1,2,3,5,6 would become 1,2,3. – Šimon Tóth Jul 16 '15 at 08:16
  • Your edit (the "even simpler example") is perfect. What if we remove the last line ( `printf("%d\n",some_buff[0]);`) and replace it with `int *another_buff = (int*) manipulator; printf("%d\n",another_buff[0]);`? If this is DB (defined behaviour), then this is *very* interesting to me. (And thanks for your patience!) – Aaron McDaid Jul 16 '15 at 08:40
  • @AaronMcDaid That is still undefined behavior, although the actual behavior of the compilers should be relatively consistent. But what can happen for example is: `memset(); int value = some_buff[0]; .... int *another_buff = (int*)manipulator; printf("%d\n",another_buff[0]);` in this case I can definitely see the compiler using the cached value from `value`. – Šimon Tóth Jul 16 '15 at 08:46
  • In your second example, I agree that `value` will be cached. But that doesn't change my understanding of my `int *another_buff` example. (...to be continued...) – Aaron McDaid Jul 16 '15 at 09:26
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/83411/discussion-between-let-me-be-and-aaron-mcdaid). – Šimon Tóth Jul 16 '15 at 09:28
  • Maybe we should back up a little. The memory is `memset` to zero. Then there is `whatever *manipulator = (whatever*)some_buff;` and `manipulate(manipulator);`. Does the `manipulate` function 'know' that the data has been zero-ed out earlier? If so, how is it entitled to know this. If not, then why do we zero it? – Aaron McDaid Jul 16 '15 at 09:29
0

Your suggestion doesn't help at all, because it doesn't matter what value you assign your pointer variable after using it. You do access the same memory location through pointers of incompatible types.

For C (not for C++), there is at least one safe thing to do other than avoiding type punning: You can safely cast pointers to structs, given that the one struct type just adds fields to the end of the other. This even works when the longer struct just contains the shorter as its first member: A pointer to a struct points to its first member. So e.g. these are safe in C:

typedef struct
{
    int id;
    const char *name;
} base_t;

typedef struct
{
    base_t base;
    long foo;
} derived_t;

derived_t *d = malloc(sizeof(derived_t));
base_t *b = (base_t *)d;
int *i = (int *)d;
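A short C sketch of the conversions above in use. The helper names `base_id` and `first_int` are hypothetical; the struct definitions repeat the ones in this answer. It relies on the C guarantee that a pointer to a struct, suitably converted, points to its first member.

```c
typedef struct
{
    int id;
    const char *name;
} base_t;

typedef struct
{
    base_t base;   /* first member, so it shares the struct's address */
    long foo;
} derived_t;

/* Hypothetical helper: views a derived_t through its first member. */
int base_id(derived_t *d)
{
    base_t *b = (base_t *)d;   /* same as &d->base */
    return b->id;
}

/* Hypothetical helper: goes one level further, to the first member
   of the first member. */
int first_int(derived_t *d)
{
    int *i = (int *)d;         /* same as &d->base.id */
    return *i;
}
```

Both functions read an `int` object through an `int` lvalue, which is why no aliasing violation arises here.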
  • I am not claiming that setting the pointer to NULL actually has any meaning to the compiler. The purpose of that is to remind me not to use that pointer any more. And because it forces me to stop using that pointer, then I am not "access[ing] the same memory location through pointers of incompatible types." Or, more precisely, I'm not accessing the same location through two different pointers *at the same time*. There is a clear 'handover', so to speak, at one point in the code (the cast). – Aaron McDaid Jul 15 '15 at 21:23
  • ... it's entirely possible over the lifetime of a program that, due to `malloc` and `free`, you have two different pointers of two different types pointing at the same location. And that's obviously not a problem. Therefore, there can only be a problem if the two pointers exist at the same time. I think. – Aaron McDaid Jul 15 '15 at 21:26