While reading another question about aliasing (What is the strict aliasing rule?) and its top answer, I realised I still wasn't entirely satisfied, even though I think I understood everything there.
(This question is now tagged as C and C++. If your answer refers to just one of these, please clarify which.)
So I want to understand how to do some development in this area, casting pointers in aggressive ways, but with a simple conservative rule that ensures I don't introduce UB. I have a proposal here for such a rule.
(Update: of course, we could just avoid all type punning. But that's not very educational. Unless, of course, there are literally zero well-defined exceptions beyond the union exception.)
Update 2: I understand now why the method proposed in this question is not correct. However, it is still interesting to know whether a simple, safe, alternative exists. As of now, there is at least one answer that proposes such a solution.
This is the original example:
int main()
{
    // Get a 32-bit buffer from the system
    uint32_t* buff = malloc(sizeof(Msg));
    // Alias that buffer through message
    Msg* msg = (Msg*)(buff);
    // Send a bunch of messages
    for (int i = 0; i < 10; ++i)
    {
        msg->a = i;
        msg->b = i + 1;
        SendWord(buff[0]);
        SendWord(buff[1]);
    }
}
The important line is:
Msg* msg = (Msg*)(buff);
which means there are now two pointers (of different types) pointing to the same data. My understanding is that any attempt to write through one of these will render the other pointer essentially invalid. (By 'invalid' I mean that we can ignore it safely, but that reading/writing through an invalid pointer is UB.)
Msg* msg = (Msg*)(buff);
msg->a = 5;          // writing through one of the two pointers
SendWord(buff[0]);   // renders the other pointer, buff, invalid
Therefore, my proposed rule is that once you create the second pointer (i.e. create msg), you should immediately and permanently 'retire' the other pointer.
What better way to retire a pointer than to set it to NULL:
Msg* msg = (Msg*)(buff);
buff = NULL; // 'retire' buff. now just one pointer
msg->a = 5;
Now, the last line, assigning to msg->a, can't invalidate any other pointers because, of course, there are none.
Next, of course, we have to find a way to call SendWord(buff[1]). This can't be done immediately because buff has been retired and is NULL. My proposal now is to cast back again.
Msg* msg = (Msg*)(buff);
buff = NULL; // 'retire' buff. now just one pointer
msg->a = 5;
buff = (uint32_t*)(msg); // cast back again
msg = NULL; // ... and now retire msg
SendWord(buff[1] );
In summary, every time you cast a pointer between two 'incompatible' types (I'm not sure how to define 'incompatible'), you should immediately 'retire' the old pointer. Set it to NULL explicitly if that helps you to enforce the rule.
Is this conservative enough?
Perhaps this is too conservative and has other problems, but I first want to know if this is conservative enough to avoid introducing UB via offending strict aliasing.
Finally, to recap, here is the original code, modified to use this rule:
int main()
{
    // Get a 32-bit buffer from the system
    uint32_t* buff = malloc(sizeof(Msg));
    // Send a bunch of messages
    for (int i = 0; i < 10; ++i)
    {   // here, buff is 'valid'
        Msg* msg = (Msg*)(buff);
        buff = NULL;
        // here, only msg is 'valid', as buff has been retired
        msg->a = i;
        msg->b = i + 1;
        buff = (uint32_t*) msg; // switch back to buff being 'valid'
        msg = NULL;             // ... by retiring msg
        SendWord(buff[0]);
        SendWord(buff[1]);
        // now, buff is valid again and we can loop around again
    }
}
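For comparison, here is a minimal sketch (C) of one commonly cited alternative that sidesteps the question of which pointer is 'valid': keep a real Msg object and a real uint32_t array, and copy the bytes between them with memcpy. I am not claiming this is the solution the answers propose. The Msg layout (two 32-bit fields) is an assumption carried over from the linked question, and SendWord here is a hypothetical stand-in that just records the words it is given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: two 32-bit fields, as in the linked question. */
typedef struct {
    uint32_t a;
    uint32_t b;
} Msg;

/* The copy below relies on Msg being exactly two words with no padding. */
_Static_assert(sizeof(Msg) == 2 * sizeof(uint32_t), "Msg must be two words");

/* Hypothetical stand-in for SendWord: records each word for inspection. */
#define MAX_SENT 20
static uint32_t sent[MAX_SENT];
static size_t sent_count = 0;

static void SendWord(uint32_t word)
{
    assert(sent_count < MAX_SENT);
    sent[sent_count++] = word;
}

/* Send ten messages without ever accessing a Msg through a uint32_t*. */
static void send_messages(void)
{
    for (int i = 0; i < 10; ++i) {
        Msg msg;
        uint32_t words[2];

        msg.a = (uint32_t)i;
        msg.b = (uint32_t)i + 1;

        /* Copy the object representation; no pointer of one type is
           ever dereferenced to reach an object of the other type. */
        memcpy(words, &msg, sizeof msg);

        SendWord(words[0]);
        SendWord(words[1]);
    }
}
```

Calling send_messages() once records twenty words. Each object is only ever accessed through its own type, so there is no pointer to retire, and compilers can often optimise a small fixed-size memcpy like this into plain register moves.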