We all know that type punning like this
union U {float a; int b;};
U u;
std::memset(&u, 0, sizeof u);
u.a = 1.0f;
std::cout << u.b;
is undefined behavior in C++.
It is undefined because after the u.a = 1.0f; assignment, .a becomes the active field and .b becomes an inactive field, and it's undefined behavior to read from an inactive field. We all know this.
Now consider the following code:
union U {float a; int b;};
U u;
std::memset(&u, 0, sizeof u);
u.a = 1.0f;
char *ptr = new char[std::max(sizeof (int),sizeof (float))];
std::memcpy(ptr, &u.a, sizeof (float));
std::memcpy(&u.b, ptr, sizeof (int));
std::cout << u.b;
And now it becomes well-defined, because this kind of type punning is allowed.
Also, as you can see, the memory of u remains the same after the memcpy() calls.
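For reference, the memcpy approach from the second example can be isolated into a small helper with no union involved at all. This is only a sketch: float_bits is a made-up name that is not part of the code above, and the result mentioned below assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of the original examples): returns the
// object representation of a float as a same-sized unsigned integer.
std::uint32_t float_bits(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits = 0;
    // Byte-wise copy of the representation; no inactive union field is read.
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On a platform with IEEE 754 single-precision floats, float_bits(1.0f) gives 0x3F800000, which is the value I would expect the union-based examples to print if they are well-defined.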
Now let's add threads and the volatile keyword.
union U {float a; int b;};
volatile U u;
std::memset(&u, 0, sizeof u);
u.a = 1.0f;
std::thread th([&]
{
char *ptr = new char[sizeof u];
std::memcpy(ptr, &u.a, sizeof u);
std::memcpy(&u.b, ptr, sizeof u);
});
th.join();
std::cout << u.b;
The logic remains the same, but now we have a second thread. Because of the volatile keyword, the code remains well-defined.
In real code this second thread could be implemented through any crappy threading library, and the compiler may be unaware of that second thread. But because of the volatile keyword it's still well-defined.
But what if there are no other threads?
union U {float a; int b;};
volatile U u;
std::memset(&u, 0, sizeof u);
u.a = 1.0f;
std::cout << u.b;
There are no other threads. But the compiler does not know that there are no other threads!
From the compiler's point of view, nothing has changed! And if the third example was well-defined, the last one must be well-defined too!
And we don't need that second thread, because it does not change the memory of u anyway.
If volatile is used, the compiler assumes that u can be modified silently at any point. After such a modification, any field can become active.
And so the compiler can never track which field of a volatile union is active. It can't assume that a field remains active after it was assigned to (and that the other fields remain inactive), even if nothing really modifies that union.
And so, in the last two examples the compiler shall give me the exact bit representation of 1.0f reinterpreted as an int.
The questions are: Is my reasoning correct? Are the 3rd and 4th examples really well-defined? What does the standard say about it?