4

We all know that type punning like this

union U {float a; int b;};

U u;
std::memset(&u, 0, sizeof u);
u.a = 1.0f;
std::cout << u.b;

is undefined behavior in C++.

It is undefined because after the u.a = 1.0f; assignment, .a becomes the active field and .b becomes an inactive field, and it's undefined behavior to read from an inactive field. We all know this.


Now, consider the following code
union U {float a; int b;};

U u;
std::memset(&u, 0, sizeof u);
u.a = 1.0f;

char *ptr = new char[std::max(sizeof (int), sizeof (float))];
std::memcpy(ptr, &u.a, sizeof (float));
std::memcpy(&u.b, ptr, sizeof (int));
delete[] ptr;

std::cout << u.b;

And now it becomes well-defined, because this kind of type punning is allowed. Also, as you can see, u's memory remains the same after the memcpy() calls.


Now let's add threads and the volatile keyword.
union U {float a; int b;};

volatile U u;
std::memset(const_cast<U*>(&u), 0, sizeof u); // cast away volatile for memset
u.a = 1.0f;

std::thread th([&]
{
    char *ptr = new char[sizeof u];
    std::memcpy(ptr, const_cast<float*>(&u.a), sizeof u); // cast away volatile for memcpy
    std::memcpy(const_cast<int*>(&u.b), ptr, sizeof u);
    delete[] ptr;
});
th.join();

std::cout << u.b;

The logic remains the same; we just have a second thread. Because of the volatile keyword, the code remains well-defined.

In real code this second thread could be implemented through any crappy threading library, and the compiler can be unaware of that second thread. But because of the volatile keyword it's still well-defined.


But what if there are no other threads?
union U {float a; int b;};

volatile U u;
std::memset(const_cast<U*>(&u), 0, sizeof u); // cast away volatile for memset
u.a = 1.0f;
std::cout << u.b;

There are no other threads. But the compiler does not know that there are no other threads!

From the compiler's point of view, nothing has changed! And if the third example was well-defined, the last one must be well-defined too!

And we don't need that second thread anyway, because it does not change u's memory.


If volatile is used, the compiler assumes that u can be modified silently at any point. At such a modification, any field can become active.

And so, the compiler can never track which field of a volatile union is active. It can't assume that a field remains active after it was assigned to (and that the other fields remain inactive), even if nothing actually modifies that union.

And so, in the last two examples the compiler shall give me the exact bit representation of 1.0f reinterpreted as an int.


The questions are: Is my reasoning correct? Are the 3rd and 4th examples really well-defined? What does the standard say about it?
timrau
HolyBlackCat
  • `volatile` has nothing to do with threads. See [Why does volatile exist?](http://stackoverflow.com/questions/72552/why-does-volatile-exist?rq=1) – Bo Persson Oct 17 '15 at 11:38
  • First of all, in your fourth example you read a variable without synchronization, thus the compiler can assume no other thread writes to it. `volatile` only means that every operation on the variable is an observable side effect; that is unrelated to multithreaded execution. So that example is definitely wrong. And second: Why are you doing such things in the first place? – Baum mit Augen Oct 17 '15 at 11:39
  • @BaummitAugen 1. I do such things because it looks like a hack for easy type punning in C++. 2. Do you mean that when the `compiler` sees `th.join()` it assumes that there is a possibility that the volatile variable was modified? But what if I use any nonstandard threading library, for example SDL threads? The compiler has no knowledge that `SDL_WaitThread()` joins a thread. 3. `volatile only means that every operation on the variable is an observable side effect` Can you explain this? I don't understand what you mean. – HolyBlackCat Oct 17 '15 at 11:48
  • @HolyBlackCat It does not need to know that because it knows that you passed the variable to some function by reference, or its address, else the new thread could not modify it. So it can deduce that it might have changed from that. – Baum mit Augen Oct 17 '15 at 11:51
  • Besides, the second example seems wrong too as you are reading uninitialized bytes if `sizeof(float) < sizeof(int)`. – Baum mit Augen Oct 17 '15 at 11:53
  • And regarding the `volatile`: The compiler may not optimize away any read nor write to a `volatile` nor reorder them. – Baum mit Augen Oct 17 '15 at 11:57
  • The compiler *can* know about threads. If there were other threads, you would use things like `std::mutex` or `std::atomic`. Or call some functions from your secret threading library. Now you only use `volatile` which could possibly mean accessing memory mapped hardware. But the compiler knows that `u` isn't memory mapped, because it has allocated the variable itself. – Bo Persson Oct 17 '15 at 11:58
  • Nice try with the edit, but your `char` array would still be partially uninitialized. I hope you realize that you are on shaky ground here. – Baum mit Augen Oct 17 '15 at 11:59
  • @BaummitAugen I've edited my post and added `memset()` to get rid of uninitialized bytes. `And regarding the volatile: The compiler may not optimize away any read nor write to a volatile nor reorder them.` Does that also mean that compiler assumes that `volatile` variable can be silently modified at any point? If so, why would 4th example be undefined? /// UPD: Oops. Another edit. – HolyBlackCat Oct 17 '15 at 12:00
  • As I said above, you write and read from a variable without any synchronization. If another thread would write to or read it, you would have a race, and races are UB, and the compiler may assume that UB does not happen. Making a variable `volatile` does not make it atomic. Btw, in the third case, you do not need the volatile. And your second example still lacks initialization. – Baum mit Augen Oct 17 '15 at 12:05
  • @BaummitAugen Ohhh, now I understand what you're talking about. @BoPersson Does that mean that `volatile` only makes sense when applied to a pointer to memory that was not allocated by the compiler? – HolyBlackCat Oct 17 '15 at 12:06
  • @HolyBlackCat: It is *possible* that `volatile` means something on compiler-allocated memory. Some compilers treat reads and writes to any known-to-be volatile storage as having acquire and release semantics. The point is that those compilers are not *required* to do so by the standard. And even if they do, the semantics imposed by a particular compiler may be subtly different from your expectation. For example, even with acquire-release semantics it is not necessarily the case that all threads will observe a consistent ordering of reads and writes. Read your compiler documentation. – Eric Lippert Oct 17 '15 at 17:03

2 Answers

9

In real code this second thread could be implemented through any crappy threading library, and the compiler can be unaware of that second thread. But because of the volatile keyword it's still well-defined.

That statement is false, and so the rest of the logic upon which you base your conclusion is unsound.

Suppose you have code like this:

int* currentBuf = bufferStart;
while(currentBuf < bufferEnd)
{
    *currentBuf = foobar;    
    currentBuf++;
}

If foobar is not volatile then a compiler is permitted to reason as follows: "I know that foobar is never aliased by currentBuf and therefore does not change within the loop, therefore I may optimize the code as"

int* currentBuf = bufferStart;
int temp = foobar;
while(currentBuf < bufferEnd)
{
    *currentBuf = temp;    
    currentBuf++;
}

If foobar is volatile then this and many other code generation optimizations are disabled. Notice I said code generation. The CPU is entirely within its rights however to move reads and writes around to its heart's content, provided that the memory model of the CPU is not violated.

In particular, the compiler is not required to force the CPU to go back to main memory on every read and write of foobar. All it is required to do is to eschew certain optimizations. (This is not strictly true; the compiler is also obliged to ensure that certain properties involving long jumps are preserved, and a few other minor details that have nothing to do with threading.) If there are two threads, and each is on a different processor, and each processor has a different cache, volatile introduces no requirement that the caches be made coherent if they both contain a copy of the memory for foobar.

Some compilers may choose to implement those semantics for your convenience, but they are not required to do so; consult your compiler documentation.

I note that C# and Java do require acquire and release semantics on volatiles, but those requirements can be surprisingly weak. In particular, the x86 will not reorder two volatile writes or two volatile reads, but is permitted to reorder a volatile read of one variable before a volatile write of another, and in fact the x86 processor can do so in rare situations. (See http://blog.coverity.com/2014/03/26/reordering-optimizations/ for a puzzle written in C# that illustrates how low-lock code can be wrong even if everything is volatile and has acquire-release semantics.)

The moral is: even if your compiler is helpful and does impose additional semantics on volatile variables like C# or Java, it still may be the case that there is no consistently observed sequence of reads and writes across all threads; many memory models do not impose this requirement. This can then cause weird runtime behaviour. Again, consult your compiler documentation if you want to know what volatile means for you.

Eric Lippert
1

No - your reasoning is wrong. The volatile part is a general misunderstanding - volatile does not work the way you state.

The union part is wrong as well. Read this: Accessing inactive union member and undefined behavior?

With C++11 you can only expect correct/well-defined behaviour when the last write corresponds to the next read.

Community
Support Ukraine