0

I've been in a debate about a corner case regarding local variables in a multithread environment.

The question is regarding programs formed like:

std::mutex mut;

int main()
{
    std::size_t i = 0;
    doSomethingWhichMaySpawnAThreadAndUseTheMutex();
    mut.lock();
    i += 1;        // can this be reordered?
    mut.unlock();
    return i;
}

The question revolves around whether the i += 1 can be reordered to occur above the mutex locking.

The obvious parts are that mut.lock() happens-before i += 1, so if any other thread might be able to observe the value of i, the compiler is obliged to not have incremented it. From 3.9.2.3 of the C++ spec, "If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained."" This means that if I used any means to get a pointer to i, I can expect to see the right value.

However, the spec does state that the compiler may use the "as-if" rule to not give an object a memory address (footnote 4 on section 1.8.6). For example, i could be stored in a register, which has no memory address. In such a case, there would be no memory address to point to, so the compiler could prove that no other thread could access i.

The question I am interested in is what if the compiler does not do this "as-if" optimization, and does indeed store the object. Is the compiler permitted to store i, but do reordering as-if i was not actually stored? From an implementation perspective, this would mean that i might be stored on a stack, and thus it would be possible to have a pointer point at it, but have the compiler assume nobody can see i, and do the re-order?

curiousguy
  • 8,038
  • 2
  • 40
  • 58
Cort Ammon
  • 10,221
  • 31
  • 45
  • There is a [huge change](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0137r1.html) for pointer in C++17. The sentence you referenced is now deleted. Now the compiler can prove other threads cannot access `i` legally. – xskxzr Jun 07 '18 at 14:58
  • @xskxzr Wow, that is a huge change! – Cort Ammon Jun 07 '18 at 17:11
  • **There is no such thing as a specific rule called the "as-if rule".** – curiousguy Jun 08 '18 at 01:51
  • 2
    @curiousguy Can you elaborate? [This](http://en.cppreference.com/w/cpp/language/as_if) and [this](https://stackoverflow.com/questions/15718262/what-exactly-is-the-as-if-rule) source lead me to believe there is an "as-if" rule. At the very least, the term is common enough to be meaningful in discussing the language. – François Andrieux Jun 08 '18 at 14:11
  • @FrançoisAndrieux Well applying *only* "transformations that do not change the observable behavior" just means: *correctly* implementing the standard. Do you write `if(boolean)` or `if(boolean==true)`? Do you prefer `if(a==b)` or `if((a==b) == true)`? Saying there is a rule implies that there is a parallel universe in the multiverse where there is no such rule, and I wonder how it looks like. – curiousguy Jun 08 '18 at 17:37
  • @curiousguy Saying the as-if rule is just there for implicit conversion is not at all true. That's not what it does, and it's not what allows for these conversions. The as-if rule means drastically different legal assemblies that can be generated from the same source code. I'm not sure I understand where you're trying to go with your comment. – François Andrieux Jun 08 '18 at 18:15
  • @FrançoisAndrieux In my comment I assumed that boolean was actually a boolean. There is no implicit conversion. I could have used an English example: do you say that "water is liquid under standard conditions" or that "saying that 'water is liquid under standard conditions' is true"? They mean exactly the same thing. – curiousguy Jun 08 '18 at 18:26
  • @curiousguy Right, I misspoke. But I don't see what this has to do with the as-if rule. – François Andrieux Jun 08 '18 at 18:29
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/172780/discussion-between-curiousguy-and-francois-andrieux). – curiousguy Jun 08 '18 at 18:30
  • "_might be able_" Instead of writing that code might be able, provide explicit code that you believe is able to do such and such. – curiousguy Jun 08 '18 at 20:01
  • @xskxzr "_There is a huge change for pointer in C++17._" that changes exactly nothing (there were always restrictions on how so called pointer "values" could be used - there is no such thing as a pointer value) and isn't even applicable to that question in any way shape or form. – curiousguy Jun 08 '18 at 20:04
  • @curiousguy Where does the standard forbids codes like `int *p = reinterpret(42); int i = *p; ` (assume there happens to be an object with automatic storage duration located at the address 42) before the change? – xskxzr Jun 09 '18 at 04:12
  • @xskxzr OK some objects that are documented as residing at a given address can be addressed w/o the address of operator. No automatic object is such. – curiousguy Jun 09 '18 at 18:14

3 Answers3

2

The compiler is allowed to perform optimizations as long as the observable results of program execution legitimately could have been obtained ("as-if") without those optimizations.[1] So this question uses "as-if" in a misleading manner, if not actually asking a backwards question:

Is the compiler permitted to store i, but do reordering as-if i was not actually stored?

This asks if the compiler is permitted to do things as long as the results of program execution could have been obtained with an optimization. That is not the question to ask. The question should use non-optimized behavior as the reference. So something more like: "Is the compiler permitted to re-order the statements?" The answer is yes, as long as the observable results do not change. Nothing external to this particular function is told how to access i, so the compiler should be allowed to implement the increment anywhere between the surrounding uses of it (specifically: its definition and the return statement).

That being said, what I would expect a compiler to do in this case is neither give i a memory address nor treat it as a register variable. I would expect the compiler to treat it like a constant, effectively changing your function to:

int main()
{
    doSomethingWhichMaySpawnAThreadAndUseTheMutex();
    mut.lock();
    mut.unlock();
    return 1;
}

This is allowed as long as you have no way to detect that it has been done (short of examining the machine code directly).

Note:
[1] The use of "could have been" is an acknowledgement that there are portions of the C++ specification that use the word "unspecified". These portions allow compilers to make choices that (when dealing with non-robust code) could change observable behavior. That is, there can be a set of allowed behaviors, rather than a single allowed behavior. As long as the results remain in this set, an optimization is allowed.

JaMiT
  • 14,422
  • 4
  • 15
  • 31
  • "_match the optimized version_" is not a thing (in general). C/C++ are not deterministic. A run of a program can produce many outputs (usually). – curiousguy Jun 08 '18 at 01:56
  • 2
    @curiousguy C and C++ (there is no C/C++) are deterministic to the extent that the standard describes them. For a given input (including such things as the source of entropy used to seed pseudo-random generators as input, network packets, the current time, etc.) a well formed program will be deterministic in terms of observable behavior. If C++ was non-deterministic, all of this would be a huge waste of time. – François Andrieux Jun 08 '18 at 14:15
  • @FrançoisAndrieux "_deterministic to the extent that the standard describes them_" Programs in almost all (except Prolog, Haskell) practically used languages can show non determinism; but well written programs are either fully deterministic (can have exactly one behavior for each inputs) or the non determinism is limited to unimportant behaviors (like logging), or the program is by design non deterministic (like Monte Carlo, but then you could say that's parameterized deterministic behavior). – curiousguy Jun 08 '18 at 19:25
  • "`mut.lock(); mut.unlock();`" There is nothing after that empty locking that can synchronize with another thread, and then `exit(1)` will destroy global variables, so it's clear that no thread can use `mut` after. It means that another threads cannot use `mut` at all, so nobody else is synchronizing on `mut`, so `mut` can be ignored by the compiler. – curiousguy Jun 08 '18 at 19:37
  • "_as long as the observable results do not change_" In a non deterministic language, that's a very misleading statement. – curiousguy Jun 08 '18 at 20:00
  • This is not the place to debate "deterministic", so I will simply state that C++ is deterministic as I understand the word and leave it at that. – JaMiT Jun 13 '18 at 23:14
  • @JaMiT "Certain other aspects and operations of the abstract machine are described in this document as unspecified (for example, order of evaluation of arguments in a function call ([expr.call])). Where possible, this document defines a **set of allowable behaviors**. These define the **nondeterministic aspects** of the abstract machine. An instance of the abstract machine can thus have more than one possible execution for a given program and a given input." http://eel.is/c++draft/intro.compliance#intro.abstract-3 – curiousguy Jun 15 '18 at 18:01
  • @FrançoisAndrieux Are you saying that C++ and C are unrelated languages with no useful overlap? – curiousguy Jun 15 '18 at 18:10
  • @curiousguy https://en.wikipedia.org/wiki/Straw_man That's not at all what I said. – François Andrieux Jun 15 '18 at 18:12
  • @FrançoisAndrieux How should the useful overlap of C and C++ be called? – curiousguy Jun 15 '18 at 18:21
  • @FrançoisAndrieux "C and C++" is commonly understood as "the subset of valid C programs and valid C++ programs whose semantic are somewhat 'equivalent' or covered by isomorphic language rules"? – curiousguy Jun 15 '18 at 18:27
  • @curiousguy That's awfully specific, and no. "C and C++" would refer to a pair of languages. But if you are leading on that that's what "C/C++" should mean, yeah maybe it should. I don't know. What I do know is how it's usually (incorrectly) used to refer to some fictitious unified language, as it's used by people who don't know anything about programming, such as managers discussing software development related positions. Maybe my comment would have been more clear if I had written "there is no C/C++ language" rather than "there is no C/C++". – François Andrieux Jun 15 '18 at 18:35
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/173238/discussion-between-francois-andrieux-and-curiousguy). – François Andrieux Jun 15 '18 at 18:35
  • @Jamit _C++ is deterministic..._ - not in my code :) – Paul Sanders Jun 15 '18 at 19:22
  • Hmm... I wonder if all this nonsense is because I miswrote a part of my answer. That's one problem with language lawyers -- they can get so hung up on proving themselves right in a literal sense that they fail to put any effort into understanding concepts. In any event, I'll edit my answer to better express what I meant. – JaMiT Jun 15 '18 at 20:49
  • 1
    @JaMiT: I would suggest as a refinement the phraseology "the observable results of program execution *are among those that could legitimately be produced by* a non-optimized executable". If a non-optimized executable would be allowed to produce any of four different outputs, chosen in Unspecified fashion, an optimized executable would be allowed to produce any of those outputs, chosen in an Unspecified fashion that may or may not be the same as the unoptimized version. – supercat Jun 15 '18 at 22:59
  • 1
    @supercat thanks. That is the direction I was headed in when I changed "*the* non-optimized version" to "*a* non-optimized version". I added your suggestion as a footnote (because I don't want the intro to be too verbose), then I thought a bit and did some further revisions. Hopefully it's looking better now. – JaMiT Jun 16 '18 at 03:21
  • @JaMiT: Nice change. I do think there had historically been significant value in recognizing the range of behaviors that could be produced by a non-optimizing compiler for a particular target *that didn't go out of its way to do something weird*. While there is some ambiguity as to what would be considered "going out of its way", in many cases there's a limit to how bizarre a behavior could be and still plausibly satisfy such a definition. In cases where any remotely-plausible behavior would satisfy requirements, code that can exploit that may be more efficient than code that must... – supercat Jun 16 '18 at 17:34
  • ...avoid UB at all costs. Unfortunately, even though on most platforms there are some pretty clear and useful limits to "natural" behaviors, and even though the Standard acknowledges the concept of UB behaving "in a documented manner characteristic of the environment", the notion that it's useful to let programmers exploit that seems to have become less fashionable than "clever" optimizations. – supercat Jun 16 '18 at 17:35
1

I find this question very muddled. With the code as posted, the compiler is obviously aware that the only use of i is in the return statement, so i will be optimised away, end of story. The mutex doesn't come into it.

But as soon as you take the address of i - and give it away to somebody else - the game changes. Now the compiler has to put a real variable on the stack and manipulate it only between mutex.lock() and mutex.unlock(). Doing anything else would alter the semantics of your program. The mutex also gives you a memory fence.

You can see this clearly at Godbolt.

Edit: I have fixed a silly bug in that code that rather obscured the point I was trying to make, sorry about that.

Paul Sanders
  • 24,133
  • 4
  • 26
  • 48
  • "_give it away to somebody else_" and lost track of who got a copy. Some uses of pointers can be followed and bound (the compiler knows how many copies there are), but as soon as someone gets out of the view with the copy it's game over: everyone else might have got a copy of the copy. **It's like secrets**: you can try to keep track of all copies of a classified document, but as someone you can't trust gets a copy, everyone else might have got it. There is no such thing as a classified information/pointer that can't be copied. – curiousguy Jun 08 '18 at 18:30
  • "_Doing anything else would alter the semantics of your program_" This is incorrect in that trivial example code. There is no well defined way for another thread to access `i`. – curiousguy Jun 08 '18 at 19:40
  • "_the game changes_" Even ignoring the essential fact that the function ends with `exit`, **nothing changes**. No other thread can access a local variable in a function of another thread that doesn't synchronize. – curiousguy Jun 08 '18 at 19:57
  • @curiousguy You voted me down. You don't understand my post and you voted me down. That really takes the biscuit. Did you even look at the code generated by the compiler at Godbolt? – Paul Sanders Jun 08 '18 at 21:13
  • What don't you understand in: "'as soon as you take the address of i - and give it away to somebody else - the game changes' is WRONG"? – curiousguy Jun 08 '18 at 21:14
  • @curiousguy My, that was fast. Did you? Look at the code? – Paul Sanders Jun 08 '18 at 21:16
  • Please provide the exact code such as "the game changes". – curiousguy Jun 08 '18 at 21:18
  • @curiousguy I did, it's on Godbolt, there's a link in my answer. If you take a look at the machine code, you will see (at line 83) that because I passed its address to another function, `j` lives on the stack and is incremented between `mutex.lock ()` and `mutex.unlock ()`. The compiler could try harder to determine if this is strictly necessary but it doesn't seem to bother (I see I made a small slip-up in my code there, see if you can spot it). `i`, on the other hand, is completely optimised away, because the compiler knows that no-else can see it. Does that help you see things more clearly? – Paul Sanders Jun 08 '18 at 21:39
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/172788/discussion-between-curiousguy-and-paul-sanders). – curiousguy Jun 08 '18 at 21:55
  • @curiousguy No way. – Paul Sanders Jun 08 '18 at 22:05
  • Can you please clarify "has to"? Is it an absolute? Or the description of a particular implementation? – curiousguy Jun 08 '18 at 22:15
  • "_Doing anything else would alter the semantics of your program_" What the heck is "the semantics of your program"? The program prints "j is (hex value)" and returns 1! – curiousguy Jun 08 '18 at 22:18
  • @curiousguy That's actually a decent question, I told you there was a mistake. Change `j` to `*j` in the body of `give_away_the_address_of_j()` and things should make a bit more sense. Sorry about that, I'll update the Godbolt link tomorrow. – Paul Sanders Jun 08 '18 at 22:27
  • @curiousguy Just saw your previous comment, sorry. The gist of it is that, once the address of `j` is passed to another fuction, the compiler (in general) doesn't know what that function will do with it so it has to play it safe, no tricky stuff optimising out `j`. As it happens, the code I posted never dereferences that address (I did mean it to but I screwed up) but as soon as I passed the address of `j` to another function the compiler gave up anyway. Interesting, no? I will fix the code tomorrow, it's a bit sloppy in general. – Paul Sanders Jun 08 '18 at 22:38
  • @curiousguy Code at Goldbolt updated, please follow the new link. Thx. – Paul Sanders Jun 09 '18 at 18:03
  • There are many issues with that code. It doesn't meaningfully synchronize with other threads. – curiousguy Jun 09 '18 at 18:19
  • @curiousguy It's not meant to. It's meant to address the OP's question, which has the slightly odd title "Can an object both be stored and not stored?". Answer of course: no. – Paul Sanders Jun 09 '18 at 18:29
  • 1) The OP's question is meaningless, as I wrote: a) it is predicated on the existence of a so called "as if rule"; b) the code is incorrect WRT other threads and the compiler could know that. 2) The question is not about GCC. The limitations of a particular compiler is not the same thing as the limitation of C++ compilers in general. – curiousguy Jun 09 '18 at 18:32
0

The whole sequence:

    mut.lock(); // i == 0 at that point
    i += 1;        // can this be reordered?
    mut.unlock(); // i == 1 at that point
    exit(i);
}

can be compiled as just exit(1); and other threads can be ignored as there is no proper synchronisation.

You need to wait for other threads to terminate, or for them to engage in doing nothing ever again. You don't do that so it can be assumed that all other threads are doing nothing.

The mutex has no meaningful role here.

curiousguy
  • 8,038
  • 2
  • 40
  • 58