Scope of not reordering instructions with a memory barrier or lock

Question

I could not find information about the scope within which the compiler stops reordering instructions when it sees a lock or a memory barrier.

For example, consider the pseudo-code below, written with C++ in mind.

Is there any difference in the ordering of the instructions on a and x when the lock is moved into a separate function bar, compared with when it is within function foo?

Are the answers the same if the lock is replaced by a full memory barrier?

int func foo()
{
  read a;
  x = 10;
  {
      lock(mutex);
      z++;
      unlock(mutex);
  }
  a += 1;
  return x;
}

vs

int func foo()
{
  read a;
  x = 10;
  bar();
  a += 1;
  return x;
}

func bar()
{
  lock(mutex);
  z++;
  unlock(mutex);
}
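
For readers who want something compilable, the two pseudo-code variants might be rendered in C++ roughly as follows. This is only a sketch under assumptions: the names `mtx`, `foo_inline`, `foo_call`, and `check_variants` are made up for illustration, and `a` and `x` are assumed to be plain globals that no other thread touches without synchronization (otherwise the program would have a data race, which is undefined behavior in C++).

```cpp
#include <cassert>
#include <mutex>

std::mutex mtx;    // hypothetical name for the question's `mutex`
int a = 0, x = 0;  // assumption: not accessed concurrently without a lock
int z = 0;         // protected by mtx

// Variant 1: the critical section is written inline in foo.
int foo_inline() {
    int seen = a;  // "read a"
    (void)seen;    // value unused, as in the pseudo-code
    x = 10;
    {
        std::lock_guard<std::mutex> guard(mtx);  // lock (acquire)
        ++z;
    }                                            // unlock (release) at scope exit
    a += 1;
    return x;
}

// Variant 2: the critical section lives in a separate function.
void bar() {
    std::lock_guard<std::mutex> guard(mtx);
    ++z;
}

int foo_call() {
    int seen = a;  // "read a"
    (void)seen;
    x = 10;
    bar();
    a += 1;
    return x;
}

// Single-threaded check that both variants compute the same results.
// It cannot show reordering; it only confirms the sketch runs.
void check_variants() {
    assert(foo_inline() == 10);
    assert(foo_call() == 10);
    assert(z == 2);
    assert(a == 2);
}
```

As the comments below suggest, moving the critical section into `bar` should not weaken the ordering constraints: if the compiler inlines `bar`, it sees the same lock and unlock operations, and if it cannot see into `bar`, it has to assume the call may read or write any reachable global.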

What does `lock(z)` mean in your pseudocode? Is that supposed to compile to an atomic transaction on `z`, like the x86 `lock` prefix (`lock add dword [z], 1`) or an LL/SC retry loop on some other ISAs? You haven't specified a language, but C++'s memory model respects memory-ordering things like `z.fetch_add(1, std::memory_order_seq_cst)` that might or might not exist in other functions; if it can't inline the function, it has to assume an external function call is a full barrier. [How does a mutex lock and unlock functions prevents CPU reordering?](https://stackoverflow.com/q/50951011) — Peter Cordes, Apr 08 '20 at 10:29
Possibly also related: [Is function call an effective memory barrier for modern platforms?](https://stackoverflow.com/q/10698253) although I think it's asking if a function call *is* itself a barrier, which it isn't if it can inline. — Peter Cordes, Apr 08 '20 at 10:30
@PeterCordes lock(z) is a mutual-exclusion lock. I can change it to `lock(lock_object); z++; unlock(lock_object);` if that is clearer. — Ashish Negi, Apr 09 '20 at 01:45
@Holger right. I have changed the pseudo-code to reflect the dependency. — Ashish Negi, Apr 09 '20 at 01:49
@Holger: If this hypothetical language allows other threads to read vars without synchronization, then observers in other threads *could* notice that the `x = 10` store never happened after the `a += 1;`. (Assuming mutex lock and unlock have at least acquire and release semantics, that pair is enough to stop ops on opposite sides from crossing and reordering with each other.) C / C++ don't allow that for non-atomic vars, but I think Java's memory model is more forgiving. — Peter Cordes, Apr 09 '20 at 09:59
C/C++ would allow `x=10` or `a+=1` to reorder into the critical section and with `z`, though. https://preshing.com/20120913/acquire-and-release-semantics/ those can be only 1-way barriers. (But in practice mutex acquire is usually a stronger barrier, e.g. on x86 always a full barrier because it has to be an atomic RMW.) — Peter Cordes, Apr 09 '20 at 10:00
@PeterCordes I intentionally avoided the term “memory barrier” which does not exist in Java’s specification and did not promise anything here. All I said, is that the constraints, as far as they exist, won’t become weaker when being moved to another function. We could also say it the other way round, the constraints do not become stronger when they are written directly into the function like in the first example. It’s still possible that there are no constraints at all, depending on the context. Since this is only pseudo code, there’s nothing more that can be said. — Holger, Apr 09 '20 at 10:07
@Holger: Sorry, I was replying to your earlier *You wouldn’t notice any reordering of these entirely unrelated instructions anyway.* I had no problem with your last comment about moving stuff into functions, that might be the answer the OP wants, and should be true in any sanely-designed language. — Peter Cordes, Apr 09 '20 at 10:08
@PeterCordes my first comment addressed the question’s original code which had no increment and only unrelated reads and writes. Most of them could be reordered across the lock and unlock even when the other thread is locking correctly. The OP already [responded to it](https://stackoverflow.com/questions/61096786/scope-of-not-ordering-instructios-with-memory-barrier-or-lock?noredirect=1#comment108115850_61096786). — Holger, Apr 09 '20 at 10:23
@Holger: It had `x = 10` before and `y=11` after. If they are globally visible stores to shared vars, you could have another thread reading them in a loop, looking for the `y` store happening first (opposite of program order). Hmm, on 2nd thought I think I was wrong; both stores could reorder into the critical section and with each other even *with* mutex lock/unlock, if it was as weak as a pure acquire and pure release *operation* (not barrier). But not if `lock()` was replaced by a full memory barrier or even acq / rel barriers. An acquire *operation* isn't a fence. — Peter Cordes, Apr 09 '20 at 10:36
Thanks for your comments. I am interested in this question from C++ language point of view. — Ashish Negi, Apr 09 '20 at 17:39
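
The distinction drawn in the comments above, between an acquire or release *operation* and a fence, can be sketched in C++ as below. This is an illustration under assumptions, not a statement of what any particular mutex implementation does: `locked` is a hypothetical hand-rolled spinlock flag, `x` and `y` are made atomic so that another thread may read them without a data race, and all function names are invented.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> x{0}, y{0};      // shared; atomic so unsynchronized reads are legal
std::atomic<bool> locked{false};  // hypothetical spinlock flag
int z = 0;                        // protected by `locked`

// Lock and unlock as a pure acquire operation and a pure release operation.
void with_lock() {
    x.store(10, std::memory_order_relaxed);
    // Acquire: later accesses cannot move above this,
    // but the x store is allowed to sink below it.
    while (locked.exchange(true, std::memory_order_acquire)) { /* spin */ }
    ++z;
    // Release: earlier accesses cannot move below this,
    // but the y store is allowed to hoist above it.
    locked.store(false, std::memory_order_release);
    y.store(11, std::memory_order_relaxed);
}

// The same two stores separated by a full fence instead of a critical section.
void with_fence() {
    x.store(10, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    y.store(11, std::memory_order_relaxed);
}

// Observer: once it sees y == 11 through an acquire load, is x == 10 visible?
bool observer_ok() {
    if (y.load(std::memory_order_acquire) == 11)
        return x.load(std::memory_order_relaxed) == 10;
    return true;  // y store not seen yet, nothing to check
}

// Single-threaded smoke test; it cannot demonstrate reordering,
// only that the code runs.
void smoke_test() {
    with_lock();
    assert(z == 1 && x.load() == 10 && y.load() == 11);
    with_fence();
    assert(observer_ok());
}
```

With `with_fence` as the writer, `observer_ok` running in another thread is guaranteed to return true, because the fence sequenced before the `y` store synchronizes with the acquire load that reads that store. With `with_lock` as the writer, the C++ memory model gives no such guarantee, since both stores may move into the critical section and reorder with each other there, although in practice many implementations are stronger (x86 is the example given above).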

0 Answers