Are there any C or C++ compilers out there that implement an "aggressive" memory consistency model for volatile variables? By an "aggressive" consistency model I mean accompanying all writes to volatile variables with memory barriers in the generated code.

AFAIK, this is customary behavior for C and C++ compilers on the IA64 (Itanium) platform. What about x86? Is there a compiler out there that implements (or can be configured to implement) an Itanium-like approach to handling volatile variables on the x86 platform?

Edit: I'm looking at the code VS 2005 generates (after reading the comments) and I don't see anything that would resemble any sort of memory barrier when accessing volatile variables. This is perfectly fine to ensure memory consistency on a single-CPU multi-core x86 platform, because of MESIF (Intel) and MOESI (AMD) cache protocols.

However, this seems to be insufficient on a multi-CPU SMP x86 platform. An SMP platform would require memory barriers in the generated code to ensure the memory consistency between CPUs. What am I missing? What exactly does Microsoft mean when they claim that they already have acquire-release semantics on volatile variables?

AnT stands with Russia
  • 3
    [According to Raymond Chen](https://blogs.msdn.com/b/oldnewthing/archive/2011/04/19/10155452.aspx?Redirected=true) you get this behavior with VS2005 and newer – Praetorian Jun 27 '12 at 19:42
  • 2
    @Prætorian : [According to the official documentation](http://msdn.microsoft.com/en-us/library/12a04hfd.aspx) as well. ;-] – ildjarn Jun 27 '12 at 19:52
  • @AndreyT : Are you testing VC++ 2005 or VC++ 2005 SP1? IIRC, VC++ 2005 RTM had a bug where `volatile` did not have the expected semantics, which was fixed in SP1 and VC++ 2008+. – ildjarn Jun 27 '12 at 20:05
  • Related: http://stackoverflow.com/a/4558031/241536 – John Dibling Jun 27 '12 at 20:28
  • @John Dibling: Related, but makes generalized assertions that don't really help me here. "It does not create memory fences"? Sorry, but that makes no sense. It is up to compilers to decide what they will create or not in response to `volatile`. And some compilers are known to create explicit memory fences for `volatile` access. This is what my question is about. – AnT stands with Russia Jun 27 '12 at 20:32
  • @AndreyT: That's why I posted this as a comment, and did not answer "can't do it." My link is more a warning to future readers. – John Dibling Jun 27 '12 at 20:33
  • 4
    This statement is not correct: "However, this is insufficient on a multi-CPU SMP x86 platform." Whether all in one chip or not, the physical packaging of cores does not change the software memory ordering model in x86, at least for Intel platforms. – srking Jun 27 '12 at 21:01
  • @srking: Yes, I realize that now. Thank you. – AnT stands with Russia Jun 28 '12 at 00:43

1 Answer

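It should be noted that x86 CPUs reorder neither loads with other loads nor stores with other stores. As such, no explicit barriers are necessary to get acquire/release semantics. (x86 may still let a load complete ahead of an earlier store to a different location, but that only matters for orderings stronger than acquire/release.)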

The MSVC compiler will ensure that loads are not reordered with volatile loads and stores are not reordered with volatile stores (I'm now talking about reordering load and store instructions, of course), thus guaranteeing acquire and release semantics for volatile loads and stores respectively.
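
For comparison, here is a minimal portable sketch of the same acquire/release pattern written with C++11 `std::atomic` rather than MSVC's `volatile` extension. The names `payload`, `ready` and `run_message_passing` are made up for illustration. On x86, mainstream compilers typically emit plain `mov` instructions for both the release store and the acquire load, which matches the "no explicit barriers" point above.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                    // ordinary, non-atomic data
std::atomic<bool> ready{false};     // flag used to publish the data

// Runs one producer and one consumer and returns the value the consumer saw.
int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread producer([] {
        payload = 42;                                  // plain store
        ready.store(true, std::memory_order_release);  // release: earlier stores stay before it
    });
    std::thread consumer([&seen] {
        // acquire: later loads stay after the load that observes `true`
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // guaranteed to observe 42
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Calling `run_message_passing()` from `main` should always yield 42, because the acquire load that sees `true` synchronizes with the release store that wrote it.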

avakar
  • Is that true even for multi-CPU case (as opposed to single-CPU multi-core case)? – AnT stands with Russia Jun 27 '12 at 20:34
  • @AndreyT, the order in which the external bus sees the loads is the same as the order of load instructions. The same is true for stores. As you noted, cache protocols ensure coherency. In other words, if CPU1 performs a store S and CPU2 sees that store with its load L, then loads by CPU2 following L will see all stores by CPU1 prior to S. And that, my friend, are acquire/release semantics :) – avakar Jun 27 '12 at 20:40
  • @AndreyT, I don't mean to rant, but I wish the C++ committee had made the "volatile has acquire/release semantics" guarantee standard instead of coming up with `<atomic>`. – avakar Jun 27 '12 at 20:46
  • 4
    @avakar: I don't. `<atomic>` offers stronger guarantees than acquire/release, such as sequential consistency, and `volatile` has other uses- it is *not* supposed to be for threading. – Puppy Jun 27 '12 at 20:46
  • @DeadMG, I'm sorry I wasn't clear, I meant to say that I find acq/rel semantics for volatile more important than `<atomic>`, and the committee should have invested their time in that before atomic. Again, it was a rant, I couldn't help myself. :) – avakar Jun 27 '12 at 21:02
  • 2
    @avakar wouldn't that mean a performance hit for a program running on a device with weak cache coherency and using volatile to access control registers? Wouldn't it have to take time to maintain cache coherency even if it doesn't actually need that? Seems reasonable not to mix the two sets of semantics. – bames53 Jun 27 '12 at 21:18
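
To illustrate the comment above that sequential consistency is a stronger guarantee than acquire/release, here is a small sketch of the classic "store buffering" test. The helper name `both_read_zero_seq_cst` is hypothetical. With `memory_order_seq_cst` the outcome where both threads read 0 is forbidden. If the stores were only release and the loads only acquire, that outcome would be allowed, and x86 hardware can actually produce it, since a load may complete ahead of an earlier store to a different location.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs one round of the store-buffering test and
// reports whether both threads read 0.
bool both_read_zero_seq_cst() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });

    t1.join();
    t2.join();
    // All four seq_cst operations fall into one total order, so at least
    // one of the loads must observe the other thread's store.
    return r1 == 0 && r2 == 0;
}
```

To make the sequentially consistent store visible before the following load, x86 compilers typically emit an `xchg` or an `mfence` for it, and that is the explicit barrier plain acquire/release does not need.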