Are volatile reads and writes atomic on Windows+VisualC?

Question

There are a couple of questions on this site asking whether using a volatile variable for atomic / multithreaded access is possible: See here, here, or here for example.

Now, the C(++) standard conformant answer is obviously no.

However, on Windows & Visual C++ compiler, the situation seems not so clear.

I recently answered a question and cited the official MSDN docs on volatile:

Microsoft Specific

Objects declared as volatile are (...)

A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object(?) that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary.

A read of a volatile object (volatile read) has Acquire semantics; a reference to a global or static object(?) that occurs after a read of volatile memory in the instruction sequence will occur after that volatile read in the compiled binary.

This allows volatile objects to be used for memory locks and releases in multithreaded applications.

[emphasis mine]

Now, reading this, it would appear to me that a volatile variable will be treated by the MS compiler as std::atomic would be in the upcoming C++11 standard.

However, in a comment to my answer, user Hans Passant wrote "That MSDN article is very unfortunate, it is dead wrong. You can't implement a lock with volatile, not even with Microsoft's version. (...)"

Please note: The example given in the MSDN docs seems pretty fishy, as you cannot generally implement a lock without atomic exchange (as also pointed out by Alex). This still leaves the question of the validity of the other information given in this MSDN article, especially for use cases like here and here.

Additionally, there are the docs for the Interlocked* functions, especially InterlockedExchange, which takes a volatile(!?) variable and does an atomic read+write. (Note that one question we have on SO -- When should InterlockedExchange be used? -- does not authoritatively answer whether this function is needed for a read-only or write-only atomic access.)

What's more, the volatile docs quoted above somehow allude to "global or static object", where I would have thought that "real" acquire/release semantics should apply to all values.

Back to the question

On Windows, with Visual C++ (2005 - 2010), will declaring a (32bit? int?) variable as volatile allow for atomic reads and writes to this variable -- or not?

What is especially important to me is that this should hold (or not) on Windows/VC++ independently of the processor or platform the program runs on. (That is, does it matter whether it's WinXP/32bit or Windows 2008R2/64bit running on Itanium 2?)

Please back up your answer with verifiable information, links, test-cases!

Barrier semantics do not imply atomicity of instruction sequences. In particular the sequence load, add, store isn't atomic. — Alexandre C., Aug 10 '11 at 07:47
@Alex : The question isn't about sequences, only about a single read or write. — Martin Ba, Aug 10 '11 at 08:03
@Martin: Even in that case, it depends on what the hardware provides for you. VS will not add a `lock` around `volatile` accesses, and that means that it will only be atomic if the underlying processor instruction would be atomic. I.e. it will not be for an unaligned integer type, or for a 64bit integer in a 32bit platform, for example. — David Rodríguez - dribeas, Aug 10 '11 at 08:16
@David : Well I guess that's implicitly part of the question. Without force-casting, can I even have an unaligned volatile integer type? *If you feel that the MSDN docs are misleading or wrong, please add an answer explaining in which legitimate cases the atomicity of volatile breaks down. Cheers!* — Martin Ba, Aug 10 '11 at 08:28
@Martin: In a 32bit windows OS, with a `uint32_t` `volatile` reference, accesses are not atomic. Note that the MSDN part that you quote does not say that it guarantees atomicity of the operation, it just says that the semantics it provides might be sufficient to be used for memory locks. I wish I had more time to write a full answer. — David Rodríguez - dribeas, Aug 10 '11 at 08:43
@David: please do, it would be nice to get a full answer so this can be clarified. — murrekatt, Aug 10 '11 at 10:31
Uhm... I just noticed that my last comment is wrong... the variable would be `uint64_t` for the operation not to be atomic. Intel/AMD guarantee atomicity of 32bit reads/writes in 32bit mode, and atomicity of 64bit reads/writes in 64bit mode. — David Rodríguez - dribeas, Aug 10 '11 at 11:52

jcoder · Accepted Answer · 2011-08-10T08:17:10.370

7

Yes, they are atomic on Windows/VC++ (assuming you meet alignment requirements etc., of course).

However, for a lock you would need an atomic test-and-set or compare-and-exchange instruction or similar, not just an atomic update or read.

Otherwise there is no way to test the lock and claim it in one indivisible operation.

EDIT: As commented below, all aligned memory accesses on x86 of 32bit or below are atomic anyway. The key point is that volatile makes the memory accesses ordered. (Thanks for pointing this out in the comments)
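To illustrate the "test and claim in one indivisible operation" point, here is a minimal sketch of a spinlock built on an atomic exchange. It assumes a C++11 compiler and uses `std::atomic<bool>::exchange` as a portable stand-in for what `InterlockedExchange` does on Windows; the `spin_lock` class and `spin_lock_demo` function are hypothetical names, not anything from MSDN.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical minimal spinlock, for illustration only (not production code).
class spin_lock {
public:
    spin_lock() : locked_(false) {}

    // exchange() writes 'true' and returns the previous value in one
    // indivisible step: this is the "test and claim" a plain read followed
    // by a plain write cannot provide.
    bool try_lock() {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void lock() {
        while (!try_lock()) {
            // busy-wait until the current owner clears the flag
        }
    }

    void unlock() {
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_;
};

// Single-threaded walk-through of the state transitions.
bool spin_lock_demo() {
    spin_lock l;
    bool first = l.try_lock();   // flag was clear: lock claimed
    bool second = l.try_lock();  // flag already set: claim refused
    l.unlock();
    bool third = l.try_lock();   // flag clear again: lock claimed
    l.unlock();
    assert(first && !second && third);
    return first && !second && third;
}
```

With only atomic loads and stores, two threads could both read "unlocked" and both write "locked"; the exchange closes that window because the old value is returned by the same operation that sets the new one.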

edited Aug 10 '11 at 08:17

answered Aug 10 '11 at 07:50

jcoder


"Assuming you meet alignment requirements etc., of course" - are you able to elaborate on that? Would the compiler make sure alignment requirements are met for simple 32/64 bit types, or is some additional measure needed? – Martin Ba Aug 10 '11 at 08:05
Yes there should be no problem at all in normal usage, you'd have to explicitly write something strange to prevent the alignment being suitable. (Was just pointing out that if you contrived using #pragma pack or dodgy casting to have an unaligned variable then volatile couldn't do much to help) – jcoder Aug 10 '11 at 08:07
3

"Assuming you meet the alignment requirements" will make the access atomic in the intel platform, regardless of the OS. Atomicity is not related to volatility. The extra constraints in the VS C++ compiler only affect the ordering of the instructions and a memory fence that ensures that neither the compiler nor the CPU will perform specific reordering of instructions. And there is no guarantee of atomic read/write to a 64bit int in a 32bit platform. – David Rodríguez - dribeas Aug 10 '11 at 08:13
Yes, that's what I meant although it's not quite what I said. I shall edit my comment to avoid confusion. – jcoder Aug 10 '11 at 08:15

score 3 · Answer 2 · answered Aug 10 '11 at 07:56

As of Visual C++ 2005 volatile variables are atomic. But this only applies to this specific class of compilers and to x86/AMD64 platforms. PowerPC, for example, may reorder memory reads/writes and would require read/write barriers. I'm not familiar with the semantics for gcc-class compilers, but in any case using volatile for atomics is not very portable.

reference, see first remark "Microsoft Specific": http://msdn.microsoft.com/en-us/library/12a04hfd%28VS.80%29.aspx

answered Aug 10 '11 at 07:56

Tobias Schlegel


"only ... to x86/AMD64" - this would mean a program running on a Windows on Itanium/IA-64 couldn't rely on this atomicity? – Martin Ba Aug 10 '11 at 08:07
I believe Itanium has the same behavior for aligned read/write. I don't know about ARM, though, which is another Windows platform. – Cory Nelson Aug 10 '11 at 09:09

score 1 · Answer 3 · answered Aug 10 '11 at 08:30

A bit off-topic, but let's have a go anyway.

... there are the docs for The Interlocked* functions, especially InterlockedExchange which takes a volatile(!) variable ...

If you think about this:

void foo(int volatile*);

Does it say:

the argument must be a pointer to a volatile int, or
the argument may as well be a pointer to a volatile int?

The latter is the correct answer: a pointer to non-volatile int converts implicitly to a pointer to volatile int, so the function can be passed pointers to both volatile and non-volatile ints.

Hence, the fact that InterlockedExchangeX() has its argument volatile-qualified does not imply that it must operate on volatile integers only.
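A small sketch of that conversion rule, assuming a C++11 compiler. The function `read_through` is a hypothetical stand-in with the same parameter shape as the Interlocked* functions; it is not part of the Windows API.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical function with an Interlocked*-style parameter type.
int read_through(int volatile* p) {
    return *p;
}

// Adding volatile is an implicit conversion; removing it is not.
static_assert(std::is_convertible<int*, int volatile*>::value,
              "int* converts implicitly to int volatile*");
static_assert(!std::is_convertible<int volatile*, int*>::value,
              "int volatile* does not convert implicitly to int*");

int qualifier_demo() {
    int plain = 1;
    int volatile vol = 2;
    // Both calls compile: the parameter accepts either kind of pointer.
    int sum = read_through(&plain) + read_through(&vol);
    assert(sum == 3);
    return sum;
}
```

Declaring the parameter as a pointer to volatile is therefore the least restrictive choice for callers, which is consistent with the reading above.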

answered Aug 10 '11 at 08:30

Maxim Egorushkin


Excellent information! The only thing I don't get is why the Interlocked* function then require pointers to volatile qualified variables. :-) – Martin Ba Aug 10 '11 at 08:42
This answer: http://stackoverflow.com/questions/6397662/if-volatile-is-useless-for-threading-why-do-atomic-operations-require-pointers-t/6397753#6397753 explains why the arguments are marked volatile. – Martin Ba Aug 10 '11 at 08:53
That was last month's answer. – Maxim Egorushkin Aug 10 '11 at 08:56

Alexandre C. · Answer 4 · 2011-08-10T08:54:19.277

1

The point is probably to allow stuff like

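singleton& get_instance()
{
    // The pointer itself must be volatile (not the pointee), so that the
    // MSVC acquire/release semantics apply to reads and writes of instance.
    static singleton* volatile instance;
    static mutex instance_mutex;

    if (!instance)  // first check, without the lock
    {
        raii_lock lock(instance_mutex);

        if (!instance) instance = new singleton;  // second check, under the lock
    }

    return *instance;
}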

which would break if instance was written to before initialization was complete. With MSVC semantics, you are guaranteed that as soon as you see instance != 0, the object has finished being initialized (which is not the case without proper barrier semantics, even with traditional volatile semantics).

This double-checked lock (anti-)pattern is quite common actually, and broken if you don't provide barrier semantics. However, if there are guarantees that accesses to volatile variables are acquire + release barriers, then it works.

Don't rely on such custom semantics of volatile though. I suspect this has been introduced so as not to break existing codebases. In any case, don't write locks according to the MSDN example. It probably doesn't work (I doubt you can write a lock using just a barrier: you need atomic operations -- CAS, TAS, etc. -- for that).

The only portable way to write the double-checked lock pattern is to use C++0x, which provides a suitable memory model, and explicit barriers.
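For comparison, a sketch of how the double-checked lock might look with the C++0x/C++11 memory model, using `std::atomic` with explicit acquire and release ordering instead of MSVC's extended volatile. The `singleton` payload and `same_instance_demo` are hypothetical and only there to make the example self-contained.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical payload type, for illustration only.
struct singleton {
    int value;
    singleton() : value(42) {}
};

singleton& get_instance() {
    static std::atomic<singleton*> instance(nullptr);
    static std::mutex instance_mutex;

    // Acquire load: if a non-null pointer is seen, the constructor's
    // writes are visible as well.
    singleton* p = instance.load(std::memory_order_acquire);
    if (!p) {
        std::lock_guard<std::mutex> lock(instance_mutex);
        // The mutex already orders this second check.
        p = instance.load(std::memory_order_relaxed);
        if (!p) {
            p = new singleton;  // intentionally never deleted
            // Release store: publishes the fully constructed object.
            instance.store(p, std::memory_order_release);
        }
    }
    return *p;
}

bool same_instance_demo() {
    singleton& a = get_instance();
    singleton& b = get_instance();
    assert(&a == &b);
    return &a == &b && a.value == 42;
}
```

Note that C++11 also makes the initialization of a function-local static thread-safe, so in new code a plain `static singleton instance;` is usually simpler than spelling the pattern out by hand.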

edited Aug 10 '11 at 08:54

answered Aug 10 '11 at 08:47

Alexandre C.


good example. Tobias writes in his answer "only applies ... to x86/AMD64", so it's still unclear to me if these semantics hold on all hardware platforms where Windows runs. – Martin Ba Aug 10 '11 at 08:50
1

This pattern is broken. It's true that Microsoft, at one point, was in the process of extending the semantics of `volatile` to make this work; I don't know how far along they actually got, however. They proposed this extension to the standards committee, and after some discussion, their representative there recognized that it wasn't a good idea, and withdrew the proposal. Outside of the Microsoft world, of course, this simply doesn't work; compilers can (and most probably do in some cases) move writes within the constructor after the write to the volatile. – James Kanze Aug 10 '11 at 08:58
1

@James: indeed. What I read in the MSDN page about volatile is that the write to `instance` will happen after the construction, regardless of compiler (and CPU!) reordering, and reads from `instance` will also behave correctly. In C++0x, you would use `std::atomic` (either with default full barrier semantics, or more specific acquire or release) – Alexandre C. Aug 10 '11 at 09:10
@Alexandre C Yes. Basically, Microsoft's proposal was to give `volatile` the semantics of `std::atomic<>`. The reason the committee didn't go along with this is that it added too much overhead to `volatile` for its existing uses (typically, memory mapped IO). – James Kanze Aug 10 '11 at 09:42

Necrolis · Answer 5 · 2011-08-10T11:28:33.363

1

Under x86, these operations are guaranteed to be atomic without the need for LOCK-based instructions such as Interlocked* (see Intel's developer manuals, volume 3A, section 8.1):

basic memory operations will always be carried out atomically:

• Reading or writing a byte

• Reading or writing a word aligned on a 16-bit boundary

• Reading or writing a doubleword aligned on a 32-bit boundary

The Pentium processor (and newer processors since) guarantees that the following additional memory operations will always be carried out atomically:

• Reading or writing a quadword aligned on a 64-bit boundary

• 16-bit accesses to uncached memory locations that fit within a 32-bit data bus

The P6 family processors (and newer processors since) guarantee that the following additional memory operation will always be carried out atomically:

• Unaligned 16-, 32-, and 64-bit accesses to cached memory that fit within a cache line

This means volatile will only ever serve to prevent caching and instruction reordering by the compiler (MSVC won't emit atomic operations for volatile variables; they need to be used explicitly).
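As a sketch of what "used explicitly" can mean: an aligned load or store may be atomic on its own, but an increment is a load, an add, and a store, and needs an explicit atomic read-modify-write. The example below assumes a C++11 compiler and uses `std::atomic<int>::fetch_add` as a portable substitute for `InterlockedIncrement`; `count_with_threads` is a hypothetical helper.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: several threads increment one shared counter.
int count_with_threads(int thread_count, int increments_per_thread) {
    std::atomic<int> counter(0);
    std::vector<std::thread> workers;

    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // One indivisible read-modify-write. A plain ++ on a
                // volatile int here would be a data race and could
                // lose updates.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }

    int total = counter.load();
    assert(total == thread_count * increments_per_thread);
    return total;
}
```

Calling `count_with_threads(4, 10000)` yields exactly 40000 because no increment can be lost; relaxed ordering is enough here since only the counter's own value matters and `join()` synchronizes the final read.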

edited Aug 10 '11 at 11:28

answered Aug 10 '11 at 08:55

Necrolis


"means volatile will only ever serve to prevent caching and instruction reordering by the compiler" -- then you are stating that the MSDN docs are lying? – Martin Ba Aug 10 '11 at 09:00
1

@Martin: I'm stating that they are misleading, as they rely on x86 hardware for the atomic part (I've checked that they don't emit atomic ops for dealing with `volatile` qualified variables) – Necrolis Aug 10 '11 at 09:09
thanks - that's what I also should do - check the emitted code :-) – Martin Ba Aug 10 '11 at 09:20
Hmm ... from JohnB's answer I now understand that the *ordering* part of the VC++ volatile specs seems also relevant. With regard to atomicity - Windows currently runs on x86, x64 (AMD64) and Itanium (IA-64) - of these platforms, which one doesn't guarantee atomic aligned access? – Martin Ba Aug 10 '11 at 09:26
1

@Martin: windows on some mobile devices might suffer from not having the aligned atomic read/write, however, if you look into windows sync primitives, you'll see that they all use `LOCK` based instructions, so either they don't trust their own docs, or the atomicity may be only on a *per-core* basis. – Necrolis Aug 10 '11 at 11:28
