
If I have something like this...

volatile long something_global = 0;

long some_public_func()
{
    return something_global++;
}

Would it be reasonable to expect this code not to break (race condition) when accessed from multiple threads? If the standard doesn't guarantee it, could it still be a reasonable assumption about modern compilers?

NOTE: ALL I'm using this for is atomic increment and decrement - nothing fancier.

Clark Gaebel

6 Answers


No - volatile does not mean synchronized. It just means that every access re-reads the variable from memory (as opposed to reusing a copy the compiler cached in a register).

Post-increment is not an atomic operation: it is a memory read followed by a memory write. Interleaving two threads' increments can mean that the value is actually incremented just once.

danben
  • No, the only difference between the two is which value is returned. Pre-increment is also a memory access followed by a memory write. – danben Jun 26 '10 at 01:54
  • It's not the end of the world to put a mutex lock/unlock around the operation. People fear the fact that a thread might pend now and then. But in this case it is an exceedingly short operation being protected, and the probability of two threads reaching it at exactly the same time is small (but finite). – Amardeep AC9MF Jun 26 '10 at 02:07
  • The code is basically just a unique id factory. However, it's called often enough to make mutexing painful. Therefore, I was looking for a lock-free implementation. (This is on a server in case you're wondering). – Clark Gaebel Jun 26 '10 at 02:12
  • Amardeep is right - the memory barrier is the expensive part and it happens whether you use an atomic increment, or a lock protecting an increment. – Artelius Jun 26 '10 at 02:13
  • @wowus: did you measure the overhead? Another possibility here is to use `sem_t`, if you are only interested in incrementing atomically. For reading it you would still have to protect it with a mutex, though. – Jens Gustedt Jun 26 '10 at 06:43
  • @wowus: Why not just have your unique IDs be the thread ID concatenated with a thread-local counter? Then each thread can generate its own IDs without locking. – caf Jun 26 '10 at 08:27
  • caf's solution is a good one. And since you won't be sharing the counter, you avoid cache-line bouncing. – ninjalj Jun 26 '10 at 16:16
  • Also, with respect to atomics: they're coming in C1x, whenever that finishes the standardization process (GCC will likely implement them sooner, though). – Spudd86 Jun 27 '10 at 02:52
  • I find the statement "as opposed to a copy cached locally in the thread" unclear. One can have single-threaded code that still needs volatile (e.g. for values changed by a signal handler). The caching involved is simpler, and volatile is usually only needed to instruct the compiler not to use an in-register copy or a stack-spill copy of a previous load if convenient. – Peeter Joot Jul 23 '10 at 16:51
  • @Artelius: Not all platforms require a barrier for an atomic increment (examples: PowerPC, SPARC); they may only need a compare-and-swap or similar mechanism. Whether or not a barrier is required is a different and more complex issue. Best to use a mutex (which typically implies the appropriate barriers too, when required). – Peeter Joot Jul 23 '10 at 16:55

No, you must use platform-dependent atomic accesses. There are several libraries that abstract these -- GLib provides portable atomic operations that fall back to mutex locks if necessary, and I believe Boost also provides portable atomics.

As I recently learned, for truly atomic access, you need a full memory barrier which volatile does not provide. All volatile guarantees is that the memory will be re-read at each access and that accesses to volatile memory will not be reordered. It is possible for the optimizer to re-order some non-volatile access before or after a volatile read/write -- possibly in the middle of your increment! -- so you must use actual atomic operations.

Michael Ekstrand

On modern fast multicore processors, there is a significant overhead with atomic instructions due to caching and write buffers.

So compilers won't emit atomic instructions just because you added the volatile keyword. You need to resort to inline assembly or compiler-specific extensions (e.g. gcc atomic builtins).

I recommend using a library. The easy way is to just take a lock when you want to update the variable. Semaphores will probably be faster if they're appropriate to what you're doing. It seems GLib provides a reasonably efficient implementation.

Artelius

Windows provides InterlockedIncrement (and InterlockedDecrement) to do what you are asking.

Mark Ransom

volatile just prevents compiler optimizations, but atomicity needs more. On x86, the instruction must carry a LOCK prefix; on MIPS, the read-modify-write cycle must be wrapped in an LL/SC (load-linked/store-conditional) construct; ...

ninjalj

Your problem is that C doesn't guarantee atomicity of the increment operators, and in practice they often won't be atomic. You have to use a library (e.g. the Windows API) or compiler builtin functions (GCC, MSVC) for that.

Christoph