19

On a multicore x86 machine, Say a thread executing on core1 increments an integer variable a at the same time thread on core 2 also increments it. Given that the initial value of a was 0, would it always be 2 in the end? Or it could have some other value? Assume that a is declared as volatile and we are not using atomic variables (such as atomic<> of C++ and built in atomic operations in gcc).

If the value of a would indeed be always 2 in such event, does that mean that a long int in x86-64 would also have the same property, that is, a will always be 2 in the end?

pythonic
  • 20,589
  • 43
  • 136
  • 219
  • 4
    unless you are using a special atomic type, increment is usually three separate operations. Load, Increment, then Store. – Hunter McMillen May 08 '12 at 17:49
  • 5
    `volatile` doesn't give you atomic access. – Cat Plus Plus May 08 '12 at 17:50
  • 3
    @CatPlusPlus is your name an atomic operation? :P – MByD May 08 '12 at 17:51
  • accessing same type would result in race condition and if not taken care properly the value would be unexpected . here in this case say thread 1 and thread 2 loads value at same time and makes it 2 each so no matter whether thread 1 or thread 2 whoever loads it first will lead the final value as 2 – Invictus May 08 '12 at 17:52
  • @CatPlusPlus: There is an extension to the MS compiler that does make volatile variables have atomic writes (that are thread atomic). I don't think this extends to g++. Also I am not 100% familiar with the functionality as I don't use. So see http://msdn.microsoft.com/en-us/library/12a04hfd.aspx – Martin York May 08 '12 at 17:57
  • 3
    Reading and writing are atomic as long as you don't cross a cache line boundary. Doing both with another operation between is *not* automatically atomic. – Mark Ransom May 08 '12 at 17:57
  • @MarkRansom Maybe. Depending on how you define "atomic". Most recent use (including in the C++ standard) puts a stricter meaning to it, so that without special additional instructions, no writes or reads are atomic. – James Kanze May 08 '12 at 18:18

4 Answers4

30

The increment-memory machine instruction on an X86 is atomic only if you use it with a LOCK prefix.

x++ in C and C++ doesn't have atomic behavior. If you do unlocked increments, due to races in which processor is reading and writing X, if two separate processors attempt an increment, you can end up with just one increment or both being seen (the second processor may have read the initial value, incremented it, and written it back after the first writes its results back).

I believe that C++11 offers atomic increments, and most vendor compilers have an idiomatic way to cause an atomic increment of certain built-in integer types (typically int and long); see your compiler reference manual.

If you want to increment a "large value" (say, a multiprecision integer), you need to do so with using some standard locking mechanism such as a semaphore.

Note that you need to worry about atomic reads, too. On the x86, reading a 32 or 64 bit value happens to be atomic if it is 64-bit word aligned. That won't be true of a "large value"; again you'll need some standard lock.

Ira Baxter
  • 93,541
  • 22
  • 172
  • 341
10

Here's one proof it is not atomic in a particular implementation (gcc), As you can see (?), gcc generates code that

  1. loads the value from memory to a register
  2. increments the content of the register
  3. saves the register back to memory.

That's very far from being atomic.

$ cat t.c
volatile int a;

void func(void)
{
    a++;
}
[19:51:52 0 ~] $ gcc -O2 -c t.c
[19:51:55 0 ~] $ objdump -d t.o

t.o:     file format elf32-i386


Disassembly of section .text:

00000000 <func>:
   0:   a1 00 00 00 00          mov    0x0,%eax
   5:   83 c0 01                add    $0x1,%eax
   8:   a3 00 00 00 00          mov    %eax,0x0
   d:   c3                      ret

Don't be fooled by the 0x0 in the mov instruction, there's room for 4 bytes there, and the linker will fill in the resulting memory address for a there when this object file is linked.

nos
  • 223,662
  • 58
  • 417
  • 506
  • That's funny, you actually only get separate load/add/store when `a` is `volatile`, otherwise gcc uses a read-modify-write (unless tuning for [i586 (pentium)](https://godbolt.org/g/3K8zfu)). Of course, if the surrounding code uses the value of num++, it will very likely be done with separate instructions to leave the result in a register. I mentioned this in my answer on [a non-`volatile` version the question](http://stackoverflow.com/questions/39393850/can-num-be-atomic-for-int-num/39396999#39396999). – Peter Cordes Oct 11 '16 at 23:56
  • @PaterCordes Without volatile, gcc might generate a single `addl ` instruction, though that's not atomic either unless you also use the `lock` prefix. – nos Oct 12 '16 at 07:36
  • :P The 2nd link in my previous comment is to my answer which explains exactly (in terms of the MESI protocol, etc.) why `addl $1, num` isn't atomic (except on a uniprocessor system if we don't include DMA observers), and why it is with `lock`. – Peter Cordes Oct 12 '16 at 16:11
9

Since no one has answered your actual question and instead are showing you how to do it in a way that always works:

Thread 1 loads value of 0

Thread 2 loads value of 0

Thread 1 increments an stores 1

Thread 2 increments its local register copy of value and stores 1.

As you can see the end result is a value equal to 1 and not 2. It will not always be 2 at the end.

Michael Dorgan
  • 12,453
  • 3
  • 31
  • 61
6

It’s not guaranteed. You can use the lock xadd instruction to achieve the same effect, or use C++ std::atomic, or use #pragma omp atomic, or any number of other concurrency solutions that have been written to save you the trouble of reinventing the wheel.

Jon Purdy
  • 53,300
  • 8
  • 96
  • 166