35

I have read quite some posts that say compare and swap guarantees atomicity, However I am still not able to get how does it. Here is general pseudo code for compare and swap:

int CAS(int *ptr,int oldvalue,int newvalue)
{
   int temp = *ptr;
   if(*ptr == oldvalue)
       *ptr = newvalue
   return temp;
}

How does this guarantee atomicity? For example, if I am using this to implement a mutex,

void lock(int *mutex)
{  
    while(!CAS(mutex, 0 , 1));
}

how does this prevent 2 threads from acquiring the mutex at the same time? Any pointers would be really appreciated.

Martin Konicek
  • 39,126
  • 20
  • 90
  • 98
StackSmasher
  • 359
  • 1
  • 3
  • 6
  • 6
    Sure: `ptr`, `mutex`, `NULL`... – Kerrek SB Mar 12 '14 at 00:18
  • It is safer not to implement your own mutex, but use some from the system, like `pthread_mutex_lock`/_unlock from glibc (with interface defined in POSIX: http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_lock.html) – osgx Mar 12 '14 at 00:29
  • 2
    I am not implementing my own mutex. Just wanted to give an example to clarify what I meant. – StackSmasher Mar 12 '14 at 01:32

3 Answers3

45

"general pseudo code" is not an actual code of CAS (compare and swap) implementation. Special hardware instructions are used to activate special atomic hardware in the CPU. For example, in x86 the LOCK CMPXCHG can be used (http://en.wikipedia.org/wiki/Compare-and-swap).

In gcc, for example, there is __sync_val_compare_and_swap() builtin - which implements hardware-specific atomic CAS. There is description of this operation from fresh wonderful book from Paul E. McKenney (Is Parallel Programming Hard, And, If So, What Can You Do About It?, 2014), section 4.3 "Atomic operations", pages 31-32.

If you want to know more about building higher level synchronization on top of atomic operations and save your system from spinlocks and burning cpu cycles on active spinning, you can read something about futex mechanism in Linux. First paper on futexes is Futexes are tricky by Ulrich Drepper 2011; the other is LWN article http://lwn.net/Articles/360699/ (and the historic one is Fuss, Futexes and Furwocks: Fast Userland Locking in Linux, 2002)

Mutex locks described by Ulrich use only atomic operations for "fast path" (when the mutex is not locked and our thread is the only who wants to lock it), but if the mutex was locked, the thread will go to sleeping using futex(FUTEX_WAIT...) (and it will mark the mutex variable using atomic operation, to inform the unlocking thread about "there are somebody sleeping waiting on this mutex", so unlocker will know that he must wake them using futex(FUTEX_WAKE, ...)

osgx
  • 90,338
  • 53
  • 357
  • 513
  • 1
    As I undersand it, CAS also use lock, but on hardware level? So it's not lock-free, it's only lock free on application level, but not hardware level? – lalilulelo_1986 Dec 29 '21 at 18:19
6

How does it prevent two threads from acquiring the lock? Well, once any one thread succeeds, *mutex will be 1, so any other thread's CAS will fail (because it's called with expected value 0). The lock is released by storing 0 in *mutex.

Note that this is an odd use of CAS, since it's essentially requiring an ABA-violation. Typically you'd just use a plain atomic exchange:

while (exchange(mutex, 1) == 1) { /* spin */ }

// critical section

*mutex = 0;   // atomically

Or if you want to be slightly more sophisticated and store information about which thread has the lock, you can do tricks with atomic-fetch-and-add (see for example the Linux kernel spinlock code).

Kerrek SB
  • 464,522
  • 92
  • 875
  • 1,084
2

You cannot implement CAS in C. It's done on a hardware level in assembly.

nz_21
  • 6,140
  • 7
  • 34
  • 80
  • You can get CAS from the language itself in C11 (https://en.cppreference.com/w/c/atomic). You can also use https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html and https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html with GCC-compatible C compilers. You can also use OpenACC and OpenMP directives in C programs. OpenMP atomics are supported by all major compilers except Apple Clang. – Jeff Hammond Mar 08 '22 at 08:17