2

I'm trying to implement the simplest spinlock (using TAS) in C using inline assembly with the command xchg. Since my compiler error messages are getting more and more exotic and I'm starting to grow grey hairs I've decided to ask here. Also I'm sorry if this question was already answered since I haven't found anything.

What might needs to be said about my programming experience concerning this topic. I'm doing pretty fine with C (in my opinion, considering standard problems). Also I believe to know the basics about x86 but I'm totally lost when it comes to constraints in inline assembler. What I found doing some googling is even more confusing to me since many sources say very different stuff.

My code so far:

int acquire_lock(int* lock){
    int val = 1;
    int lock_cont;
    while((lock_cont = *lock) != 0){
            __asm__("xchg %0 %1" : "+q" (val), "+m" (lock_cont));
    }
    return 0;
}

This doesn't work for reasons that are probably obvious but are making me go nuts. I've also tried some other variants but none of them even compiled. You can probably tell by now that I don't really know what I'm doing so I'd be more than happy for any kind of advice.

Here my compiler messages in case this helps:

my_lock.c:17:11: error: unexpected token in argument list
            __asm__("xchg %0 %1" : "+q" (val), "+m" (lock_cont));
                    ^
<inline asm>:1:12: note: instantiated into assembly here
    xchg %eax -16(%rbp)
              ^
1 error generated.

Thanks in advance

A desperate student

EDIT:

I got the lock to work.. a do while loop and the comma did the trick. Now I have a new problem that my lock implementation still doesn't seem to guarantee exclusive access.. I'll post the whole code and would be happy for any suggestions/critics.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

//shared variable
int x;

//new lock instance that's consistent over function calls
int* new_lock(){
        int* ptr = (int*)(malloc(sizeof(int)));
        *ptr = 0;
        return ptr;
}

//set content of lock atomically to 1
int acquire_lock(int* lock){
        int val = 1;
        do{
                __asm__("xchg %0, %1" : "+q" (val), "+m" (*lock));
        }while(val - (*lock) == 0);
        return 0;
}

//release the lock
int release_lock(int* lock){
        *lock = 0;
        return 0;
}

//free lock
int delete_lock(int* ptr){
        free(ptr);
        return 0;
}

//thread counts to 10^6
void* thread_do(void* arg){
        int* lock = (int*) arg;
        for(int i = 0; i < 100000; i++){
                acquire_lock(lock);
                x++;
                release_lock(lock);
        }
        return NULL;
}

int main(int argc, char** argv){
        pthread_t thread0, thread1;
        int* lock = new_lock();
        pthread_create(&thread0, NULL, thread_do, lock);
        pthread_create(&thread1, NULL, thread_do, lock);
        pthread_join(thread0, NULL);
        pthread_join(thread1, NULL);
        printf("%d\n",x);
        return 0;
}

EDIT2:

My lock actually DOES work as can be seen when locking the whole loop inside the thread_do function. Not quite happy with this result since this locks x for quite a long time but I guess I'll have to live with this.. I assume the problem is that between my asm instruction and the comparison from the while I can't guarantee atomicity when the locking and unlocking is such a fast stream of instructions (for loop in thread_do) since I don't see a workaround in C (suggestions are welcome), I'll stick to this implementations since the general idea seems to be right.

TropicalSpore
  • 119
  • 2
  • 9

1 Answers1

3

If you're trying to make a spinlock, you'll probably want to use a strong atomic compare exchange.

Here is a simple implementation of you acquire_lock using a GCC builtin :

int acquire_lock(int* lock) 
{
    while (__sync_val_compare_and_swap (lock, 0, 1) != 0)
    {
        // Do something while waiting for the lock ?
    }
    return 0;
}

Compiler builtins have the advantage of being more readable and somewhat more portable than inline ASM.


Regarding the error in your code, you're missing a comma between the operands. That line :

__asm__("xchg %0 %1" : "+q" (val), "+m" (lock_cont));

Should be :

__asm__("xchg %0, %1" : "+q" (val), "+m" (lock_cont));
tux3
  • 7,171
  • 6
  • 39
  • 51
  • Thank you a lot! But since this is an exercise for an Operating Systems course and we have the possibility to use the gcc builtin and/or inline assembly with the view of more assembly to come I decided to stick with less readable. The missing comma is an extremely embarrassing mistake.. Problem is with the while loop it was given as is without explanation about the asm part (or any inline asm) but the more I think about it the less THIS can work. I'll think about it again and if I come up with a working solution using xchg I'll post it in case anybody else will ever face this problem. – TropicalSpore Mar 10 '15 at 16:51
  • @Desperate Perhaps you'll want to look at [the example on Wikipedia](http://en.wikipedia.org/wiki/Spinlock#Example_implementation) if you want an ASM implementation using xchg. It's the same algorithm you're using. – tux3 Mar 10 '15 at 16:55
  • Thanks again! Wikipedia said what I already suspected.. namely that I need ed a do while loop instead of a while loop. My lock now kind of works but it still doesn't guarantee exclusive access.. but since it works I'll try to figure out what the remaining problem(s) is/are. – TropicalSpore Mar 10 '15 at 17:53
  • For a spinlock, I think it's better to 1. check if the lock is free. 2. try to take it with `xchg` to store a 1 and see the old value. 3. If `xchg` loaded a 1, another thread took the lock between 1 and 2: so goto 1 (and spin on read-only checking the lock, *not* on lock cmpxchg). You never want to spin on a `lock`ed bus cycle: that makes every thread waiting for the lock slow the system down. – Peter Cordes May 16 '16 at 02:33
  • For example, see [my asm spinlock implementation](http://stackoverflow.com/a/37246263/224132) in an answer to another question. Implementing as much of that as possible with pure C (e.g. with C11 `stdatomic` for the release-store in the unlock function) would be good. It's kind of an example of the asm you want the compiler to emit. – Peter Cordes Jun 11 '16 at 21:39