3

enter image description here enter image description here

Also I am doing a c implementation and currently have the structure of the queue:

typedef struct queueelem {
    queuedata_t data;
    struct queueelem *next;
} queueelem_t;

typedef struct queue {
    int capacity;
    int size;
    queueelem_t *head;
    queueelem_t *tail;
} queue_t;

queue_t *
queue_init(int capacity)
{
    queue_t *q = (queue_t *) malloc(sizeof(queue_t));
    q->head = q->tail = NULL;
    q->size = 0;
    q->capacity = capacity;
    return q;
}

int CompareAndExchange (void **a, void *comparand,void *new) {
    int success = 0;
    pthread_mutex_lock(&CE_MUTEX);
    if ((*a) != comparand) {
       (*a) = new;
       //return     TRUE
       success = 1;
    }
    pthread_mutex_unlock(&CE_MUTEX);     
   //return     FALSE
    return success;
 }

But not sure How to continue, with queue and dequeue functions...

  • How would the code look like?
edgarmtze
  • 24,683
  • 80
  • 235
  • 386
  • 1
    Compare&Swap and Fetch&Add are extension functions for the CPU probably built in your compiler/library – Jonas Bötel May 22 '11 at 15:58
  • I was doing the implementation of CompareAndExchange see my update. – edgarmtze May 22 '11 at 16:02
  • 1
    Lock-free queues are unicorns, you'll have to use the low-lock primitives provided by your runtime or operating system. FetchAndAdd, CompareAndSwap in the sample algorithm. That's where the buck stops, you didn't document your runtime environment. – Hans Passant May 22 '11 at 16:05
  • I am doing this on Ubuntu 10.04.1 OS, with gcc.. How would I call `FetchAndAdd, CompareAndSwap` in the code, what liebrary must be included – edgarmtze May 22 '11 at 16:09
  • 4
    @darkcminor: You are aware, that you are using a *lock* to implement an essential utility function for your queue, therefore it isn't a *lock-free* queue! – Jonas Bötel May 22 '11 at 16:13
  • @LumpN: Could you point me in the right direction? – edgarmtze May 22 '11 at 16:16
  • 1
    For compare-and-swap and fetch-and-add you may use GCC's atomic builtins: http://gcc.gnu.org/onlinedocs/gcc/Atomic-Builtins.html – Alexey Kukanov May 22 '11 at 19:05

5 Answers5

3

Sometime ago, I've found a nice solution to this problem. I believe that it the smallest found so far.

The repository has a example of how use it to create N threads (readers and writers) and make then share a single seat.

I made some benchmarks, on the test example and got the following results (in million ops/sec) :

By buffer size

throughput

By number of threads

enter image description here

Notice how the number of threads do not change the throughput.

I think this is the ultimate solution to this problem. It works and is incredible fast and simple. Even with hundreds of threads and a queue of a single position. It can be used as a pipeline beween threads, allocating space inside the queue.

The repository has some early versions written in C# and pascal. Im working to make something more complete polished to show its real powers.

I hope some of you can validate the work or help with some ideas. Or at least, can you break it?

bittnkr
  • 99
  • 10
2

Your pseudo-code can (and most likely does) suffer from the ABA problem, as only the pointer is checked, and not an accompanying unique stamp, you'll find this paper of use in that regard and as a general guide to lock-free queue implementation, with its pitfalls.

When dealing with lock free programing, its also a good idea to read up on Herb Sutter's works, as He gives good, insightful explanations to whats required, why its required and its potential weak points (though beware that some of his older publications/articles where found to contain some hidden/unforseen problems).

Necrolis
  • 25,836
  • 3
  • 63
  • 101
  • Sometimes ABA does not matter, if an object's address is a permanent identifier for it. For example one place I was trying to use lock-free queues was in a heap manager, where the location of an object (a free block on the heap) was its identity. The same would apply if you were working with long-lived objects of another sort that are merely taken in and out of different lists without ever being freed or allocated. – R.. GitHub STOP HELPING ICE May 23 '11 at 02:33
2

and also the recent boost'con talk about this subject : https://github.com/boostcon/2011_presentations/raw/master/wed/lockfree_2011_slides.pdf

Joel Falcou
  • 6,247
  • 1
  • 17
  • 34
0

You may try this library it is built in c native. lfqueue

For Example

int* int_data;
lfqueue_t my_queue;

if (lfqueue_init(&my_queue) == -1)
    return -1;

/** Wrap This scope in other threads **/
int_data = (int*) malloc(sizeof(int));
assert(int_data != NULL);
*int_data = i++;
/*Enqueue*/
 while (lfqueue_enq(&my_queue, int_data) == -1) {
    printf("ENQ Full ?\n");
}

/** Wrap This scope in other threads **/
/*Dequeue*/
while  ( (int_data = lfqueue_deq(&my_queue)) == NULL) {
    printf("DEQ EMPTY ..\n");
}

// printf("%d\n", *(int*) int_data );
free(int_data);
/** End **/

lfqueue_destroy(&my_queue);
Oktaheta
  • 606
  • 5
  • 21
0

(Leaving this here for now, but see edit.)

Do you know a implementation of lock free queue in C?

I wrote lockless queue recently (http://www.ideone.com/l2QRp). I can't actually guarantee it works correctly, but I can't find any bugs and I've used it in a couple of single threaded programs without any problems, so there's nothing too obvious wrong with it.

Trivial usage example:

queue_t queue;
int val = 42;
queue_init(&queue,sizeof val);
queue_put(&queue,&val);
val = 0; 
queue_pop(&queue,&val);
printf("%i\n",val); // 42
queue_destroy(&queue);

Edit:

As @Alexey Kukanov pointed out, queue_pop can fail if tmp is popped,freed,allocated again, and put again between checking for null and swapping:

    if(!tmp->next) return errno = ENODATA;
    /* can fail here */
    } while(!sync_swap(q->head,tmp,tmp->next));

I'm not yet sure how to fix this, but I'll (hopefully) update this once I figure it out. For now, disregard this.

David X
  • 3,998
  • 3
  • 29
  • 32
  • What's the point of a lock-free queue for single-threaded code? In multi-threaded code, it will break because of the ABA problem. – Alexey Kukanov May 22 '11 at 19:25
  • @Alexey Kukanov, it's designed (not necessarily implemented, but designed) to not need to care what's behind the "A", as long the A is there, it should work correctly, even if the data behind the A has chenged. I do really need to go though it and verify that that's true, though. – David X May 23 '11 at 00:48
  • 2
    The slides referenced in the answer of Joel Falcou explain why your `queue_pop` suffers from ABA. There's a timing window between reading tmp->next and the CAS; whatever small, it's enough to make the value you read stale. – Alexey Kukanov May 23 '11 at 05:41
  • 1
    If you always add new items to the list there is NO ABA. Only when the same item is removed and then re-added do you have this problem. – johnnycrash Jun 15 '11 at 20:16
  • we always add items to the head of the list, not the tail. In fact, we don't have a tail ptr. Thus you have a single CAS and no races. – johnnycrash Jun 15 '11 at 20:17