Does this simple non-blocking queue fail? (in C with gcc atomic CAS)

Question

I have been using this kind of construction for many years (with different versions of atomic CAS) for a "single producer single consumer" FIFO queue. I think it produces correct behavior in that case. Now I am wondering how it works if there are multiple threads enqueueing and dequeueing. I can see that the queue order will be corrupted whenever a dequeue happens while another thread is in the process of reversing the output queue. But, it seems like this would be fairly rare, and it is not clear why precise queue ordering could be important when there are multiple consumers? I mean, if the threads are going to proceed independently once they have dequeued an item, not sure why it could matter in what order they were dequeued? At least in my application it doesn't. I want to generally do the work in the order it is presented, but no big deal if somebody cuts to the head of the line once in a while. In the case where operation ordering is essential, I serialize onto the same thread.

Are there any catastrophic failures where the same element is returned multiple times or not at all?

typedef struct QElem_t {
    struct QElem_t  *next;
} QElem_t;

typedef struct QHeader_t {
    QElem_t     *input;
    QElem_t     *output;
} QHeader_t;

void QEnqueue(QHeader_t *qh,QElem_t *qe)
{
    QElem_t *input;
    do {    // push onto input queue
        input = qh->input;
        qe->next = input;
    } while( __sync_bool_compare_and_swap(&qh->input,input,qe) == 0 );
}

QElem_t *QDequeue(QHeader_t *qh)
{
    QElem_t *output,*input,*qe;

    // pop head of output queue
    while( (qe = qh->output) != NULL ) {
        if( __sync_bool_compare_and_swap(&qh->output,qe,qe->next) ) {
            return qe;
        }
    }
    // output queue is empty, grab entire input queue
    do {
        input = qh->input;
    } while( __sync_bool_compare_and_swap(&qh->input,input,NULL) == 0 );

    if( !input ) return input;
    // reverse input queue onto output queue
    while( 1 ) {
        qe = input;
        if( (input = qe->next) == NULL ) {
            return qe;  // return tail of input queue
        }
        // else push onto output
        do {
            output = qh->output;
            qe->next = output;
        } while( __sync_bool_compare_and_swap(&qh->output,output,qe) == 0 );
    }
}

Just noticed that for all these years, I have been doing an unnecessary CAS whenever I grab the input queue and it is NULL (which is pretty common case). Can't believe I never saw that till looking at my code through the eyes of others here. Guess I was always most worried about whether it actually worked. — user1055568, Feb 03 '15 at 20:31

score 0 · Answer 1 · edited May 23 '17 at 11:49

Found another question here that identifies the key problem with this code:

// pop head of output queue
while( (qe = qh->output) != NULL ) {
    if( __sync_bool_compare_and_swap(&qh->output,qe,qe->next) ) {
        return qe;
    }
}

qe->next may be evaluated at a time when qe != qh->output. This can happen if another thread pops qe from the stack after qe = qh->output, and then pushes it back onto the stack before the compare_and_swap executes, but after qe->next has been evaluated for the function call. To fix, you need to atomically read qh->output and qh->output->next so you are never reading a next value when the element is not on the queue.

Additional problem in this code may be using ordinary assigns to read the queue head into local variable, as one cannot be certain when it will execute.

how to prevent corruption in concurrent lifo stack implemented with atomic compare and swap

Does this simple non-blocking queue fail? (in C with gcc atomic CAS)

1 Answers1