4

To improve my C skills, I implemented a thread-safe and lock-free queue. The algorithm is from chapter 10.5 of the book "The Art of Multiprocessor Programming" by Maurice Herlihy and Nir Shavit which is a great book by the way.

So far, everything worked but I need help with the following problem:

Problem

The line free(first) is commented out in the lfq_deq() Method because it can cause a segfault if the queue is used by multiple dequeuers. If threads T1 and T2 are dequeueing and T1 frees the node while T2 still uses it, T2 will produce a segfault.

What is an elegant way of freeing this memory? Since I don't reuse the nodes, I should not have the ABA problem, right? Or do you think, it is easier to reuse the nodes and implement a known solution for the ABA problem?

lfq.h

The header provides a simple main method for testing.

#pragma once
#include <stdlib.h>

typedef struct Node {
    void* data;
    struct Node* next;
} lfq_node_t;

typedef struct Queue {
    lfq_node_t* head;
    lfq_node_t* tail;
} lfq_t;

lfq_t* lfq_new();
void lfq_free(lfq_t* q);
void lfq_enq(lfq_t* q, void* data);
void* lfq_deq(lfq_t* q);

lfq.c

#include "lfq.h"
#include <pthread.h>
#include <stdio.h>

#define CAS(a, b, c) __sync_bool_compare_and_swap(a, b, c)

lfq_t* lfq_new() {
    lfq_t* q = malloc(sizeof(*q));
    lfq_node_t* sentinel = malloc(sizeof(*sentinel));
    sentinel->data = sentinel->next = NULL;
    q->head = q->tail = sentinel;

    return q;
}

void lfq_free(lfq_t* q) {
    lfq_node_t *next, *node = q->head;
    while (node != NULL) {
        next = node->next;
        free(node);
        node = next;
    }
    free(q);
}

void lfq_enq(lfq_t* q, void* data) {
    lfq_node_t *node, *last, *next;

    node = malloc(sizeof(*node));
    node->data = data;
    node->next = NULL;

    while (1) {
        last = q->tail;
        next = last->next;
        if (last == q->tail) {
            if (next == NULL) {
                if (CAS(&(last->next), next, node)) {
                    CAS(&(q->tail), last, node);
                    return;
                }
            } else {
                CAS(&(q->tail), last, next);
            }
        }
    }
}

void* lfq_deq(lfq_t* q) {
    lfq_node_t *first, *last, *next;
    while (1) {
        first = q->head;
        last = q->tail;
        next = first->next;

        if (first == q->head) {
            if (first == last) {
                if (next == NULL) return NULL;
                CAS(&(q->tail), last, next);
            } else {
                void* data = first->next->data;
                if (CAS(&(q->head), first, next)) {
                    // free(first);
                    return data;
                }
            }
        }
    }
}

main.c

A simple main method to test the queue:

#include "lfq.h"
#include <stdio.h>

int main() {
    int values[] = {1, 2, 3, 4, 5};
    lfq_t* q = lfq_new();
    for (int i = 0; i < 5; ++i) {
        printf("ENQ %i\n", values[i]);
        lfq_enq(q, &values[i]);
    }
    for (int i = 0; i < 5; ++i) printf("DEQ %i\n", *(int*)lfq_deq(q));
    lfq_free(q);
    return 0;
}
p0wl
  • 658
  • 5
  • 10
  • A circular buffer (fixed-size array) is the usual solution for a queue. Garbage-collection actually makes a node-based linked list queue a lot easier by solving the deallocation problem. – Peter Cordes Jul 18 '18 at 08:34
  • 1
    BTW, you should probably be using C11 `stdatomic` instead of non-atomic variables with GNU C's [old (and not recommended anymore) `__sync_bool_compare_and_swap` builtin](https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html). `first = q->head;` ... stuff ... `if (first == q->head)` can CSE the two reads of `q->head` because it's not atomic (or `volatile`, but seriously just use `_Atomic` members in your struct. stdatomic acquire loads would solve that. C11 CAS gives you the current value of the memory location as well as a bool result, so you don't have to re-read it in the loop – Peter Cordes Jul 18 '18 at 08:42
  • Thanks! I will rewrite the CAS instructions and so on. I also implemented an array-based solution but I'm interested in an answer to this problem out of curiosity. – p0wl Jul 18 '18 at 08:42
  • Related: [Lock Free stack implementation idea - currently broken](https://stackoverflow.com/q/50803497) is about a node-based *stack* (very similar to queue), but BeeOnRope's answer about the memory reclamation problem is probably relevant to you, too. – Peter Cordes Jul 18 '18 at 11:47

2 Answers2

2

This looks like the Michael and Scott queue.

Nodes cannot be freed, I can't remember exactly why offhand (obviously because they could still be referenced - but exactly where and how I forget). They can only be placed onto a freelist.

I've not looked closely at your implementation for correctness, but I can see there are no memory barriers, which means the implementation is wrong.

You need to discover and read up and understand memory barriers, and then use them.

I have written a pair of articles which will get you started.

https://www.liblfds.org/mediawiki/index.php?title=Article:Memory_Barriers_%28part_1%29

https://www.liblfds.org/mediawiki/index.php?title=Article:Memory_Barriers_%28part_2%29

1

As Peter Cordes pointed out in his comment, I just found the Memory reclamation problem:

In contrast, memory reclamation is one of the most challenging aspects of lock-free data structure design. Lock-free algorithms (also called non-blocking algorithms) guarantee that as long as some process continues to take steps, eventually some process will complete an operation. The main difficulty in performing memory reclamation for a lock-free data structure is that a process can be sleeping while holding a pointer to an object that is about to be freed. Thus, carelessly freeing an object can cause a sleeping process to access freed memory when it wakes up, crashing the program or producing subtle errors. Since nodes are not locked, processes must coordinate to let each other know which nodes are safe to reclaim, and which might still be accessed.

Cited from "Reclaiming Memory for Lock-Free Data Structures: There has to be a Better Way" by Trevor Brown, Institute of Science and Technology, Austria

A good answer (related to a stack but essentially the same) can be found here.

Community
  • 1
  • 1
p0wl
  • 658
  • 5
  • 10