
There are n threads. I'm trying to implement a function (pseudocode) that blocks the calling thread. Every caller stays blocked until the function has been called by more than n/2 threads; at that point all blocked threads are released, and any later call no longer blocks and instead returns immediately.

I did it like this, but I'm not sure if I got the last part right, where the function should immediately return once more than n/2 threads have called it. :S

(Pseudocode is highly appreciated because then I have a better chance to understand it! :) )

int n = total amount of threads
sem waiter = 0
sem mutex = 1
int counter = 0

function void barrier()
    int x
    P(mutex)
    if counter > n / 2 then
        V(mutex)
        for x = 0; x <= n / 2; x++;
            V(waiter)
        end for
    end if
    else
        counter++
        V(mutex)
        P(waiter)
    end else
end function
eyesima

2 Answers


What you describe is a non-resetting barrier. Pthreads has a barrier implementation, but it is of the resetting variety.

To implement what you're after with pthreads, you will want a mutex plus a condition variable, and a shared counter. A thread entering the function locks the mutex and checks the counter. If not enough other threads have yet arrived, it waits on the CV; otherwise it broadcasts on the CV to wake all the waiting threads. If you wish, you can make it just the thread that tips the scale that broadcasts. Example:

struct my_barrier {
    pthread_mutex_t barrier_mutex;
    pthread_cond_t barrier_cv;
    int threads_to_await;   /* arrivals still needed before release */
};

void barrier(struct my_barrier *b) {
    pthread_mutex_lock(&b->barrier_mutex);
    if (b->threads_to_await > 0) {
        if (--b->threads_to_await == 0) {
            /* this arrival tips the scale: wake every waiter */
            pthread_cond_broadcast(&b->barrier_cv);
        } else {
            /* re-check the counter after each wakeup to guard
               against spurious wakeups */
            do {
                pthread_cond_wait(&b->barrier_cv, &b->barrier_mutex);
            } while (b->threads_to_await);
        }
    }
    pthread_mutex_unlock(&b->barrier_mutex);
}

Update: pseudocode

Or since a pseudocode representation is important to you, here's the same thing in a pseudocode language similar to the one used in the question:

int n = total amount of threads
mutex m
condition_variable cv
int to_wait_for = n / 2

function void barrier()
    lock(m)

    if to_wait_for == 1 then
        to_wait_for = 0
        broadcast(cv)
    else if to_wait_for > 1 then
        to_wait_for = to_wait_for - 1
        wait(cv)
    end if

    unlock(m)
end function

That's slightly higher-level than your pseudocode, in that it does not assume that the mutex is implemented as a semaphore. (And with pthreads, which you tagged, you would need a pthreads mutex, not a semaphore, to go with a pthreads condition variable.) It also omits the details of the real C code that deal with spurious wakeups from waiting on the condition variable and with initializing the mutex and cv. Also, it presents the variables as if they are all globals -- such a function can be implemented that way in practice, but it is poor form.

Note also that it assumes pthreads semantics for the condition variable: waiting on the cv temporarily releases the mutex, allowing other threads to lock it, but a thread that waits on the cv reacquires the mutex before itself proceeding past the wait.

John Bollinger

A few assumptions I am making within my answer:

  • P(...) is analogous to sem_wait(...)
  • V(...) is analogous to sem_post(...)
  • the barrier cannot be reset

I'm not sure if I did the last part correctly where the function will immediately return if more than n/2 threads called it

The pseudocode should work fine for the most part, but the early return/exit conditions could be significantly improved upon.

Some concerns (but nothing major):

  • The first time the condition counter > n / 2 is met, the waiter semaphore is signaled (i.e. V(...)) (n / 2) + 1 times (the loop runs from 0 to n / 2 inclusive). Rather than hard-coding that count, the release loop should signal exactly counter times -- the number of threads actually waiting at that moment.
  • Every subsequent invocation after counter > n / 2 is first met will also signal (i.e. V(...)) the waiter semaphore another (n / 2) + 1 times. Instead, it should early return and not re-signal.

These can be resolved with a few minor tweaks.

int n = total count of threads
sem mutex = 1;
sem waiter = 0;
int counter = 0;
bool released = FALSE;

function void barrier() {
    P(mutex)
    // the `released` flag marks the barrier as permanently open; a counter-based
    // check alone cannot distinguish the releasing call from later calls, since
    // `counter` is never incremented again
    if released then
        // ensure the mutex is released prior to returning
        V(mutex)
        return
    end if

    if counter > n / 2 then
        // more than n/2 threads have tried to wait, mark barrier as released
        released = TRUE
        // mutex can be released at this point, as any thread acquiring `mutex` after will see that `released` is TRUE and early return
        V(mutex)
        // release all blocked threads; counter is guaranteed to never be incremented again
        int x
        for x = 0; x < counter; x++
            V(waiter)
        end for
    else
        counter++
        V(mutex)

        P(waiter)
    end else
}
concision
  • In an actual C implementation, marking variables `volatile` does not prevent data races involving them. See [Why is volatile not considered useful in multithreaded C or C++ programming?](https://stackoverflow.com/q/2484980/2402272). As it is used, your `released` needs *atomic* semantics that are not clearly conveyed by the pseudocode or accompanying prose. – John Bollinger Oct 12 '20 at 12:12
  • @JohnBollinger `volatile` is not intended to resolve data races in the pseudocode, but indicates **i)** the value should be re-fetched from main memory in the critical section and **ii)** instructions should not be reordered. All variables are used within the critical section - atomic semantics are not necessary if the variables are already guarded from parallel execution with the `mutex` lock. The only exception is the first `released` check is an early-return without necessarily acquiring the `mutex` lock, which may fail, but is recovered by another check in the critical section. – concision Oct 12 '20 at 19:50
  • @JohnBollinger A thread **T** that has decided it needs to wait reserves a release with `counter++` before releasing `mutex`. If another thread crosses the barrier and releases all threads, it will increment the semaphore `counter` times (which thread **T** incremented earlier in the critical section), allowing thread **T** to acquire `waiter` and continue, regardless of whether it has yet reached `P(waiter)`. No more than `counter` threads will have tried to wait on `waiter`, as they will have early-returned. There does not appear to be the race condition here that you indicated. – concision Oct 12 '20 at 19:50
  • @JohnBollinger A separate semaphore for each thread does not seem necessary here. The release mechanism is _not_ "release all _currently_ waiting", it is "increment the 'waiter' semaphore to the same amount of reservations that were allocated to 'counter', allowing 'counter' threads to be released". It would be erroneous to acquire `waiter` before releasing `mutex`, as `mutex` will be consumed and `waiter` will _never_ be released - all subsequent function calls would block indefinitely. A race condition can only somehow happen if the variables are not updated properly in the critical section. – concision Oct 12 '20 at 19:50
  • A "data race", in the C standard's usage of the term, is exactly what you have as a result of accessing `released` outside of a critical section given that other threads perform writes on that variable, notwithstanding whether those writes are performed in critical sections. C's `volatile` semantics do not rescue this, as the linked Q&A describes. If you want to check `released` outside of a critical section then it needs to be atomic, else the program has undefined behavior. `volatile` does nothing useful for you here. – John Bollinger Oct 12 '20 at 20:00
  • I withdraw (and have deleted) my other comments. – John Bollinger Oct 12 '20 at 20:06
  • @JohnBollinger Ah, my mistake in the terminology. Thank you for your feedback - you are absolutely correct that there is a data race here, but this was intentional as an optimization (but perhaps erroneous?). If you still suspect my understanding is wrong after the next comment's explanation, I will amend my answer for future readers. A counter-example scenario would also be sufficient. – concision Oct 12 '20 at 20:20
  • @JohnBollinger `released` is only toggled once throughout its lifetime - in the critical section. Consider the data race possibility: if a thread **T** reads `released` is `FALSE`, but is updated by another thread immediately after the read instruction, **T** will continue into the critical section and re-check `released` and will discover it is now _not_ `FALSE` and still early-return. In this scenario, the data race is recovered. _However_, if `released` was ever toggled back to `FALSE` in its lifetime, there would be a problematic data race, which I have acknowledged in my answer. – concision Oct 12 '20 at 20:20
  • You are making the mistake of attempting to reason about undefined behavior. There is no safe undefined behavior, and it disserves inexperienced programmers such as the OP to suggest otherwise. If I indulge in such an unwise reasoning exercise anyway then the worst I can imagine happening in this case is up to n / 2 threads all performing the release operations, changing your optimization into a pessimization, but my imagination may be insufficient. This answer would be improved by just removing the optimization attempt. – John Bollinger Oct 12 '20 at 21:03
  • @JohnBollinger Fair enough, I can agree that it does not serve inexperienced programmers well. And furthermore, it definitely is an attempt to reason about undefined behavior, whose handling may vary by compiler. I have updated my answer to remove it - thanks for your input. – concision Oct 12 '20 at 21:16