I was looking further into pthread barriers, following section 3.2 Barriers (pg 11) of the pthread Tutorial by Peter Chapin, which uses two barriers in the thread function. The first, loop_barrier, suspends all threads until every thread has reached it. This is confirmed when the one arbitrary thread for which the wait returns PTHREAD_BARRIER_SERIAL_THREAD is elected to do any serial cleanup. The second, prep_barrier, ensures all threads remain suspended until that serial cleanup is done.

My understanding is that this allows threads to run continually while providing thread synchronization at a given point in the processing, where all work for a cycle is completed before the threads continue running concurrently. The example simply shows what occurs for one such cycle, after which a done flag is set and the thread function returns.
All threads do suspend and wait on loop_barrier and prep_barrier, but the problem is that after the thread function returns the program stalls on the first pthread_join(), which gdb explains, rather unhelpfully, as being "in pthread_barrier_destroy () from /lib64/libpthread.so.0".
The tutorial provides only a framework for the thread function and main program, and I simply added the minimum needed to complete it: a struct holding the different loop limits for the for loop in each thread function, plus members for the thread index and the sum of the for loop variable values. Apparently I didn't understand the barriers quite as completely as I thought I did. The code causing the hang-on-join is:
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdlib.h>
    #include <errno.h>
    #include <pthread.h>

    #define handle_error_en(en, msg) \
        do { errno = en; perror(msg); exit(EXIT_FAILURE); } while (0)

    #define handle_error(msg) \
        do { perror(msg); exit(EXIT_FAILURE); } while (0)

    #define NCPU 4
    #define ITER_PER_CPU 100

    typedef struct {
        int index, start, end;
        unsigned sum;
    } loop_data;

    pthread_barrier_t loop_barrier;     /* global barriers (could pass in data) */
    pthread_barrier_t prep_barrier;

    void *thread_fn (void *data)
    {
        int done = 0,
            i = 0;
        loop_data *thread_data = data;

        do {
            for (i = thread_data->start; i < thread_data->end; i++) {
                /* each arg gets separate loop_data - do work */
                thread_data->sum += i;
            }

            /* suspend on barrier and do any per-cycle cleanup */
            if (pthread_barrier_wait (&loop_barrier) == PTHREAD_BARRIER_SERIAL_THREAD) {
                puts ("PTHREAD_BARRIER_SERIAL_THREAD");
                /* no actual per-cycle cleanup, just set done flag */
                done = 1;
            }

            /* suspend on barrier until per-cycle cleanup complete */
            pthread_barrier_wait (&prep_barrier);

            /* sum is unsigned, so print it with %u */
            printf ("thread index: %d, sum: %u\n",
                    thread_data->index, thread_data->sum);

        } while (!done);

        return data;
    }

    int main (void) {

        pthread_t id[NCPU];
        pthread_attr_t attr;
        loop_data arr[NCPU] = {{ .start = 0 }};
        void *res;
        int rtn = 0;

        /* initialize barriers and validate */
        if ((rtn = pthread_barrier_init (&loop_barrier, NULL, NCPU))) {
            handle_error_en (rtn, "pthread_barrier_init-loop_barrier");
        }
        if ((rtn = pthread_barrier_init (&prep_barrier, NULL, NCPU))) {
            handle_error_en (rtn, "pthread_barrier_init-prep_barrier");
        }

        /* initialize thread attributes (using defaults) and validate */
        if ((rtn = pthread_attr_init (&attr))) {
            handle_error_en (rtn, "pthread_attr_init");
        }

        /* set data index, start, end and create/validate each thread */
        for (int i = 0; i < NCPU; i++) {
            /* initialize index, start / end values */
            arr[i].index = i;
            arr[i].start = i * ITER_PER_CPU;
            arr[i].end = (i + 1) * ITER_PER_CPU;

            printf ("id: %d, start: %3d, end: %3d\n", i, arr[i].start, arr[i].end);

            /* create thread and validate */
            if ((rtn = pthread_create (&id[i], &attr, thread_fn, &arr[i]))) {
                handle_error_en (rtn, "pthread_create");
            }
        }

        /* join all threads and compare sums from threads with sums in main */
        for (int i = 0; i < NCPU; i++) {
            loop_data *data = NULL;

            /* join and validate */
            printf ("joining thread index: %d\n", i);
            if ((rtn = pthread_join (id[i], &res))) {
                fprintf (stderr, "error: thread %d\n", i);
                handle_error_en (rtn, "pthread_join");
            }
            data = res;     /* pointer to return struct provided through parameter */
            printf ("thread index: %d joined\n", data->index);
        }

        /* destroy barriers and validate */
        if ((rtn = pthread_barrier_destroy (&loop_barrier))) {
            handle_error_en (rtn, "pthread_barrier_destroy-loop_barrier");
        }
        if ((rtn = pthread_barrier_destroy (&prep_barrier))) {
            handle_error_en (rtn, "pthread_barrier_destroy-prep_barrier");
        }
    }
Example Use/Output
    $ ./bin/pthread-vtctut-04
    id: 0, start:   0, end: 100
    id: 1, start: 100, end: 200
    id: 2, start: 200, end: 300
    id: 3, start: 300, end: 400
    joining thread index: 0
    PTHREAD_BARRIER_SERIAL_THREAD
    thread index: 2, sum: 24950
    thread index: 0, sum: 4950
    thread index: 1, sum: 14950
    thread index: 3, sum: 34950
    ^C
The manual interrupt was given where the code hangs, on line 94 of my source file, at if ((rtn = pthread_join (id[i], &res))) {. So why, since each thread function is released by the second barrier (as indicated by the "thread index: x, sum: yyyy" output), does the code hang on pthread_join() in main()?