Though all 5 threads run why the corresponding indexes are not properly outputted?
You pass a pointer to a variable to each thread, and modify that variable at the same time the thread functions access it. Why would you expect the thread functions to see any particular value? They run concurrently. It is possible for the threads to see a completely impossible, garbled value, if one thread is reading the value while another is modifying it, on some architectures.
For example in this case I have lost 3 and got 0 outputted twice.
Although the machine code generated by e.g. GCC increments the variable accessed by the thread functions after creating each thread function, the value observed by the thread functions may be "old" on some architectures, because no barriers or synchronization is used.
Whether this occurs on your particular machine (without explicit barriers or synchronization), depends on which memory ordering model your machine implements.
For example, on x86-64 (aka AMD64; the 64-bit Intel/AMD architecture), all reads and writes are observed in order, except that stores may be ordered after loads. This means that if initially say i = 0;
, and thread A does i = 1;
, thread B can still see i == 0
even after thread A modified the variable.
Note that adding the barriers (e.g. _mm_fence()
using x86/AMD64 intrinsics provided by <immintrin.h>
when using most C compilers) is not enough to ensure each thread sees an unique value, because the start of each thread can be delayed with respect to the real world moment when pthread_create()
was called. All they ensure is that at most one thread sees the zero value. Two threads may see value 1, three value 2, and so on; it is even possible for all threads to see value 5.
What should be done in this case to get all indexes printed?
The simplest option is to provide the index to be printed as a value, rather than as a pointer to a variable. In busy(), use
my_busy = (int)(intptr_t)c;
and in main(),
pthread_create(&t1[i], NULL, busy, (void *)(intptr_t)i);
The intptr_t
type is a signed integer type capable of holding a pointer, and is defined in <stdint.h>
(usually included by including <inttypes.h>
instead).
(Since the question is tagged linux, I probably should point out that in Linux, on all architectures, you can use long
instead of intptr_t
, and unsigned long
instead of uintptr_t
. There are no trap representations in either long
or unsigned long
, and every possible long
/unsigned long
value can be converted to an unique void *
, and vice versa; a round-trip is guaranteed to work correctly. The kernel syscall interface requires that, so it is extremely unlikely to change in the future either.)
If you need to pass the pointer to i
, but wish each thread to see an unique value, you need to use some sort of synchronization.
The simplest synchronized approach would be to use a semaphore. You could make it global, but using a structure to describe the work parameters, and passing the pointer of the structure (even if same one is used for all worker threads) is more robust:
#include <stdlib.h>
#include <pthread.h>
#include <semaphore.h>
#include <string.h>
#include <stdio.h>
#define NTHREADS 5
struct work {
int i;
sem_t s;
};
void *worker(void *data)
{
struct work *const w = data;
int i;
/* Obtain a copy of the value. */
i = w->i;
/* Let others know we have copied the value. */
sem_post(&w->s);
/* Do the work. */
printf("i == %d\n", i);
fflush(stdout);
return NULL;
}
int main()
{
pthread_t thread[NTHREADS];
struct work w;
int rc, i;
/* Initialize the semaphore. */
sem_init(&w.s, 0, 0);
/* Create the threads. */
for (i = 0; i < NTHREADS; i++) {
/* Create the thread. */
w.i = i;
rc = pthread_create(&thread[i], NULL, worker, &w);
if (rc) {
fprintf(stderr, "Failed to create thread %d: %s.\n", i, strerror(rc));
exit(EXIT_FAILURE);
}
/* Wait for the thread function to grab its copy. */
sem_wait(&w.s);
}
/* Reap the threads. */
for (i = 0; i < NTHREADS; i++) {
pthread_join(thread[i], NULL);
}
/* Done. */
return EXIT_SUCCESS;
}
Because the main thread, the thread that modifies the value seen by each worker thread, participates in the synchronization, so that each worker function reads the value before the next thread is created, the output will always be in increasing order of i
.
A much better approach is to create a work pool, where the main thread defines the work to be done collectively by the threads, and the thread functions simply obtain the next chunk of work to be done, in whatever order:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <limits.h>
#include <stdio.h>
#include <errno.h>
#define NTHREADS 5
#define LOOPS 3
struct work {
pthread_mutex_t lock;
int i;
};
void *worker(void *data)
{
struct work *const w = data;
int n, i;
for (n = 0; n < LOOPS; n++) {
/* Grab next piece of work. */
pthread_mutex_lock(&w->lock);
i = w->i;
w->i++;
pthread_mutex_unlock(&w->lock);
/* Display the work */
printf("i == %d, n == %d\n", i, n);
fflush(stdout);
}
return NULL;
}
int main(void)
{
pthread_attr_t attrs;
pthread_t thread[NTHREADS];
struct work w;
int i, rc;
/* Create the work set. */
pthread_mutex_init(&w.lock, NULL);
w.i = 0;
/* Thread workers don't need a lot of stack. */
pthread_attr_init(&attrs);
pthread_attr_setstacksize(&attrs, 2 * PTHREAD_STACK_MIN);
/* Create the threads. */
for (i = 0; i < NTHREADS; i++) {
rc = pthread_create(thread + i, &attrs, worker, &w);
if (rc != 0) {
fprintf(stderr, "Error creating thread %d of %d: %s.\n", i + 1, NTHREADS, strerror(rc));
exit(EXIT_FAILURE);
}
}
/* The thread attribute set is no longer needed. */
pthread_attr_destroy(&attrs);
/* Reap the threads. */
for (i = 0; i < NTHREADS; i++) {
pthread_join(thread[i], NULL);
}
/* All done. */
return EXIT_SUCCESS;
}
If you compile and run this last example, you'll notice that the output may be in odd order, but each i
is unique, and each n = 0
through n = LOOPS-1
occurs exactly NTHREADS
times.