0

I am learning pthreads.

Right now I am trying to make the program that writes to one 2d array using multiple pthreads. Each pthread is responsible for only one line of the array. So there is no race or overlap there. The goal is to make it as fast as possible without using global variables.

The first solution that I implemented was the one that uses a global variable. And it works as intended. Here is the code:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

int **array;
const int NTHREADS = 5;
const int ELEMENTS = 3;

void *worker(void *arg);
void print_array(int **array);

int main()
{
    int i, j;
    pthread_t* threads = (pthread_t*)malloc(NTHREADS * sizeof(pthread_t));

    array = (int**)malloc(sizeof(int*));
    for(i = -1; i < NTHREADS; i++)
    {
        array[i] = (int*)malloc(sizeof(int));
        for (j = -1; j < ELEMENTS; j++)
        {
            array[i][j] = (int)malloc(sizeof(int));
        }
    }

    for (i = 0; i < NTHREADS; i++)
        pthread_create(&threads[i], NULL, worker, (void*)i);

    for (i = 0; i < NTHREADS; i++)
        pthread_join(threads[i], NULL);

    print_array(array);
    return 0;
}

void *worker(void *arg)
{
    int tid = (int)arg;

    for (int j = 0; j < ELEMENTS; j++)
        array[tid][j] = j;
    return (NULL);
}

void print_array(int **array)
{
    for (int i = 0; i < NTHREADS; i++)
    {
        for (int j = 0; j < ELEMENTS; j++)
            printf("%d,", array[i][j]);

        printf("\n");
    }
}

Then I wrote the same program using struct instead of global variable. Here is the code:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

const int NTHREADS = 5;
const int ELEMENTS = 3;

typedef struct          s_asd
{
    int                 **array;
    int                 tid;
}                       t_asd;

void *worker(void *arg);
void print_array(int **array);

int main()
{
    pthread_t* threads = (pthread_t*)malloc(NTHREADS * sizeof(pthread_t));
    t_asd tmp;
    int i, j;

    tmp.array = (int**)malloc(sizeof(int*));
    for (i = 0; i <= NTHREADS; i++)
    {
        tmp.array[i] = (int*)malloc(sizeof(int));
        for (j = 0; j <= ELEMENTS; j++)
            tmp.array[i][j] = (int)malloc(sizeof(int));
    }

    for (tmp.tid = 0; tmp.tid < NTHREADS; tmp.tid++)
        pthread_create(&threads[tmp.tid], NULL, worker, &tmp);

    for (i = 0; i < NTHREADS; i++)
        pthread_join(threads[i], NULL);

    print_array(tmp.array);
    return 0;
}

void *worker(void *arg)
{
    t_asd   *tmp = (t_asd*)arg;

    for (int j = 0; j < ELEMENTS; j++)
        tmp->array[tmp->tid][j] = j;
    return (NULL);
}

void print_array(int **array)
{
    for (int i = 0; i < NTHREADS; i++)
    {
        for (int j = 0; j < ELEMENTS; j++)
            printf("%d,", array[i][j]);

        printf("\n");
    }
}

This one, prints random numbers. I know that I am using the same pointer in all threads, but threads themselves, are not using the same memory area. So why does it prints random numbers? What is the best solution, without using a global variable?

Update 1. Output of the second program:

-1413467520,32668,-1413467440,
-1413467584,-1413467568,-1413467552,
-1413467504,-1413467488,-1413467472,
0,1,2,
0,1,2,
Soul Bruteflow
  • 115
  • 2
  • 8
  • Can you show the output please ? – ShellCode May 16 '17 at 12:30
  • You should use mutexes to avoid concurrent access. – Silveris May 16 '17 at 12:33
  • Your use of malloc is wrong. `array = (int**)malloc(sizeof(int*));` only allocates space for one pointer. – riodoro1 May 16 '17 at 12:35
  • @ShellCode I updated the question with the output of the second program. Output itself may vary. And garbage values will be in different places. – Soul Bruteflow May 16 '17 at 12:54
  • @Silveris I know that with mutex I can achieve this. But won't mutex slow down the threads? And in this case, all threads accessing differents parts of the array, so there is no real need to them. – Soul Bruteflow May 16 '17 at 12:56
  • You're right, you don't need mutex here – ShellCode May 16 '17 at 12:57
  • `tmp.array[i][j] = (int)malloc(sizeof(int))` is a bit odd for me. `malloc` returns a pointer as far as I remember. Does it work properly when You simply cast it in a way You do? – Dori May 16 '17 at 12:57
  • @riodoro1 yes, that why I use two loops to allocate the memory fully. I think if the allocation was totally wrong I wouldn't be able to run the program. – Soul Bruteflow May 16 '17 at 12:58
  • @Matso yes it works properly. I will chek it thoroughly later. But program runs ok, and there are no errors. – Soul Bruteflow May 16 '17 at 13:00
  • @DeadBigHead he's right, you definitely have some problems with the allocations – ShellCode May 16 '17 at 13:00
  • @DeadBigHead Yes it will slow it down. But this is a safer way to use threads. But actually I think your problem comes from allocations too. – Silveris May 16 '17 at 13:01
  • 2
    **All of these mallocs are very wrong**. [One **never** casts the return value of `malloc` in C](http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc) - in your case the cast in `(int)malloc(sizeof(int))` masked a *valid* warning / error. – Antti Haapala -- Слава Україні May 16 '17 at 13:02
  • @AnttiHaapala most of compilers produce a warning if you don't cast malloc – ShellCode May 16 '17 at 13:06
  • @ShellCode: Those then most likely aren't C compilers, but C++ compilers. – alk May 16 '17 at 16:11
  • If no elements are accessed concurrently with at least one party writing, no protection by for example the proper use of a mutex is necessary. – alk May 16 '17 at 16:14
  • @alk i'm almost sure I had seen gcc telling me that I needed to cast malloc... But it was years ago so i'm not 100% sure – ShellCode May 16 '17 at 16:30
  • @ShellCode: It did and still does require a cast if you miss/ed to include `` and aren't on a system where `int` and `void*` are of the same size ... ;-) – alk May 16 '17 at 17:57

2 Answers2

1

You are passing local variable tmp as an argument to the thread and changing it in a loop at the same time. This is a data race and your threads most probably will operate over the same data.

Convert tmp to an array, fill and pass a corresponding element to a corresponding thread.

Andriy Berestovskyy
  • 8,059
  • 3
  • 17
  • 33
1

Try something like that :

int main()
{
    pthread_t* threads = (pthread_t*)malloc(NTHREADS * sizeof(pthread_t));
    t_asd tmp;
    int i, j;

    tmp.array = (int**)malloc(NTHREADS * sizeof(int*));
    for (i = 0; i <= NTHREADS; i++)
    {
        tmp.array[i] = (int*)malloc(ELEMENTS * sizeof(int));

        //can be deleted if you want
        for (j = 0; j <= ELEMENTS; j++)
            tmp.array[i][j] = 0;
    }

    for (tmp.tid = 0; tmp.tid < NTHREADS; tmp.tid++) {
        t_asd *arg = (t_asd *) malloc(sizeof(t_asd));
        memcpy(arg, &tmp, sizeof(t_asd)); //will copy the current tid and the pointer to the array in a new memory area
        pthread_create(&threads[tmp.tid], NULL, worker, arg);
    }

    for (i = 0; i < NTHREADS; i++)
        pthread_join(threads[i], NULL);

    print_array(tmp.array);
    return 0;
}

Of course this is an example and you have to free all the allocations

ShellCode
  • 1,072
  • 8
  • 17
  • This code works. And if I am understanding correctly, this is basically the same as the method Andriy Berestovskyy suggested, just different implementation? But I have one question. If `tmp` instead of stack allocated struct will be dynamically allocated pointer to a struct. And it will be big. (Containing bunch of pointer and variables) Will memcpy take a long time to copy? – Soul Bruteflow May 16 '17 at 13:54
  • Yes it's the same idea Andriy and I are suggesting. I'm pretty sure the complexity of memset will never exceed O(n) but I think you don't have to worry about this, because you are just copying pointers, not all the data – ShellCode May 16 '17 at 14:00
  • Thank you very much. Will try this implementation, with fractal rendering. Will report when it will be ready. :) For now, I will mark this question answered. – Soul Bruteflow May 16 '17 at 14:03