C multithreading slower than single-threading when multiplying matrices

Question

I'm using threads in my C code to make it faster, but they actually make it slower.

I have a matrix struct and a matrix_operation struct:

typedef struct matrix matrix;

struct matrix
{
  char *name;
  size_t rows;
  size_t columns;
  double *value;
};

typedef struct matrix_operation matrix_operation;

struct matrix_operation
{
  matrix r;
  matrix m1;
  matrix m2;
  size_t row; /* index of the row of r that one thread computes */
};

To multiply the matrices, I use these functions:

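/* Forward declaration: matrix_mul_th is used below before it is defined. */
void *matrix_mul_th(void *arg);

matrix matrix_mul(char *name, matrix m1, matrix m2, size_t replace)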
{
  matrix r = matrix_init(name, m1.rows, m2.columns);

  matrix_operation *mat = malloc(sizeof *mat * m1.rows);
  pthread_t *th = malloc(sizeof *th * m1.rows);

  for (size_t i = 0; i < m1.rows; i++)
  {
    matrix_operation param = {r, m1, m2, i};
    mat[i] = param;
    pthread_create(&th[i], NULL , matrix_mul_th, &mat[i]);
  }

  for (size_t i = 0; i < m1.rows; i++)
  {
    pthread_join(th[i], NULL);
  }

  free(mat);
  free(th);

  if (replace == 1)
    matrix_free(m1);
  else if (replace == 2)
    matrix_free(m2);
  else if (replace == 3)
  {
    matrix_free(m1);
    matrix_free(m2);
  }

  return r;
}

void *matrix_mul_th(void *arg)
{
  matrix_operation mat = *(matrix_operation*)arg;

  for (size_t j = 0; j < mat.m2.columns; j++)
  {
    double sum = 0;
    for (size_t k = 0; k < mat.m1.columns; k++)
      sum += matrix_get(mat.m1, mat.row, k) * matrix_get(mat.m2, k, j);
    matrix_put(mat.r, mat.row, j, sum);
  }

  return NULL;
}

Do you have any idea what the problem may be, and how to improve the code? The matrices are stored as a 1D array.

Thanks a lot for your time, Lucas
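
The question does not show matrix_get and matrix_put, so the exact layout of the 1D array is unknown. A minimal sketch of what such accessors might look like, assuming row-major storage (element (i, j) at index i * columns + j); mat_view, mat_view_get and mat_view_put are hypothetical names, not the asker's code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view type for illustration; not the struct matrix above. */
typedef struct {
  size_t rows;
  size_t columns;
  double *value; /* rows * columns doubles, assumed row-major */
} mat_view;

/* Read element (i, j): rows are stored one after another. */
static inline double mat_view_get(mat_view m, size_t i, size_t j)
{
  assert(i < m.rows && j < m.columns);
  return m.value[i * m.columns + j];
}

/* Write element (i, j) at the same row-major index. */
static inline void mat_view_put(mat_view m, size_t i, size_t j, double x)
{
  assert(i < m.rows && j < m.columns);
  m.value[i * m.columns + j] = x;
}
```

If the layout is indeed row-major, the inner loop of matrix_mul_th reads m2 down a column, so consecutive reads are m2.columns elements apart in memory. That access pattern tends to be cache-unfriendly for large matrices, independently of threading.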

So each thread is just multiplying one matrix and then terminating? How big are the matrices? What measurements did you make to determine that the threaded version is slower? Does your computer have more than one core? If the total computation time is very small, breaking it up into threads might not help. I suggest posting a [mcve] and a brief description of the actual problem you are trying to solve if you want more specific answers. — David Grayson, Dec 01 '18 at 00:42
In general, threads are best when different things are to be performed. Interlacing functionality across threads (almost) always makes the code execute slower. — user3629249, Dec 01 '18 at 00:57
@user3629249 and benefits are more clear for bigger inputs as well ;) — niceman, Dec 01 '18 at 01:49
I recently played a bit with multi-threading and matrix multiplication (but it was [tag:c++]). However, it might be interesting as I compared some more or less naive approaches: [SO: Multi-threading benchmarking issues](https://stackoverflow.com/a/52835213/7478597). — Scheff's Cat, Dec 01 '18 at 07:03
The matrices are pretty big, actually: somewhere around (800*1)*(150*800) most of the time. The computer is an Ubuntu server VM running on Windows; the VM usually takes 20% CPU on Windows without multithreading and about 40% with it. The difference is noticeable without measurements: it takes ages with multithreading. — Lucas Arnulphy, Dec 01 '18 at 09:49
I also noticed that the threads might be trying to read the same parts of the input matrices, but I have no idea how to multiply them using threads without doing so. — Lucas Arnulphy, Dec 01 '18 at 09:50
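
On the last two comments: concurrent reads of the same input matrices are safe and need no locking, as long as no thread writes to them. A more plausible cost, in line with David Grayson's point about small computations, is that matrix_mul starts one thread per row, so each thread does little work compared with the cost of creating and joining it. A minimal sketch of one alternative, assuming row-major storage and using hypothetical names (mul_task, mul_rows, mul_parallel are not from the question): split the rows into a few contiguous blocks and give each block to one thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical task type: one contiguous block of result rows per thread. */
typedef struct {
  const double *a;           /* n x k input, assumed row-major */
  const double *b;           /* k x m input, assumed row-major */
  double *c;                 /* n x m result, assumed row-major */
  size_t k, m;
  size_t row_begin, row_end; /* half-open range of rows to compute */
} mul_task;

static void *mul_rows(void *arg)
{
  const mul_task *t = arg;
  for (size_t i = t->row_begin; i < t->row_end; i++) {
    double *c_row = t->c + i * t->m;
    for (size_t j = 0; j < t->m; j++)
      c_row[j] = 0.0;
    /* i-p-j loop order: both b and c are walked along rows. */
    for (size_t p = 0; p < t->k; p++) {
      double a_ip = t->a[i * t->k + p];
      const double *b_row = t->b + p * t->m;
      for (size_t j = 0; j < t->m; j++)
        c_row[j] += a_ip * b_row[j];
    }
  }
  return NULL;
}

/* Computes c = a * b with at most nthreads threads.
   Returns 0 on success, -1 if allocation or thread creation fails. */
int mul_parallel(const double *a, const double *b, double *c,
                 size_t n, size_t k, size_t m, size_t nthreads)
{
  if (n == 0)
    return 0;
  assert(a && b && c);
  if (nthreads == 0)
    nthreads = 1;
  if (nthreads > n)
    nthreads = n; /* never more threads than rows */

  pthread_t *th = malloc(nthreads * sizeof *th);
  mul_task *tasks = malloc(nthreads * sizeof *tasks);
  if (!th || !tasks) {
    free(th);
    free(tasks);
    return -1;
  }

  int rc = 0;
  size_t started = 0;
  for (size_t t = 0; t < nthreads; t++) {
    /* Rows [n*t/nthreads, n*(t+1)/nthreads) cover 0..n without overlap. */
    tasks[t] = (mul_task){a, b, c, k, m,
                          n * t / nthreads, n * (t + 1) / nthreads};
    if (pthread_create(&th[t], NULL, mul_rows, &tasks[t]) != 0) {
      rc = -1;
      break;
    }
    started++;
  }
  for (size_t t = 0; t < started; t++)
    pthread_join(th[t], NULL);

  free(th);
  free(tasks);
  return rc;
}
```

Link with -pthread. With the structs from the question, a call could look like mul_parallel(m1.value, m2.value, r.value, m1.rows, m1.columns, m2.columns, 4), provided the storage really is row-major. The thread count is a tuning choice: something close to the number of cores available to the VM is a reasonable starting point, and for small matrices a single thread may still be fastest, so it is worth timing both versions.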
