
In an attempt to practice some OpenMP in C++, I am trying to write a matrix multiply without using #pragma omp parallel for.

Here is my matrix multiply skeleton that I am attempting to add tasks to.

#include <omp.h>
#include <cstdio>

void process(double **a, double **b, double **c, int i) {
  for(int j=0;j<1024;j++)
    for(int k=0;k<1024;k++)
      c[i][j] += a[i][k]*b[k][j];
}

void matrix_mult(double **a, double **b, double **c) {

  omp_set_num_threads(4);

  /* do I need to modify some storage attributes here? shared, private etc? */
  #pragma omp parallel 
  {  
    for(int i=0;i<1024;i++) {

      #pragma omp task 
      {
        process(a,b,c,i);
      }
    }
  }
}

I have been working through some OpenMP overviews and examples, but I am having a tough time applying the concepts to my code here. I keep getting an incorrect matrix result when I use more than one thread. What can I do to fix this? Thanks!

user2981824
  • Intel have a really detailed document describing how to do a matrix-matrix multiply using Strassen's algorithm and OMP tasks. – Fuzz Sep 20 '14 at 04:47
  • That's not C++ at all and not good C, either. C++ has specific template types for vectors and stuff, if you are doing C++ use them. C has real 2D matrices and you shouldn't use fake matrices like these ones. For the problem itself you should make a good separation into chunks, such that any item of `C` is only touched (read and written) by the same task and not by several. – Jens Gustedt Sep 20 '14 at 07:25
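The chunking suggested in the last comment might look roughly like the sketch below. It assumes square n x n matrices stored row-major in contiguous std::vector<double> objects; the name matrix_mult_chunked and the parameters n and chunk are hypothetical and not part of the question's code. Each task owns a block of rows of c, so no element of c is touched by more than one task.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch only: c = a * b for row-major n x n matrices.
// matrix_mult_chunked, n and chunk are hypothetical names for illustration.
void matrix_mult_chunked(const std::vector<double> &a,
                         const std::vector<double> &b,
                         std::vector<double> &c,
                         int n, int chunk) {
  assert(n > 0 && chunk > 0);
  assert(a.size() == static_cast<std::size_t>(n) * n);
  assert(b.size() == a.size() && c.size() == a.size());

  #pragma omp parallel
  {
    // One thread creates the tasks; the whole team executes them.
    #pragma omp single
    {
      for (int first = 0; first < n; first += chunk) {
        int last = std::min(first + chunk, n);
        // This task is the only one that writes rows [first, last) of c.
        #pragma omp task firstprivate(first, last)
        {
          for (int i = first; i < last; i++)
            for (int j = 0; j < n; j++) {
              double sum = 0.0;  // local accumulator, private to the task
              for (int k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
              c[i * n + j] = sum;
            }
        }
      }
    }
  }
}
```

Note that this version overwrites c instead of accumulating into it, so c does not have to be zeroed beforehand. A larger chunk means fewer, coarser tasks; the best value depends on the machine and is not something the thread settles.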

2 Answers


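You have a logical error: you don't generate 1024 tasks but (number of threads) × 1024, because every thread executes the for-loop inside the parallel region. Each row of c is therefore accumulated once per thread, and those updates race with each other. Put the for-loop inside a single region so that only one thread creates the tasks.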
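A minimal sketch of that fix, keeping the question's double ** layout. The extra size parameter n is an assumption added here so the example does not depend on the hard-coded 1024, and the explicit firstprivate/shared clauses only spell out what the default rules would already give in this case.

```cpp
#include <cassert>

// Accumulates row i of a * b into row i of c (n x n matrices).
// The parameter n is an assumption; the question hard-codes 1024.
void process(double **a, double **b, double **c, int i, int n) {
  for (int j = 0; j < n; j++)
    for (int k = 0; k < n; k++)
      c[i][j] += a[i][k] * b[k][j];
}

void matrix_mult(double **a, double **b, double **c, int n) {
  assert(n > 0);
  #pragma omp parallel
  {
    // Only one thread runs the loop and creates the n tasks;
    // the other threads of the team pick the tasks up and run them.
    #pragma omp single
    {
      for (int i = 0; i < n; i++) {
        // Each task gets its own copy of i; the matrices are shared.
        #pragma omp task firstprivate(i) shared(a, b, c)
        {
          process(a, b, c, i, n);
        }
      }
    }
    // All tasks have finished once the barriers that end the single
    // and parallel regions have been passed.
  }
}
```

Because process still uses +=, c has to be zero-initialized before the call, exactly as in the original code.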

Vassilios

I don't have an OpenMP 3.0 compiler at hand right now, but I suspect most of the problem you are having comes from the accumulation into c[i][j].

Before performing the +=, each thread may have loaded a different (stale) value of c[i][j], so concurrent updates overwrite one another and the accumulation comes out wrong.

There are answers to similar issues on SO, including: Matrix multiplication by vector OpenMP C

In essence, you will need to change the accumulation so that each thread works on its own copy of certain rows and the results are then combined in a critical section.
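One way to read that pattern, as a rough sketch rather than a tuned implementation: split the inner k dimension into slices, let each task accumulate its partial products into a private buffer, and merge the buffers into c inside a critical section. The function name matrix_mult_partial, the flat row-major std::vector storage, and the parameters n and kchunk are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch only: c += a * b for row-major n x n matrices.
// matrix_mult_partial, n and kchunk are hypothetical names for illustration.
void matrix_mult_partial(const std::vector<double> &a,
                         const std::vector<double> &b,
                         std::vector<double> &c,
                         int n, int kchunk) {
  assert(n > 0 && kchunk > 0);
  assert(a.size() == static_cast<std::size_t>(n) * n);
  assert(b.size() == a.size() && c.size() == a.size());

  #pragma omp parallel
  {
    #pragma omp single
    {
      for (int k0 = 0; k0 < n; k0 += kchunk) {
        int k1 = std::min(k0 + kchunk, n);
        #pragma omp task firstprivate(k0, k1)
        {
          // Private copy: partial products for the slice [k0, k1) only.
          std::vector<double> local(static_cast<std::size_t>(n) * n, 0.0);
          for (int i = 0; i < n; i++)
            for (int k = k0; k < k1; k++)
              for (int j = 0; j < n; j++)
                local[i * n + j] += a[i * n + k] * b[k * n + j];

          // Several tasks contribute to every c[i][j], so the merge
          // is done by one task at a time.
          #pragma omp critical
          {
            for (std::size_t idx = 0; idx < local.size(); idx++)
              c[idx] += local[idx];
          }
        }
      }
    }
  }
}
```

The critical section is only needed here because several tasks update the same elements of c. If every row of c is written by exactly one task, no critical section is required.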

Community
Fuzz