I have written the following program for matrix multiplication:

#include <stdio.h>
#include <omp.h>
#include <time.h>
#define NRA 500
#define NCA 500
#define NCB 500

int mat_mul() ;
int i , j , k ;
int main(){
    double start , end ;
    start  =  omp_get_wtime() ;
    mat_mul() ;
    end = omp_get_wtime() ; 
    printf("Time taken : \n %lf " , (end - start) );
    return 0;
}

int mat_mul(){
    int mat1[NRA][NCA] , mat2[NCA][NCB] , mat3[NRA][NCB] ;
    //double start , end ;
    //start  =  omp_get_wtime() ;
    #pragma omp parallel private( i , j , k ) shared(mat1 , mat2 , mat3)
    #pragma omp for

    for( i = 0 ; i < NRA ; i++){
        for( j = 0 ; j < NCB ; j++){
            mat1[i][j] = mat2[i][j] = rand() ;
            for ( k = 0 ; k <  NCB ; k++){
                mat3[i][k] += mat1[i][k] * mat2[k][j] ;
            }
        }
    }
    //end = omp_get_wtime() ; 
    printf("REsult : \n");
    for( i = 0 ; i < NRA ; i ++ ) {
        for( j =0 ; j < k ; j ++)
            printf("%lf" , (double)mat3[i][j]);
        printf("\n") ;
    //printf("Time taken : \n %lf " , (end - start) );
    }
    return 0 ;
}

Everything's fine (almost): it compiles, it executes, it even terminates :-D (and it ACTUALLY provides a speed-up compared to the sequential version).

But unfortunately, the output looks something like this:

[tejas@localhost Documents]$ gcc -pedantic -Wall  -std=c99 -fopenmp  par.c
par.c: In function ‘mat_mul’:
par.c:28:30: warning: implicit declaration of function ‘rand’ [-Wimplicit-function-declaration]
    mat1[i][j] = mat2[i][j] = rand() ;
                              ^~~~
[tejas@localhost Documents]$ ./a.out
REsult :
...
(LOTS of blank spaces later)
...
Time taken : 
0.177506 [tejas@localhost Documents]$ 

What am I doing wrong? I'm using GCC on Fedora 27.

Thank you in advance.

  • LOTS of blank spaces = 500? Because that would make sense. Step through this with a debugger and/or insert `printf`s inside the loops to see why the inner loop prints nothing. – Jongware Jan 18 '18 at 09:19
  • It actually prints when I comment out the pragmas (I checked that just before posting this question). I didn't bother calculating, but the blanks should be 250000. – Tejas Garhewal Jan 18 '18 at 09:23
  • (No they would not. That loop only loops over `NRA`, which is 500.) If commenting out the optimize-parallel pragmas makes it work then the conclusion can only be that trying to print stuff in parallel processing is not a good idea. Switch the parallel processing off before printing, or use other variables there. – Jongware Jan 18 '18 at 09:56
  • Thank you for bringing that to my attention. It's actually printing now (the `k` variable is not supposed to be in the last loop, it's supposed to be `NCB`...)... and my speedup has gone away ;-; . I will try measuring performance again, but without using `printf` in either of the programs this time. I have no idea how to use gdb (I started programming recently) and the documentation looks... intimidating. – Tejas Garhewal Jan 18 '18 at 10:07
  • https://stackoverflow.com/questions/10624755/openmp-program-is-slower-than-sequential-one – Zulan Jan 18 '18 at 10:09
  • @Zulan I haven't yet done what you've said (or directed me towards), but I commented out the `printf()` in both the sequential and parallel implementations and now I got my speed-up back :-D . Why in the world is `printf()` so harmful for performance, I wonder... Anyway, I'll edit this comment when I've done what you said. – Tejas Garhewal Jan 18 '18 at 10:18
  • (New comment since I can't edit the previous one anymore.) It seems to have done nothing for me. – Tejas Garhewal Jan 18 '18 at 10:26
  • It doesn't matter what you see or don't see. Using `rand` in multiple threads is **wrong**. – Zulan Jan 18 '18 at 12:39
  • okay, I'll keep that in mind. Thank you :-D – Tejas Garhewal Jan 18 '18 at 12:48
  • If you want matrix multiplication you should use `mat3[i][j] += mat1[i][k] * mat2[k][j]`. `C(i,j) = sum over k A(i,k)*B(k,j).` – Z boson Jan 19 '18 at 12:59
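
Pulling the comments above together, here is a minimal sketch of how the multiplication could be arranged. It is not the asker's final code. It assumes a small size `N` for illustration (the question uses 500), a hypothetical helper `fill_random` that fills the inputs serially because `rand` is not guaranteed to be thread-safe, and small values (`rand() % 10`) so the `int` sums cannot overflow. Loop variables are declared inside the loops, so each thread gets its own copy without a `private` clause.

```c
#include <stdio.h>
#include <stdlib.h>

#define N 4  /* illustrative size; the question uses 500 */

/* Hypothetical helper: fill both inputs serially, outside any parallel region. */
void fill_random(int a[N][N], int b[N][N]) {
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            a[i][j] = rand() % 10;  /* small values keep the products in int range */
            b[i][j] = rand() % 10;
        }
    }
}

/* c = a * b, i.e. c[i][j] = sum over k of a[i][k] * b[k][j]. */
void mat_mul(int a[N][N], int b[N][N], int c[N][N]) {
    /* Rows of c are divided among the threads; i, j, k and sum are
       declared inside the loop nest and are therefore private. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            int sum = 0;
            for (int k = 0; k < N; k++) {
                sum += a[i][k] * b[k][j];
            }
            c[i][j] = sum;  /* each element is written by exactly one thread */
        }
    }
}
```

Compiled with `-fopenmp` the outer loop runs in parallel; without that flag the pragma is ignored and the same code runs sequentially, which makes it easy to compare timings of the two builds.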

1 Answer


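It's because of the `j < k` condition in the print loop:

for( j =0 ; j < k ; j ++)

The value of `k` at that point differs when using OpenMP. `k` is declared `private` in the parallel region, so each thread works on its own copy, and the file-scope `k` most likely still holds its initial value of 0 afterwards. The condition `j < k` is then false from the start and only the newlines are printed. In the sequential version the multiplication loop leaves `k` at `NCB`, which is why the values showed up there. Can you just replace `k` with `NCB`?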
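
A minimal sketch of a print loop whose bounds come from the matrix dimensions rather than from a leftover loop counter. The helper `print_matrix`, its return value, and the small `ROWS`/`COLS` sizes are hypothetical and only for illustration; in the question the bounds would be `NRA` and `NCB`.

```c
#include <stdio.h>

#define ROWS 2  /* stands in for NRA */
#define COLS 3  /* stands in for NCB */

/* Hypothetical helper: prints the matrix row by row and returns
   the number of values written, so the caller can sanity-check it. */
int print_matrix(int m[ROWS][COLS]) {
    int printed = 0;
    for (int i = 0; i < ROWS; i++) {
        for (int j = 0; j < COLS; j++) {  /* bound is COLS, not k */
            printf("%d ", m[i][j]);
            printed++;
        }
        printf("\n");
    }
    return printed;
}
```

Because the loop variables are local to the function, nothing that happens to a global or private `k` in an earlier parallel region can change how many values are printed.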

Anton Malyshev