
I was recently introduced to OpenMP and parallel programming, and I am having some trouble using it properly.

I want to use OpenMP to make the following code run faster.

int m = 101;
double e = 10;

double A[m][m], B[m][m];
for (int x=0; x<m; x++){
    for (int y=0; y<m; y++){
        A[x][y] = 0;
        B[x][y] = 1;
    }
}

while (e >= 0.0001){
    for (int x=0; x<m; x++){
        for (int y=0; y<m; y++){
            A[x][y] = 0.25*(B[x][y] - 0.2);
        }
    }
    e = 0;
    for (int x=0; x<m; x++){
        for (int y=0; y<m; y++){
            e = e + abs(A[x][y] - B[x][y]);
        }
    }    
}

I would like to run the loops in parallel rather than one after another to reduce the run time. I believe the following code should work, but I am not sure whether I am using OpenMP correctly.

int m = 101;
double e = 10;

double A[m][m], B[m][m];
#pragma omp parallel for private(x,y) shared(A,B) num_threads(2)
for (int x=0; x<m; x++){
    for (int y=0; y<m; y++){
        A[x][y] = 0;
        B[x][y] = 1;
    }
}

while (e >= 0.0001){
    #pragma omp parallel for private(x,y) shared(A,B) num_threads(2)
    for (int x=0; x<m; x++){
        for (int y=0; y<m; y++){
            A[x][y] = 0.25*(B[x][y] - 0.2);
        }
    }
    // I want to wait for the above loop to finish computing before starting the next
    #pragma omp barrier  
    e = 0;
    #pragma omp parallel for private(x,y) shared(A,B,e) num_threads(2)
    for (int x=0; x<m; x++){
        for (int y=0; y<m; y++){
            e = e + abs(A[x][y] - B[x][y]);
        }
    }    
}

Am I using OpenMP effectively and correctly? Also, I am not sure whether I can use OpenMP for my while loop, as it requires the inner loops to finish before it can determine whether it needs to run again.

dreamcrash
PiccolMan

1 Answer


Assuming that the code works, here are some improvements that you can make:

int m = 101;
double e = 10;

double A[m][m], B[m][m];

// One parallel region for the whole computation
#pragma omp parallel num_threads(2) shared(A, B)
{
    #pragma omp for
    for (int x=0; x<m; x++){
        for (int y=0; y<m; y++){
            A[x][y] = 0;
            B[x][y] = 1;
        }
    }

    while (e >= 0.0001){
        #pragma omp for
        for (int x=0; x<m; x++){
            for (int y=0; y<m; y++){
                A[x][y] = 0.25*(B[x][y] - 0.2);
            }
        }

        // Only one thread resets e; the others wait at the end of the single
        #pragma omp single
        e = 0;

        #pragma omp for reduction (+:e)
        for (int x=0; x<m; x++){
            for (int y=0; y<m; y++){
                // fabs (from <math.h>), since abs() takes an int in C
                e = e + fabs(A[x][y] - B[x][y]);
            }
        }
    }
}

Instead of creating a new parallel region for every loop, you can create a single one for the entire code. Furthermore, since you are using only 2 threads, load balancing is unlikely to be a problem, but if you increase the number of threads you may get better performance from static scheduling with a chunk size of 1, i.e. schedule(static, 1).
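As a rough illustration of the single-region idea combined with a static schedule of chunk size 1, here is a minimal sketch. The helper name init_and_update, the fixed size M, and the nthreads parameter are assumptions made for the example and are not part of the original code. Whether schedule(static, 1) actually helps depends on the machine and the thread count, so it is worth measuring.

```c
#include <assert.h>

#define M 101  /* grid size taken from the question */

/* Hypothetical helper, not from the original post: initializes both grids
   and performs one update of A, all inside a single parallel region. */
void init_and_update(double A[M][M], double B[M][M], int nthreads)
{
    assert(nthreads > 0);

    /* The thread team is created once and reused by both loops. */
    #pragma omp parallel num_threads(nthreads) shared(A, B)
    {
        /* schedule(static, 1) deals rows out round-robin, one row per chunk. */
        #pragma omp for schedule(static, 1)
        for (int x = 0; x < M; x++) {
            for (int y = 0; y < M; y++) {
                A[x][y] = 0.0;
                B[x][y] = 1.0;
            }
        }
        /* Implicit barrier: every row is initialized before the next loop,
           so no explicit "#pragma omp barrier" is needed here. */

        #pragma omp for schedule(static, 1)
        for (int x = 0; x < M; x++) {
            for (int y = 0; y < M; y++) {
                A[x][y] = 0.25 * (B[x][y] - 0.2);
            }
        }
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and the function simply runs serially with the same result.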

You do not need to make the loop variables x and y private; because they are declared inside the loops, each thread already gets its own copy. In your last nested loop you have e = e + abs(A[x][y] - B[x][y]);, so you presumably want e to hold the sum over all threads. For that you should use reduction(+:e), which gives each thread a private partial sum and combines the partial sums into e at the end of the loop.
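The reduction can also be looked at in isolation. The sketch below assumes a hypothetical helper total_abs_diff and a fixed size M, neither of which appears in the original code. It also uses fabs instead of abs, on the assumption that the code is compiled as C, where abs operates on int.

```c
#include <math.h>  /* fabs */

#define M 101  /* grid size taken from the question */

/* Hypothetical helper, not from the original post:
   returns the sum of |A[x][y] - B[x][y]| over the whole grid. */
double total_abs_diff(double A[M][M], double B[M][M])
{
    double e = 0.0;

    /* Each thread accumulates into its own private copy of e; the copies
       are added into the shared e when the loop finishes. Without the
       reduction clause, the concurrent updates of e would be a data race. */
    #pragma omp parallel for reduction(+:e)
    for (int x = 0; x < M; x++) {
        for (int y = 0; y < M; y++) {
            e += fabs(A[x][y] - B[x][y]);
        }
    }
    return e;
}
```

The loop variables x and y need no private clause here either, since they are declared inside the for statements.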

dreamcrash