
I am trying to numerically integrate a function (the area under a curve) and convert the serial code into a parallel program. I am using OpenMP for this.

I have parallelized the for loop using OpenMP's parallel for and achieved a shorter run time, but the result is not the expected one: something gets messed up between the threads. I want to know how to parallelize this for loop correctly for N threads.

#include <stdio.h>
#include <omp.h>
#include <math.h>

double f(double x){
  return sin(x)+0.5*x;
}


int main(){
  int n=134217728,i;
  double a=0,b=9,h,x,sum=0,integral;

  double start = omp_get_wtime();
  h=fabs(b-a)/n;

  omp_set_dynamic(0);
  omp_set_num_threads(64);
  #pragma omp parallel for reduction (+:sum) shared(x)
  for(i=1;i<n;i++){
    x=a+i*h;
    sum=sum+f(x);
  }

  integral=(h/2)*(f(a)+f(b)+2*sum);
  double end = omp_get_wtime();
  double time = end - start;
  printf("Execution time: %2.3f seconds\n",time);
  printf("\nThe integral is: %lf\n",integral);
}

The expected output is 22.161130, but the result varies each time the program is run.

Ajay
  • Your parallel loop accumulating into the global variable sum and using the shared variable x as a temporary cannot work because of data races. Use a [reduction](https://stackoverflow.com/questions/13290245/reduction-with-openmp) and declare private variables if you need temporaries. Also, do you really have 64 cores? – Alain Merigot Jun 24 '19 at 15:10
  • Possible duplicate of [How to parallelize this array sum using OpenMP?](https://stackoverflow.com/questions/27056090/how-to-parallelize-this-array-sum-using-openmp) – mch Jun 24 '19 at 15:10
  • Use atomic updates on the sum. – Sıddık Açıl Jun 24 '19 at 15:11
  • I have updated the code with a reduction and shared var x, but there is still a difference – Ajay Jun 24 '19 at 15:20
  • x *must* be private. – Alain Merigot Jun 24 '19 at 15:28
  • @AlainMerigot Thanks, that solved the problem :) – Ajay Jun 24 '19 at 17:28

1 Answer


The loop you are trying to parallelise modifies the same variables x and sum in every iteration, which makes it cumbersome to parallelize as written.

You could rewrite the code to make the path to parallelisation more obvious:

#include <stdio.h>
#include <omp.h>
#include <math.h>

double f(double x) {
    return sin(x) + 0.5 * x;
}

int main() {
    int n = 1 << 27, i, j;
    double a = 0, b = 9, h, x, sum, integral;
    double sums[64] = { 0 };

    double start = omp_get_wtime();
    h = fabs(b - a) / n;

    omp_set_dynamic(0);
    omp_set_num_threads(64);
    #pragma omp parallel for private(i) /* i must be private, or the threads race on it */
    for (j = 0; j < 64; j++) {
        for (i = 0; i < n; i += 64) {
            sums[j] += f(a + i * h + j * h);
        }
    }
    sum = 0;
    for (j = 0; j < 64; j++) {
        sum += sums[j];
    }

    integral = (h / 2) * (f(a) + f(b) + 2 * sum);
    double end = omp_get_wtime();
    double time = end - start;
    printf("Execution time: %2.3f seconds\n", time);
    printf("\nThe integral is: %lf\n", integral);
    return 0;
}
chqrlie
  • I don't understand why you've dropped the reduction from the code. I think your answer would be better with an explanation of that. – High Performance Mark Jun 24 '19 at 15:51
  • This is a great example of how not to do this. This code is the standard example of false sharing and will result in a more than 100x slowdown on a wide machine. Please don't try this at home (unless you want to teach people about race conditions). – Michael Klemm Jun 24 '19 at 16:32
  • @MichaelKlemm: thank you for your comment. Indeed I don't do parallel computing and this suggestion is both naive and inappropriate. I shall delete this answer, but you might want to share your expertise by posting an effective answer. – chqrlie Jun 25 '19 at 10:49
  • Nah, leave the answer. It's actually educational, so I think it will be fine to keep it. – Michael Klemm Jun 25 '19 at 12:27