I have a function in C that I have to parallelize using OpenMP with static scheduling for n threads:

void resolveCollisions(){
    int i,j;
    double dx,dy,dz,md;
    for(i=0;i<bodies-1;i++)
        for(j=i+1;j<bodies;j++){
            md = masses[i]+masses[j];
            dx = fabs(positions[i].x-positions[j].x);
            dy = fabs(positions[i].y-positions[j].y);
            dz = fabs(positions[i].z-positions[j].z);
            if(dx<md && dy<md && dz<md){
                vector temp = velocities[i];
                velocities[i] = velocities[j];
                velocities[j] = temp;
            }
        }
}

To parallelize this, I added a #pragma omp parallel for directive to distribute the outer loop across the n threads. I added the schedule(static) clause, which I am required to use, and a num_threads(n) clause that takes n from the function parameter as the desired number of threads. I also added a critical section, hoping to prevent race conditions when updating the velocities array.

void resolveCollisions_openMP_static(int n) {
    int i, j;
    double dx, dy, dz, md;
    #pragma omp parallel for schedule(static) num_threads(n)
    for (i = 0; i < bodies - 1; i++) {
        for (j = i + 1; j < bodies; j++) {
            md = masses[i] + masses[j];
            dx = fabs(positions[i].x - positions[j].x);
            dy = fabs(positions[i].y - positions[j].y);
            dz = fabs(positions[i].z - positions[j].z);
            if (dx < md && dy < md && dz < md) {
                vector temp = velocities[i];
                #pragma omp critical
                {
                    velocities[i] = velocities[j];
                    velocities[j] = temp;
                }
            }
        }
    }
}

When I run this function, though, it gives wrong results. I suspect it has something to do with the inner loop using i to initialize j (j = i + 1), but I don't know whether that is the actual issue or how to fix it. I would appreciate any help. Thank you.

Chris Costa
  • Start by declaring your loop variables in the loop headers. – Victor Eijkhout Feb 17 '23 at 08:44
  • You have a lot of data races (`j`, `dx`, `dy`, `dz`). If you make them private and move `vector temp = velocities[i];` into the critical section, the race conditions are resolved. The only remaining question is whether the result depends on the order of execution (because of the swapping of values). If it does, it cannot be parallelized efficiently. – Laci Feb 17 '23 at 09:30
  • @Laci Even if it wouldn't be parallelized efficiently, how could it be done? – Chris Costa Feb 26 '23 at 13:05
  • You can use the `ordered` clause. If `(dx < md && dy < md && dz < md)` is rarely true, you may see a slight speed improvement. – Laci Feb 26 '23 at 14:46
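
The two suggestions in the comments could be sketched roughly as follows. This is only an illustration under assumptions, not a verified fix for the original program: the `vector` typedef and the global declarations are guesses at what the question's code uses, and the helper names `collides` and `swap_velocities`, as well as the two function names, are made up for this sketch. The first variant declares the loop variables inside the loops (so they are private) and performs the whole swap, including the read into `temp`, inside the critical section. That removes the data races, but swaps belonging to different values of `i` can still run in any order, so the result may differ from the serial one when a body takes part in more than one collision. The second variant uses the `ordered` clause so that the swaps are applied in the same order as in the serial loop, while the collision tests (which only read `masses` and `positions`) still run in parallel.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

/* Assumed stand-ins for the question's globals (their declarations are not shown there). */
typedef struct { double x, y, z; } vector;
int bodies;
double *masses;
vector *positions, *velocities;

/* Same collision test as in the question; it only reads masses and positions. */
static bool collides(int i, int j) {
    double md = masses[i] + masses[j];
    return fabs(positions[i].x - positions[j].x) < md &&
           fabs(positions[i].y - positions[j].y) < md &&
           fabs(positions[i].z - positions[j].z) < md;
}

static void swap_velocities(int i, int j) {
    vector temp = velocities[i];
    velocities[i] = velocities[j];
    velocities[j] = temp;
}

/* Variant 1: race-free, but the order of swaps across different i is not fixed. */
void resolveCollisions_critical(int n) {
    assert(n > 0);
    #pragma omp parallel for schedule(static) num_threads(n)
    for (int i = 0; i < bodies - 1; i++) {
        for (int j = i + 1; j < bodies; j++) {   /* j is private: declared in the loop */
            if (collides(i, j)) {
                #pragma omp critical
                {
                    swap_velocities(i, j);       /* read and write both inside the lock */
                }
            }
        }
    }
}

/* Variant 2: parallel detection, swaps applied in serial (i, j) order. */
void resolveCollisions_ordered(int n) {
    assert(n > 0);
    if (bodies < 2) return;                      /* nothing to compare */
    #pragma omp parallel num_threads(n)
    {
        /* Per-thread scratch buffer holding the collision flags of one row i. */
        bool *hit = malloc((size_t)bodies * sizeof *hit);
        assert(hit != NULL);                     /* sketch only: no graceful recovery */
        #pragma omp for ordered schedule(static, 1)
        for (int i = 0; i < bodies - 1; i++) {
            for (int j = i + 1; j < bodies; j++)
                hit[j] = collides(i, j);         /* runs concurrently across threads */
            #pragma omp ordered
            {
                for (int j = i + 1; j < bodies; j++)
                    if (hit[j]) swap_velocities(i, j);  /* runs in increasing i */
            }
        }
        free(hit);
    }
}
```

With GCC this would be built with `-fopenmp`; without that flag the pragmas are ignored and both functions simply run serially. The chunk size of 1 in `schedule(static, 1)` is a design guess: it is still static scheduling, but it hands out iterations round-robin so that one thread can test collisions for row `i + 1` while another is applying the swaps for row `i`. With the default static chunking, each thread would likely spend most of its time waiting for the previous thread's block to finish.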
