1

I am trying to develop some recursive algorithm which I would like to run in parallel using open mp. I need to run the algorithm for multiple input so I want each of the Thread to run 1 input. Each Thread is independent but the results are stored in the same global variable (summing them) as the code shows:

#include <stdio.h>
#include <omp>

double * resultA;
double * resultB;

void recursiveAlgorithm(double a)
{
    double b = someFunction(a);
    uint c = anotherFunction(a);

    if (b < 0)
    {
         #pragma omp critical
         {
              resultA[c] = resultA[c] + b;
         }
         return;
    }
    if (b > 100)
    {
         #pragma omp critical
         { 
              resultB[c] = resultB[c] + b;     
         } 
         return;
    } 
    recursiveAlgorithm(b);
}


int main( int argc, const char* argv[] )
{
    double input[5] = {0, 1, 2, 3, 4};

    resultA = malloc(1000*1000*3, sizeof(double));
    resultB = malloc(1000*1000*3, sizeof(double));

    #pragma omp parallel for
    for (uint i; i < 5; i++){
         recursiveAlgorithm(input[i]);
    }
}

I have been using critical section to ensure that the variables resultA and resultB are not accessed simultaneously but I am not sure if it is the best for my case. The speed improvement is much less that I would expect. Is there any better approach for a code like this?

Dylan
  • 107
  • 6

1 Answers1

1

It sounds like your problem might be better solved with the reduction pattern. But its really hard to tell without more information on what you are actually computing.

See this question on how to do it for two variables and this question for the array case.

Also note that you can always implement your recursion stack yourself and parallelize the individual calls. The obvious benefit is better job balancing between thread if some recursions go much deeper than others.

ypnos
  • 50,202
  • 14
  • 95
  • 141
  • Thanks, the second solve my problem. I wasn't sure if there was any easier way. Regarding the second suggestion I think that in my case I will have better performance this way since in my real code I am running the recursive algorithm a big number of times (like 10000). – Dylan May 04 '19 at 06:42
  • When dealing with recursive parallelism it is often much easier to use tasks than parallel loops... – Jim Cownie May 07 '19 at 11:34
  • @JimCownie yes this is what I wanted to express, implement the recursion as a stack of tasks. – ypnos May 07 '19 at 12:37
  • @ypnos my point is that OpenMP has support for task parallelism. So you don't need to maintain stacks of tasks, just create OpenMP tasks as you need them, and be done. – Jim Cownie May 07 '19 at 13:05