0

I'm trying to use OpenMP to parallelize quicksort (for arrays with distinct integers). I already have a working implementation (I tested the output for multiple threads and it seems to work). The problem is that I'm not seeing any speedup, even for large arrays which leads me to believe my implementation is not correct. Since the actual implementation is correct, but the parallelism doesn't seem to produce the desired speedup, I will show only the parallelization of the code:

void parallel_randomized_quicksort(vector<int>& A, int start, int end){
  if ((end-start) is too small){
    run a serial sorting algorithm
  }else{
    pick a random pivot x and partition A around that pivot.
    k = index of x in A
    #pragma omp parallel
    {
      #pragma omp single nowait
      {
        parallel_randomized_quicksort(A,start,k-1);
      }
      #pragma omp single nowait
      {
        parallel_randomized_quicksort(A,k+1,end);
      }
    }
  }
}

If this is correct, there should be some speedup, since the left and right partitions can be recursed in parallel which is the goal of my code. Why don't I notice speedup?

EDIT: Execution time is measured using:

double start_time = omp_get_wtime();
  parallel_randomized_quicksort(A,0,A.size()-1);
  double time = omp_get_wtime() - start_time;

The arrays are distinct integers. Sizes of the arrays vary from 100 to 1,000,000. Times are in the milliseconds for the smaller arrays and a couple seconds for the larger ones. Typically if the array is less than size 32 I will use a simple serial algorithm like insertion sort.

user308485
  • 611
  • 1
  • 5
  • 25
  • Do you know how quicksort works? The way you pick your pivot is perhaps the worst way to pick one. What happens if x is the last element of your array? Would parallelism make sense then? – Sailanarmo May 14 '18 at 04:55
  • While this may not be the best pivot choice, it should still provide at least some speedup (theoretically since the expected span is still less than that for the serial algorithm). The pivot choice is unlikely to be the reason I'm not getting much speedup, it's probably something to do with my OpenMP syntax but I'm not sure what. – user308485 May 14 '18 at 05:31
  • 1
    For recursively parallel tasks like this you should be using OpenMP tasks... – Jim Cownie May 14 '18 at 09:47
  • 2
    Possible duplicate of [C OpenMP parallel quickSort](https://stackoverflow.com/questions/8023135/c-openmp-parallel-quicksort) – chtz May 14 '18 at 11:13
  • hi there, I hope your works in openmp is going nicely. This one seems trying to use a similar approach to nesting. so please first refer to my answer to [your other question about fibonacci](https://stackoverflow.com/questions/50340903/c-openmp-fibonacci-1-thread-performs-much-faster-than-4-threads). In case you haven't noticed I have a few edits there and the code now works perfectly. – Yılmaz Durmaz May 17 '18 at 19:53

0 Answers0