0

I have a problem with a function for splitting a matrix into to matrix depending if it's bigger or smaller than a median_point. The functions work fine without openMP, but with it, I can get segmentation error after a while. The function is used in a three-ball algorithm, so it's being called multiple times. And after a couple of iterations, it can break down, because it gets wrong values in the pts matrix for some reason. Any Idea why?


 /******************************************************************************
 *
 *                          create_L_and_R():
 *
 * -    create two sets of points, L and R. The point smaller than the median 
 *      point goes to the left (L) and bigger (and including the point) goes to 
 *      the right (R).
 * 
 *  - input: 
 *      -n_dims: dimension of the points
 *      -n_points: number of points in the pts
 *      - median_point
 *      -pts: pts of the line 
 *      - Two empty arrays:
 *            L_split: empty array which will be filled with points
 *            R_split: empty array which will be filled with points, pluss the center point
 *****************************************************************************/

void create_L_and_R(int n_dims, int n_ponits, double** pts, double** L_split, double** R_split, double *median_point, int *counter_L, int *counter_R){
        int n = 0;
        printf("\nMedian: %f\n", median_point[0]);
        #pragma omp parallel for private(n) \
                shared(L_split, R_split, pts, counter_L, counter_R, median_point)
        for (n=0; n<n_ponits; n++){
                printf("pts[n][dim]: %f\n", pts[n][0]);
                if (pts[n][0] < median_point[0]){
                        L_split[*counter_L] =pts[n]; 
                        printf("L_split: %f\n", L_split[*counter_L][0]);
                        *counter_L = *counter_L + 1;
                }
                else{
                        R_split[*counter_R] = pts[n]; 
                        printf("R_split: %f\n", R_split[*counter_R][0]);
                        *counter_R = *counter_R + 1;
                 }

        }
        printf("\n");
        return;
}

   

Output when running with 50 points: edian: 0.086492 pts[n][dim]: 0.697553 R_split: 0.697553 pts[n][dim]: 0.000000 L_split: 0.000000 pts[n][dim]: 1.372316 R_split: 1.372316 pts[n][dim]: 1.416026 R_split: 1.416026 pts[n][dim]: 0.000000 L_split: 0.000000 pts[n][dim]: 0.641713 R_split: 0.641713 pts[n][dim]: 1.297904 R_split: 1.297904

Median: -nan pts[n][dim]: 0.000000 R_split: 0.000000 pts[n][dim]: 0.000000 R_split: 0.000000 Thread 1 "ballAlg-omp.o" received signal SIGSEGV, Segmentation fault.

paleonix
  • 2,293
  • 1
  • 13
  • 29
Marcusnh
  • 33
  • 8
  • What is `R_split[*counter_R] >= pts[n];` supposed to do? It compares two pointers, and then discard the result. – Some programmer dude Apr 12 '21 at 15:35
  • As for the crash, please try to create a [mcve] to show us. How do you call this function? What are the initial values of the indexes used? How are the "arrays" of pointers initialized? – Some programmer dude Apr 12 '21 at 15:36
  • R_split[*counter_R] >= pts[n]; should't be in the code. Corrected it now – Marcusnh Apr 12 '21 at 15:50
  • It's difficult to provide a minimal reproducible example when the problem arises after a couple of iterations over the same function. The idea is that I take in a matrix array of the form [ ] [ ] and reduce it to two matrices of the same form. Then I insert these new matrixes into the same function after some additional computation – Marcusnh Apr 12 '21 at 15:53
  • Perhaps the arrays are the problem? [An array of arrays is not the same as a pointer to pointer](https://stackoverflow.com/questions/18440205/casting-void-to-2d-array-of-int-c/18440456#18440456). – Some programmer dude Apr 12 '21 at 15:58
  • Very certain that the arrays are not the problem. Can run the program without openMP. And have used a very standard way of creating the matrices – Marcusnh Apr 12 '21 at 16:07
  • If you have arrays of arrays, then it's most certainly a problem. Also remember that C doesn't have dynamic arrays, if `L_split` and `R_split` are too small then you're also going to have problem (as will you if any of the pointers are null or uninitialized). But then, I'm only guessing since there's just not enough information to give certain answers. – Some programmer dude Apr 12 '21 at 16:24
  • 1
    There is a race condition on `*counter_R` and `*counter_L`. But even if you fix that, multiple threads will still be able to write to the same location before incrementing it. You basically have to wrap the complete thing into a critical section which would effectively sequentialize it. To do this in parallel, you will need a different algorithm. You could first create an array of positions for each thread to start writing to, such that the counters are private. The question is if this would actually give you better performance. – paleonix Apr 12 '21 at 18:11
  • @Someprogrammerdude as he is splitting around a median, the number of elements on either side is probably known beforehand. – paleonix Apr 12 '21 at 18:15
  • 1
    As you are only copying a pointer, your problem is without any doubt bandwidth bound and not compute bound. Therefore you wont achieve much speedup via parallelization anyway. – paleonix Apr 12 '21 at 18:28
  • I see. Maybe I have to rethink the structure of the code. Thanks for the feedback! – Marcusnh Apr 13 '21 at 09:41

0 Answers0