-1

My mergesort algorithm works correctly, i.e. it sorts the values from the input array. However the reference array that is there to controle whether the array is really sorted produces values that were not in the original array that should be sorted and because of that the compiler throws an error. What can I do to solve this problem?

I have configured the program so that it sorts one specific array, it has 103 values, I tried to reduce the size but then the program runs successfully.

This is the input array:

[1919243808,365,372,383,394,404,413,170931,667182,540028976,741551154,774455913,774778478,778265971,840969273,892416309,942815278,1292503603,1667851881,1869182049,1919243808,35,20,20,50,50,52,53,54,54,55,55,58,63,64,64,65,388,401,65,71,73,73,74,75,75,77,289,290,290,292,300,303,308,308,308,267,312,312,314,314,315,323,323,325,330,332,333,338,339,347,347,361,221,372,383,394,404,331,331,413,2,741551154,892416309,875310137,808531502,540028976,778265971,1919243808,1667851881,1869182049,774778478,1292503603,774455913,667182,356,359,361,362,363,364,365,741684787]

It is saved in the reference array at the beginning of the program. To controle whether the change occurs at this point I printed out the reference array, and it is still the same as the input array.

Then the mergesort algorithm runs on the input array.

It gives the output array:

[2,20,20,35,50,50,52,53,54,54,55,55,58,63,64,64,65,65,71,73,73,74,75,75,77,221,267,289,290,290,292,300,303,308,308,308,312,312,314,314,315,323,323,325,330,331,331,332,333,338,339,347,347,356,359,361,361,362,363,364,365,365,372,372,383,383,388,394,394,401,404,404,413,413,170931,667182,667182,540028976,540028976,741551154,741551154,741684787,774455913,774455913,774778478,774778478,778265971,778265971,808531502,840969273,875310137,892416309,892416309,942815278,1292503603,1292503603,1667851881,1667851881,1869182049,1869182049,1919243808,1919243808,1919243808]

I checked with a helping program whether the output array is sorted, and it is really sorted.

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/time.h>

#include <iostream>

int arraySortedCheck(int arr[], int n){
   for (int i = 0; i < n; ++i){
      //when an array is not in sorted order
      if(arr[n-1] < arr[n-2])
         return 0;
   }
   //all elements are checked and
   //all are in sorted order
   return 1;
}
int main(int argc, char const *argv[]){
   int arr[103]{2,20,20,35,50,50,52,53,54,54,55,55,58,63,64,64,65,65,71,73,73,74,75,75,77,221,267,289,290,290,292,300,303,308,308,308,312,312,314,314,315,323,323,325,330,331,331,332,333,338,339,347,347,356,359,361,361,362,363,364,365,365,372,372,383,383,388,394,394,401,404,404,413,413,170931,667182,667182,540028976,540028976,741551154,741551154,741684787,774455913,774455913,774778478,774778478,778265971,778265971,808531502,840969273,875310137,892416309,892416309,942815278,1292503603,1292503603,1667851881,1667851881,1869182049,1869182049,1919243808,1919243808,1919243808};
   int n = sizeof(arr)/sizeof(arr[0]);
   if(arraySortedCheck(arr, n)){
      printf("Array is in sorted order\n");
   }
   else
      printf("Array is not in sorted order\n");
   return 0;
}

Sortcheckoutput:

Array is in sorted order

In the verification-phase of my program, the reference array is sorted with a library function and then compared to the output array from the mergesort algorithm, however this sorted reference array has values that were not there before

"sorted reference array"

[2,20,20,35,50,50,52,53,54,54,55,55,58,63,64,64,65,65,71,73,73,74,75,75,77,221,267,289,290,290,292,300,303,308,308,308,312,312,314,314,315,323,323,325,330,331,331,332,333,338,339,347,347,356,359,361,361,362,363,364,365,365,372,372,383,383,388,394,394,401,404,404,413,413,170931,667182,667182,540028976,540028976,741551154,741551154,774455913,774455913,774778478,774778478,778265971,778265971,808531502,840969273,875310137,892416309,892416309,942815278,959515229,1292503603,1292503603,1667851881,1667851881,1869182049,1869182049,1919243808,1919243808,959527217]

Values that are new are for example 959527217 or 959515229

And therefore the verification says that the array is not the sorted input array even tough it is

This is the code:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/time.h>

#include <iostream>
#include <algorithm>

#include <cstdlib>
#include <cstdio>

#include <cmath>
#include <ctime>
#include <cstring>
#include <omp.h>
// Constants.h
#if !defined(MYLIB_CONSTANTS_H)
#define MYLIB_CONSTANTS_H 1



#endif


//Takes a sorted list of size n and a value, puts the value in one of n+1 possible positions,
//if value is same to an element of the list take the position before the first occurence of the same element

binarysearchfindlowerrank(int *in,int n,int value,int projection){

    int* array= in+projection;
    int L=0;
    int R=n;

    printf("\nLowerBinarysearchrankfinder: [");
    for(int i=0;i<n; i++){
        printf("%d,",array[i]);

    }
        printf("]\n");
    while(R-L>1){
        int middle = (R+L)/2;
        printf("L:%d,middle:%d,R:%d,value:%d\n",L,middle,R,value);
        if(array[middle]==value){
            while(array[middle]==value&&middle>0){
                middle=middle-1;

            }
            if(middle==0&&array[middle]>=value){
                return 0;
            }
            else{
             printf("L:%d,middle:%d,R:%d,result:%d,index:%d\n",L,middle,R,(middle+1),(middle+1+projection));
            return middle+1;

            }
        }
        if(array[middle]<value){
            L=middle;

        }
        if(array[middle]>value){
            R=middle;

        }
    }
    printf("L:%d,R:%d,value:%d\n",L,R,value);
    if(n==1){
        if(array[0]>=value){
            return 0;

        }
        else return 1;

    }

    if(L==0&&array[L]>value){
         printf("L:%d,R:%d,result:%d,index:%d\n",L,R,0,0+projection);
        return 0;

    }
    if(R==n && array[R-1]< value){
         printf("!!L:%d,R:%d,result:%d,index:%d\n",L,R,n,n+projection);
        return n;

    }
    if(R==n&& array[R-1]>=value){
        printf("L:%d,R:%d,result:%d,index:%d\n",L,R,R-1,(R-1+projection));
        return R-1;

    }
    if(array[R]<value){
         printf("L:%d,R:%d,result:%d,index:%d\n",L,R,R+1,(R+1+projection));
        return R+1;

    }
    if(array[L]<value){
         printf("L:%d,R:%d,result:%d,index:%d\n",L,R,R,(R+projection));
        return R;

    }
     printf("L:%d,R:%d,result:%d,index:%d\n",L,R,L,(L+projection));
    return L;
}


//Takes a sorted list of size n and a value, puts the value in one of n+1 possible positions,
//if value is same to an element of the list take the position after the last occurence of the same element


binarysearchfinduperrank(int *in,int n,int value, int projection){

    int* array= in+projection;
    int L=0;
    int R=n;

     printf("\nUpperBinarysearchrankfinder [");
     for(int i=0;i<n; i++){
         printf("%d,",array[i]);

    }
    printf("]\n");
    while(R-L>1){
        int middle = (R+L)/2;
        printf("L:%d,middle:%d,R:%d,value:%d\n",L,middle,R,value);
        if(array[middle]==value){
            while(array[middle]==value&&middle<n){
                middle=middle+1;

            }
            printf("L:%d,R:%d,result:%d,index:%d\n",L,R,middle,middle+projection);
            return middle;

        }
        if(array[middle]<value){
            L=middle;

        }
        if(array[middle]>value){
            R=middle;

        }
    }

     printf("L:%d,R:%d,value:%d\n",L,R,value);

     if(n==1){
         if(array[0]> value){
             printf(" result:0\n,index:%d\n",0+projection);
             return 0;

        }
        else{
            printf(" result:1,index:%d\n",1+projection);
            return 1;

        }

    }

    if(L==0&&array[L]>value){
        printf("L:%d,R:%d,result:%d,index:%d\n",L,R,0,projection);
        return 0;

    }
    if(R==n && array[R-1]<= value){
         printf("L:%d,R:%d,result:%d,index:%d\n",L,R,n,n+projection);
        return n;

    }
    if(R==n&& array[R-1]>value){
         printf("L:%d,R:%d,result:%d,index:%d\n",L,R,R-1,((R-1)+projection));
        return R-1;

    }
    if(array[R]<=value){
         printf("L:%d,R:%d,result:%d,index:%d\n",L,R,R+1,(R+1+projection));
        return R+1;

    }
    if(array[L]<=value){
        printf("L:%d,R:%d,result:%d,index:%d\n",L,R,R,R+projection);
        return R;

    }

    printf("L:%d,R:%d,result:%d,index:%d\n",L,R,L,L+projection);
    return L;


}

/**
  * helper routine: check if array is sorted correctly
  */
bool isSorted(int ref[], int data[], const size_t size){
    std::sort(ref, ref + size);
    for (size_t idx = 0; idx < size; ++idx){
        if (ref[idx] != data[idx]) {
            printf("\nFalscher Index:%d\n",idx);
            return false;
        }
    }
    return true;
}

/**
  * sequential merge step (straight-forward implementation)
  */
void MsMergeSequential(int *out, int *in, long begin1, long end1, long begin2, long end2, long outBegin,int *data,int *tmp) {

    if(begin1==end2){
        out[begin1]=in[begin1];
        printf("\n[%d]\n",out[begin1]);
    }
    if(begin1==begin2||begin2==end2){
        out[begin1+binarysearchfinduperrank(in,1,in[end2],begin1)]=in[end2];
        out[begin1+binarysearchfindlowerrank(in,1,in[begin1],end2)]=in[begin1];
        printf("\n[%d,%d]\n",out[begin1],out[end2]);
        printf("\nDATA:n[");
        for (size_t idx = 0; idx < 103; ++idx){
             printf("%d,",data[idx]);
        }
        printf("]\n");
        printf("\nTMP:[");
        for (size_t idx = 0; idx < 103; ++idx){
            printf("%d,",tmp[idx]);
        }
        printf("]\n");
    }



    else{
        printf("begin:%d,middle1:%d:,middle2:%d,end:%d\n",begin1,end1,begin2,end2);



        for(int i=0;i<(end2-begin2);i++){

            out[begin1+i+binarysearchfinduperrank(in,(end1-begin1),in[begin2+i],begin1)]=in[begin2+i];
        }
        for(int i=0;i<(end1-begin1);i++){
            out[begin1+i+binarysearchfindlowerrank(in,(end2-begin2),in[begin1+i],begin2)]=in[begin1+i];
        }
        printf("\n[");
        for(int i=0;i<(end2-begin1);i++){
            printf("%d,",out[i+begin1]);
        }
        printf("]\n");
        printf("\nDATA:[");
        for (size_t idx = 0; idx < 11; ++idx){
             printf("%d,",data[idx]);
        }
        printf("]\n");
        printf("\nTMP:[");
        for (size_t idx = 0; idx < 11; ++idx){
             printf("%d,",tmp[idx]);
         }
         printf("]\n");
    }
}
bool myfunc (long i , long j){return (i<j);}
/**
  * sequential MergeSort
  */
void MsSequential(int *array, int *tmp, bool inplace, long begin, long end) {
  if (begin < (end - 1)) {
        long half =(begin+end) / 2;
        MsSequential(array, tmp, !inplace, begin, half);
        MsSequential(array, tmp, !inplace, half, end);

        if (inplace){
            MsMergeSequential(array, tmp, begin, half, half, end, begin,array,tmp);

        }
        else {
            MsMergeSequential(tmp, array, begin, half, half, end, begin,array,tmp);

        }

    }
    else if (!inplace) {
        tmp[begin] = array[begin];
        printf("\n[%d]\n",tmp[begin]);
        printf("\nDATA:[");
        for (size_t idx = 0; idx < 11; ++idx){
             printf("%d,",array[idx]);
         }
         printf("]\n");
         printf("\nTMP:[");
         for (size_t idx = 0; idx < 11; ++idx){
             printf("%d,",tmp[idx]);
         }
         printf("]\n");
    }
}

/**
  * Serial MergeSort
  */
void MsSerial(int *array, int *tmp, const size_t size) {

    MsSequential(array, tmp, true, 0, size);
}

/**

/**
  * @brief program entry point
  */
int main(int argc, char* argv[]) {
    // variables to measure the elapsed time
    struct timeval t1, t2;
    double etime;

    // expect one command line arguments: array size
    if (argc != 2) {
        printf("Usage: MergeSort.exe <array size> \n");
        printf("\n");
        return EXIT_FAILURE;
    }
    else {
        size_t stSize = strtol(argv[1], NULL, 10);
        int *data2 = (int*) malloc(stSize * sizeof(int));
        int *tmp = (int*) malloc(stSize * sizeof(int));
        int *ref = (int*) malloc(stSize * sizeof(int));
        printf("Initialization...\n");

        srand(95);


        for (size_t idx = 0; idx < stSize; ++idx){
            data2[idx] = (int) (stSize * (double(rand()) / RAND_MAX));
        }
        stSize=103;
        int data[103]{1919243808,365,372,383,394,404,413,170931,667182,540028976,741551154,774455913,774778478,778265971,840969273,892416309,942815278,1292503603,1667851881,1869182049,1919243808,35,20,20,50,50,52,53,54,54,55,55,58,63,64,64,65,388,401,65,71,73,73,74,75,75,77,289,290,290,292,300,303,308,308,308,267,312,312,314,314,315,323,323,325,330,332,333,338,339,347,347,361,221,372,383,394,404,331,331,413,2,741551154,892416309,875310137,808531502,540028976,778265971,1919243808,1667851881,1869182049,774778478,1292503603,774455913,667182,356,359,361,362,363,364,365,741684787};



        std::copy(data, data + stSize, ref);
        printf("START");
        printf("\n[");
        for (size_t idx = 0; idx < stSize; ++idx){
             printf("%d,",ref[idx]);
        }
        printf("]\n");
        printf("]");

        double dSize = (stSize * sizeof(int)) / 1024 / 1024;
        printf("Sorting %zu elements of type int (%f MiB)...\n", stSize, dSize);
        gettimeofday(&t1, NULL);

        MsSerial(data, tmp, stSize);

        gettimeofday(&t2, NULL);
        etime = (t2.tv_sec - t1.tv_sec) * 1000 + (t2.tv_usec - t1.tv_usec) / 1000;
        etime = etime / 1000;

        printf("done, took %f sec. Verification...", etime);
        printf("\n[");
        for (size_t idx = 0; idx < stSize; ++idx){
             printf("%d,",ref[idx]);
        }
        printf("]\n");
        printf("\n[");
        for (size_t idx = 0; idx < stSize; ++idx){
             printf("%d,",data[idx]);
        }
        printf("]\n");
        if (isSorted(ref, data, stSize)) {
            printf(" successful.\n");
        }
        else {
            printf(" FAILED.\n");
        }
        printf("\n[");
        for (size_t idx = 0; idx < stSize; ++idx){
             printf("%d,",ref[idx]);
        }
        printf("]\n");


        free(data2);
        //delete[] data;
        free(tmp);
        //delete[] tmp;
        free(ref);
        //delete[] ref;
    }

    return EXIT_SUCCESS;
}
New2Math
  • 141
  • 7
  • Without checking all that code: "_produces values that were not in the original array that should be sorted and because of that the compiler throws an error_" smells like _Undefined Behavior_. Have you tried using more compiler/debugging help? `-g -fsanatize=address,undefined` is usually a great help. – Ted Lyngmo Jul 22 '21 at 22:18
  • Maybe this is just a bad copy/paste, but the haphazard, random indentation makes the shown code mostly unreadable. Can you try to fix the formatting, and logically indent the shown code? – Sam Varshavchik Jul 22 '21 at 22:18
  • 1
    @New2Math This if statement in a for loop if(arr[n-1] < arr[n-2]) does not make a sense. – Vlad from Moscow Jul 22 '21 at 22:23
  • @VladfromMoscow I copied the code from this site here: https://www.tutorialspoint.com/program-to-check-if-an-array-is-sorted-or-not-iterative-and-recursive-in-c – New2Math Jul 22 '21 at 22:26
  • @SamVarshavchik I will do it now – New2Math Jul 22 '21 at 22:28
  • 1
    @New2Math Such a code as below could be written by a very low-qualified programmer. for (int i = 0; i < n; ++i){ //when an array is not in sorted order if(arr[n-1] < arr[n-2]) return 0; } – Vlad from Moscow Jul 22 '21 at 22:33
  • You're going to find that the Internet's mostly crap. A huge portion of Internet tutorials, perhaps the overwhelming majority, are written by programmers so bad that they lack the tools to understand how bad they are and think they're showing off their l33t skillz. If you sit down and punch a programming question into google, odds are extremely high that you'll find several tutorials that will make you a worse programmer before you find a tutorial that actually helps you improve. The best way to protect yourself from these folks is to be able to detect them. – user4581301 Jul 22 '21 at 22:52
  • The best way to protect yourself from these folks is to be able to detect them. That requires [reading some good books](https://stackoverflow.com/questions/388242) to learn the correct terminology (vital for successful web searches) and what good code and best practices look like in action. Then when you go hunting for tutorials not covered in the books, first find out who the domain experts are and then follow their tutorials or tutorials they suggest. – user4581301 Jul 22 '21 at 22:53
  • @user4581301 thanks I appreciate the link. I wanted to buy an online course but maybe it is better to start with some books – New2Math Jul 22 '21 at 22:55
  • @TedLyngmo when I try to use those flags i.e. enter the command:g++ -o mergesort -g -fsanitize=undefined mergesort.cpp , I get the following message: mergesort.cpp: In function 'int main(int, char**)': mergesort.cpp:311:5: internal compiler error: in pp_format, at pretty-print.c:630 int main(int argc, char* argv[]) { ^~~~ mergesort.cpp:311:5: internal compiler error: Aborted g++: internal compiler error: Aborted (program cc1plus) Please submit a full bug report, with preprocessed source if appropriate. See for instructions. – New2Math Jul 22 '21 at 22:58
  • "_internal compiler error_" should as the message says be reported to the compiler vendor. You may have a bug in your program but it should **never** crash your compiler. – Ted Lyngmo Jul 22 '21 at 23:05
  • What compiler and version is that? Also what operating system? ASAN support on Windows was pretty dodgy last time I looked. – user4581301 Jul 22 '21 at 23:05
  • @user4581301 OS:Microsoft Windows 10 Home,Version:Version:10.0.19042 Build 19042,Systemname DESKTOP-8ULV0MQ,Systemproducer:ASUSTeK COMPUTER INC.,Systemmodel:ZenBook UX434IQ_UM433IQ,Systemtyp:x64-based PC,Processor:AMD Ryzen 7 4700U with Radeon Graphics,2000 MHz,8 cores,8 logical processors,g++ (MinGW.org GCC-6.3.0-1) 6.3.0 – New2Math Jul 22 '21 at 23:13
  • I don't know how the combination works but "GCC-6.3.0" is rather old. – Ted Lyngmo Jul 22 '21 at 23:15
  • The compiler still shouldn't crash, but I don't think ASAN is an option for mingw's GCC6.3. You should just get a linker error because the ASAN library is missing. – user4581301 Jul 22 '21 at 23:17
  • `binarysearchfindlowerrank(int *in,int n,int value,int projection){` looks like it's missing the return type. Ditto `binarysearchfinduperrank`. – user4581301 Jul 22 '21 at 23:20
  • *"My mergesort algorithm works correctly, i.e. it sorts the values from the input array."* -- if this is accurate, why is your merge sort part of your question? From a first read, it looks like your question is why sorting with a library function introduces garbage values. If that's the case, drop your sort code and focus on sorting with the library function. – JaMiT Jul 22 '21 at 23:22
  • 1
    Build with `-pedantic -Wall -Wextra` and you should see a few more diagnostics. There are a couple `printf` calls using the wrong format argument for the variable they're given. `double dSize = (stSize * sizeof(int)) / 1024 / 1024;` is all integer math, so the fraction you probably want since you're assigning to a `double` is lost before it gets to the `double`. – user4581301 Jul 22 '21 at 23:26
  • I tried to compile your code, but ran into errors. Please provide a [mre] that [compiles without warnings](https://stackoverflow.com/questions/57842756/why-should-i-always-enable-compiler-warnings). – JaMiT Jul 22 '21 at 23:29
  • @JaMiT I will do what you suggest – New2Math Jul 22 '21 at 23:31
  • I have identified the error: The array ref has the size of the array data2, therefore there are more values in the array ref than in the array data and therefore when sorting the array it also looks at other values – New2Math Jul 22 '21 at 23:41

1 Answers1

1

The array ref has the size of the array data2, therefore there are more values in the array ref than in the array data and therefore when sorting the array ref it also looks at other values different from the input array

New2Math
  • 141
  • 7