Which of the Merge sort implementation is preferred...Why?

Question

I was implementing Merge sort in C++ and I came across an implementation on GeeksforGeeks. In my implementation, I calculated and passed the size of the array on each call. The implementation I saw on GFG was done by passing start and end indices on each call. This was the implementation on most of the websites.

Which method is preferred and why? Does the implementation on GFG better than mine.

GFG Implementation:

void merge(int arr[], int l, int m, int r) 
{ 
    int i, j, k; 
    int n1 = m - l + 1; 
    int n2 = r - m; 
  
    /* create temp arrays */
    int L[n1], R[n2]; 
  
    /* Copy data to temp arrays L[] and R[] */
    for (i = 0; i < n1; i++) 
        L[i] = arr[l + i]; 
    for (j = 0; j < n2; j++) 
        R[j] = arr[m + 1 + j]; 
  
    /* Merge the temp arrays back into arr[l..r]*/
    i = 0; // Initial index of first subarray 
    j = 0; // Initial index of second subarray 
    k = l; // Initial index of merged subarray 
    while (i < n1 && j < n2) { 
        if (L[i] <= R[j]) { 
            arr[k] = L[i]; 
            i++; 
        } 
        else { 
            arr[k] = R[j]; 
            j++; 
        } 
        k++; 
    } 
  
    /* Copy the remaining elements of L[], if there 
       are any */
    while (i < n1) { 
        arr[k] = L[i]; 
        i++; 
        k++; 
    } 
  
    /* Copy the remaining elements of R[], if there 
       are any */
    while (j < n2) { 
        arr[k] = R[j]; 
        j++; 
        k++; 
    } 
} 
  
/* l is for left index and r is right index of the 
   sub-array of arr to be sorted */
void mergeSort(int arr[], int l, int r) 
{ 
    if (l < r) { 
        // Same as (l+r)/2, but avoids overflow for 
        // large l and h 
        int m = l + (r - l) / 2; 
  
        // Sort first and second halves 
        mergeSort(arr, l, m); 
        mergeSort(arr, m + 1, r); 
  
        merge(arr, l, m, r); 
    } 
}

My Implementation:

void Merge(int *L, int *R, int nL, int nR, int *A)
{
    int i, j, k;
    i = j = k = 0;
    while (i < nL && j < nR)
    {
        if (L[i] <= R[j])
        {
            A[k] = L[i];
            i++;
        }
        else
        {
            A[k] = R[j];
            j++;    
        }
        k++;
    }
    while (i < nL)
    {
        A[k] = L[i];
        i++;    
        k++;    
    }
    while (j < nR)
    {
        A[k] = R[j];
        j++;        
        k++;
    }
}

void MergeSort(int *A, int n)
{
    int mid = n / 2;
    if (n < 2)
    {
        return;
    }
    int left[mid], right[n - mid]; 
    for (int i = 0; i < mid; i++)
    {
        left[i] = A[i];
    }
    for (int i = mid; i < n ;i++)
    {
        right[i-mid] = A[i];
    }
    int nL = sizeof(left) / sizeof(int);
    int nR = sizeof(right) / sizeof(int);
    MergeSort(left, nL);
    MergeSort(right, nR);
    Merge(left, right, nL, nR, A);
}

the implementation from geeks is non-standard C++ (as usual), thats certainly not prefered — 463035818_is_not_an_ai, Sep 30 '20 at 08:34
btw yours is neither. Read this: [Why aren't variable-length arrays part of the C++ standard?](https://stackoverflow.com/questions/1887097/why-arent-variable-length-arrays-part-of-the-c-standard) — 463035818_is_not_an_ai, Sep 30 '20 at 08:35
Yes, I can use vector instead of an array. If I use vector, vector.size() will give out the length. My question is, is it necessary to pass start and end indices like the GeeksForGeeks implementation is there any perks of passing start and end idices. — Muhamed Suhail, Sep 30 '20 at 09:20
the question you did ask is "Which method is preferred and why?" and I would reject both of them because I cannot compile them — 463035818_is_not_an_ai, Sep 30 '20 at 09:21
@idclev463035818 - std::sort isn't stable, merge sort is, so std::stable_sort. — rcgldr, Sep 30 '20 at 15:37
Preferred for what purpose, academic or actual usage? For actual usage, most libraries use some variation of a hybrid insertion sort and bottom up merge sort, with a one time allocation of the temporary array. Top down merge sort is mostly academic. — rcgldr, Sep 30 '20 at 15:39

chqrlie · Accepted Answer · 2020-10-03T11:10:14.820

There are 2 main differences between these implementations:

in the first merge function the r index is included in the slice, whereas in Merge you pass different arrays, each with its size. Passing the index to the last element is not idiomatic in C and C++ as it requires error prone +1/-1 adjustments and does not allow to specify empty slices especially if the index type is unsigned.
in the first implementation, the temporary arrays are allocated as VLAs, which by the way are not standard in C++ and only optional in C, in the merge function, so the maximum amount of temporary storage used during the sorting process is the same size as the original array (for the last merge). In the second implementation, temporary storage is allocated in the recursive MergeSort function, so the amount of memory used at any given point is larger, about twice the size of the original array.

An alternative more conservative approach would use the included/excluded convention and only allocate temporary storage for the left slice in the merge function, reducing the temporary storage requirements to 50% of the size of the original array.

void merge(int arr[], int l, int m, int r) { 
    int i, j, k; 
    int n1 = m - l; 
  
    /* create temp arrays: should allocate from the heap for large arrays */
    int L[n1]; 
  
    /* Copy data to temp arrays L[] and R[] */
    for (i = 0; i < n1; i++) {
        L[i] = arr[l + i];
    }

    /* Merge the temp arrays back into arr[l...r]*/
    i = 0; // Initial index of first subarray 
    j = l; // Initial index of second subarray 
    k = l; // Initial index of merged subarray 
    while (i < n1 && j < r) { 
        if (L[i] <= arr[j]) { 
            arr[k++] = L[i++]; 
        } else { 
            arr[k++] = arr[j++]; 
        } 
    } 
  
    /* Copy the remaining elements of L[], if there are any */
    while (i < n1) {
        arr[k++] = L[i++];
    }
  
    /* No need to copy the remaining elements of the right slice */
}
  
/* l is for left index and r is the index of the first element beyond the 
   sub-array of arr to be sorted */
void mergeSort(int arr[], int l, int r) { 
    if (r - l >= 2) { 
        // Same as (l+r)/2, but avoids overflow for large l and h 
        int m = l + (r - l) / 2; 
  
        // Sort first and second halves 
        mergeSort(arr, l, m); 
        mergeSort(arr, m, r); 
        merge(arr, l, m, r); 
    } 
}

void arraySort(int arr[], int length) {
    mergeSort(arr, 0, length);
}

Which of the Merge sort implementation is preferred...Why?

1 Answers1