6

I'm trying to implement merge sort but somehow have been stuck for a day.

[Sorry for pasting a big amount of code but wouldn't be able to do without it]

Implementation:

Merge Sort - Function

int mergeSort(int arr[], int low, int high)
{
    int half = (low + high) / 2;    /* Find the middle element */ 
    if (low < high) {
        if (high - low == 1)        /* If only one element, return */
            return;
        mergeSort(arr, low, half);  /* Sort First Half */
        mergeSort(arr, half, high); /* Sort Second Half */
        merge(arr, low, half, high); /* Merge */
    }
    return SUCCESS;
}

Merging Step - Merge Function

int merge(int arr[], int low, int half, int high)
{
    int i = 0, j = 0, k = 0, l = 0, temp;    /* Initialize indices */

    for (i = low, j = half; i < half && j < high; i++, j++) { 
        /* Swap in the array itself if the elements are out of order */
        if (arr[i] > arr[j]) {
            temp = arr[i];
            arr[i] = arr[j];
            arr[j] = temp;
        }
    }

    i = 0, j = low;  /* Compare and copy the elements in a global arrray C */
    for (k = 0; k < SIZE && i < low && j < high; k++) {
        if (arr[i] < arr[j]) {
            c[k] = arr[i];
            i++;
        } else {
            c[k] = arr[j];
            j++;
        }
    }
    
    if (i < j) { /* Copy remaining elements to the end of the array C */
        for (l = i; l < low; l++) {
            c[k++] = arr[l];
        }
    } else {
        for (l = j; l < high; l++) {
            c[k++] = arr[l];
        }
    }

Output

8 --> 9 --> 
4 --> 5 --> 8 --> 9 --> 
4 --> 5 --> 8 --> 9 --> 
1 --> 2 --> 4 --> 5 --> 8 --> 9 --> /* Sorting when left is sorted but right is in the process ? */
4 --> 5 --> 6 --> 7 --> 8 --> 9 --> 1 --> 2 --> 
1 --> 2 --> 4 --> 5 --> 6 --> 7 --> 8 --> 9 --> 
1 --> 2 --> 6 --> 7 --> 4 --> 5 --> 8 --> 9 --> 

     

Problem Description

I'm not using any local array for storing the elements of the array. Instead the elements are swapped if they are out of order and then copied into a global array by comparison Example: If I have the array

{ 9, 8, 4, 5, 2, 1, 7, 6 }

Then, firstly {8,9} would be sorted and then {4,5} would be sorted and then when copying procedure i.e {8,4} would be compared and so on. The following recursive calls take place

mergeSort(0,8) -> mergeSort(0,4) , mergeSort(4,8), merge(0,4,8)
mergeSort(0,4) -> mergeSort(0,2) , mergeSort(2,4), merge(0,2,4)
mergeSort(0,2) -> 1 element calls
and so on.

When merge is called when {4,5,8,9} are sorted , and the right is called merge(4,5,6) I have the following array

{4,5,8,9,1,2,7,6}

So, the algo tries to sort {4,5,8,9} and {1,2} when {1,2} is not a part of this subproblem i.e I want {1,2} and '{6,7} to become {1,2,6,7} and then combine both of these.

How do I overcome this? I have been stuck for so many hours.

Thanks a lot

chqrlie
  • 131,814
  • 10
  • 121
  • 189
Hooli
  • 711
  • 2
  • 13
  • 24
  • 1
    related: http://stackoverflow.com/q/2126219/951890 – Vaughn Cato Sep 03 '13 at 20:57
  • 1
    It's not that simple. Take a look at a working implementation [here](http://thomas.baudel.name/Visualisation/VisuTri/inplacestablesort.html). – Sergey Kalinichenko Sep 03 '13 at 21:00
  • What exactly is the question here? Do you have a known-good algorithm for in-place merge, but your implementation is wrong? Or are you looking for a working algorithm? – Oliver Charlesworth Sep 03 '13 at 21:00
  • My implementation seems to be faulty. I'm just not using temporary arrays. Just using indices to copy into global array.I think the line 'i = 0, j = low' is creating a problem when I'm comparing the different parts if the original array that is swapped. – Hooli Sep 03 '13 at 21:02
  • Ok, well then this is where debugging comes into play. Just step through your code (or add print statements), and compare its behaviour to what you expect should happen. At some point they will diverge, and then you've found your bug. – Oliver Charlesworth Sep 03 '13 at 21:04
  • I have tried hard for that before posting. Actually, when the left array is sorted till 4 elements, the right array starts sorting but before it sorts itself fully, it becomes a part of the total problem not 8->4->2->1 but 6->4>2->1 that is 6 elements are attempted to being sorted. Any insights would be greatly helpful – Hooli Sep 03 '13 at 21:07
  • Like I said, step through this line by line, and compare the values of intermediate variables, etc. to the expected behaviour (i.e. figure it out on paper first). At some point, these **must** differ. (This is much more productive than asking randoms on Stack Overflow to debug your code for you ;) ) – Oliver Charlesworth Sep 03 '13 at 21:17
  • With in-place merge sort, you end up having a much higher time complexity -- from O(nlogn) to O(n*n). I am hoping you are aware of that. – Manoj Pandey Sep 03 '13 at 21:25
  • It sounds trivial, but looks can be deceiving. Several approaches have been published, [this being one of the more comprehendible](http://akira.ruc.dk/~keld/teaching/algoritmedesign_f04/Artikler/04/Huang88.pdf). And @ManojPandey, that is not the case if implemented properly. More tedious than tmp-storage mergesort, certainly. relegated to no-better-than-bubblesort? Not hardly. – WhozCraig Sep 03 '13 at 22:10
  • I understood it and realized why inplace mergesort is so difficult to implement. Did it with my own implementation by copying the temporary array into the original which was the main cause of the problem. Thank you so much for your answers. – Hooli Sep 04 '13 at 00:30
  • 1
    at second glance, this is *not* an attempt to implement merge sort with (much) less additional memory than O(n). It is just fallaciously using `a global array`, open coded array copies, … – greybeard Nov 25 '17 at 10:40
  • @ManojPandey: techniques are known to perform in-place merging of two sequences in linear time. This allows a true in-place MergeSort in linlog time. –  Jun 29 '22 at 13:31

2 Answers2

0

In-place MergeSort is rarely, if ever used, because an in-place Merge operation, though possible, is quite complicated compared to the obvious version using an extra buffer.

Though rarely suggested, an efficient version can be obtained by alternating the merges between the original and extra buffers, to minimize the number of moves. Care must be taken that an even number of merges be made across the recursive calls. This can be handled by implementing true in-place merges for very small number of elements (or alternative sorts).

0

Your algorithm is flawed:

  • the mergeSort function cannot be called with an empty slice: if (high - low == 1) should be changed to if (high - low < 2). As coded, an empty slice (low == high) will cause an infinite recursion triggering a stack overflow.
  • the first loop in the merge function is incorrect: swapping the elements pairwise in the initial loop may cause the halves to become unsorted: merge({1, 4, 1, 2, 3}, 0, 2, 5) changes the halves to {1, 2} and {1, 4, 3}, which is no longer sorted.
  • the merge loop into the global array c stops at SIZE elements (probably the length of c) so slices larger than SIZE are not handled.
  • furthermore, you do not test this limit in the subsequent loops, potentially causing a buffer overflow for large slices.
  • the test if (i < j) is incorrect: it should be if (i < half). As coded, the remaining elements in the second half are only copied if they are all greater than the first half.
  • you never copy back from the global array to the original slice, hence the merge is ineffective.
  • Your claim Merging in merge sort without using temporary arrays is unwarranted: the global array c is a temporary array.

Implementing merge sort in place without temporary arrays is difficult and less efficient than other sorting methods. There is no need for 2 temporary arrays, one with a length of half - low suffices, but allocating it with automatic storage is risky for large sorts and allocating from the heap is slow if performed at each step. Efficient and safe implementations allocate a single temporary array of length len / 2 at the start of the sort, passing it recursively along the steps and using it in the merge phases. At the cost of twice the space, merge sort can be implemented by alternating source and destination arrays when recursing.

Merge sort can be implemented simply with a single temporary array of limited size, increasing the complexity to O(n2) but only for very large arrays and a slow degradation of performance as n increases.

chqrlie
  • 131,814
  • 10
  • 121
  • 189