Iterative Merge Sort, works same speed as Bubblesort

Question

I have tried to implement an iterative Merge sort using nested loops. Although this algorithm does sort correctly (as in after sorting things are in correct order), I know there is something wrong with this implementation as I tried to sort larger collections with it and compare timings with slower sorts, and I end up getting slow times for this iterative implementation. For example, sorting 500 items gives a time of 31 milliseconds with this implementation just like bubble sort does.

  int main()
  {

      int size;
      cin >>  size;
      //assume vector is already initialized with values & size
      vector<int> items(size);
      IterativeMergeSort(items, 0, size - 1);
  }
  void IterativeMergeSort(vector<int> &items, int start, int end)
  {
    vector<int> temp(items.size());
    int left, middle, right;
    for(int outer = 1; outer < 2; outer *= 2)
    {
    for(int inner = start; inner < end; inner = inner * outer + 1)
    {
        left = outer - 1;
        middle = inner;
        right = inner + 1;
        ItMerge(items, left, middle, right, temp);
    }
   }
 }
  void ItMerge(vector<int> &items, int start, int mid, int end, vector<int> &temp)
{
int first1 = start;
int last1 = mid;
int first2 = mid + 1;
int last2 = end;

int index = first1;
while(first1 <= last1 && first2 <= last2)
{
    if(items[first1] <= items[first2])
    {
        temp[index] = items[first1];
        first1++;
    }
    else
    {
        temp[index] = items[first2];
        first2++;
    }
    index++;
}
while(first1 <= last1)
{
    temp[index] = items[first1];
    first1++;
    index++;
}
while(first2 <= last2)
{
    temp[index] = items[first2];
    first2++;
    index++;
}
for(index = start; index <= end; index++)
{
    items[index] = temp[index];
}
}

Haven't looked closely enough at your code, but for this sort of thing, 500 is not "large". Try a few 10s of thousands. — BoBTFish, Feb 27 '15 at 09:34
`for(int outer = 1; outer < 2; outer *= 2)` - This will run exactly once. Not sure what this is supposed to achieve. — Sebastian Redl, Feb 27 '15 at 17:00

score 2 · Answer 1 · answered Feb 27 '15 at 17:11

Your algorithm isn't merge sort. It tries to be, but it isn't.

As I understand it, what is supposed to happen is that the inner loop steps over subsequences and merges them, while the outer loop controls the inner loop's sequence length, starting with 1 and doubling on every iteration until there are just two subsequences and they get merged.

But that's not what your algorithm is doing. The outer loop's condition is broken, so the outer loop will run exactly once. And the inner loop doesn't take roughly-equal subsequences in pairs. Instead, the right subsequence is exactly one element (mid is inner, right is inner+1) and the left subsequence is always everything used so far (left is outer-1, and outer is constant 1). So the algorithm will repeatedly merge the already-sorted left subsequence with a single-element right subsequence.

This means that in effect, your algorithm is insertion sort, except that you don't insert in place, but instead copy the sorted sequence to a buffer, inserting the new element at the right moment, then copy the result back. So it's a very inefficient insertion sort.

score 0 · Answer 2 · edited May 23 '17 at 10:31

Below is a link to somewhat optimized examples of top down and bottom up merge sort. The bottom up merge sort is a bit faster because it skips the recursive sequence used to repeated generate sub-pairs of indexes until a sub-pair represents a run of size 1. Most of the time is spent merging, so bottom up isn't that much faster. The first pass of the bottom up merge pass could be optimized by swapping pairs in place rather than copying them. The bottom up merge sort ends up with the sorted data in either the temp or original array. If the original array is wanted, then a pass count can be calculated and if the count is odd, then the first pass swaps in place.

Both versions can sort 4 million 64 bit unsigned integers in less than a second on my system (Intel Core i7 2600k 3.4ghz).

merge_sort using vectors works well with less than 9 inputs

For a vector or array of integers, a counting / radix sort would be faster still.

score 0 · Accepted Answer · edited May 23 '17 at 11:46

I've finally figured it out.

In pseudocode:

  for( outer = 1, outer < length, outer *=2)
     for(inner = 0; inner < length, inner = inner + (outer *2))
        left = inner
        middle = (inner + outer) - 1
        right = (inner + (outer * 2)) - 1
        merge(items, left, middle, right, temp)

After rethinking how the iterative merge sort is supposed to work and looking at a couple implementations, in the merge method, all I needed was to check if the middle and right indexes passed in were greater than or equal to the vector size (that way we handle any values that could out of bounds), then merge as usual. Also, looking at this helped greatly understand it; also this. Just to be sure that it works as well as a recursive Merge Sort, I did timings on both and both (recursive and iterative) implementations produced identical times for 500,1000,5000, and 10K values to sort (in some cases the iterative solution produced a faster time).

I'd take another look - your calculations are off a little.Your 'right' can sometimes not reach the end of the vector, and sometimes can go beyond the end of the vector (eg array size of 17 it's too short and array size of 30 it's too long). — IanM_Matrix1, Mar 03 '15 at 08:14
Yes, I understand that. But as I stated before, in the merge method, I always check if the right or the middle indexes exceed the vector bounds and handle those appropriately. — Karim O., Mar 04 '15 at 01:37

Iterative Merge Sort, works same speed as Bubblesort

3 Answers3