
I created a test on jsperf.com for 3 sorting methods: Bubble, Insertion and Merge. Link

Before the test I create an unsorted array with random numbers from 0 to 1 million. Each time the test shows that Insertion sort is faster than Merge sort. What is the reason for such a result, if Merge sort's time is O(n log(n)) while Insertion and Bubble sorts are O(n^2)? Test result here

Malik Nur
  • Interesting. I don't see anything immediately obvious, but all can be optimized, improving their times. Your merge sort is not in place, thus requiring millions of allocations, so it's not shocking that that one is slow. – Mooing Duck Nov 02 '15 at 05:38
  • So you mean that in this specific case, sorting an array of numbers, Merge sort is not the proper solution? – Malik Nur Nov 02 '15 at 05:44
  • Merge sort is a good solution; it's just harder to implement optimally than insertion sort. – Amadan Nov 02 '15 at 05:52

2 Answers


Without more testing, a tentative answer:

Your insertion sort is fairly optimised - you are only swapping elements. Your merge sort instantiates new arrays using [], and creates new arrays with slice and concat, which is a large memory-management overhead, not to mention that concat and slice have implicit loops inside them (although in native code). Merge sort is efficient when it is done in place; with all the copying going on, that should slow you down a lot.
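
For illustration only (your jsperf code may differ; the merge and mergeSort names here are just placeholders), a typical slice/concat implementation allocates new arrays in every single call:

function merge(left, right) {
    var result = [];                        // a new array for every merge
    var i = 0, j = 0;
    while (i < left.length && j < right.length) {
        result.push(left[i] <= right[j] ? left[i++] : right[j++]);
    }
    return result.concat(left.slice(i), right.slice(j));  // two more copies
}

function mergeSort(a) {
    if (a.length < 2) return a;
    var mid = a.length >> 1;
    return merge(mergeSort(a.slice(0, mid)),              // each call copies
                 mergeSort(a.slice(mid)));                //   both halves
}

Every one of those temporary arrays is work the insertion sort simply never does.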

Amadan
  • Maybe I need to rewrite Merge sort without concat and slice methods? – Malik Nur Nov 02 '15 at 05:47
  • @MalikNur: And probably reusing the same few arrays each time. I think the memory cache is what's killing it. – Mooing Duck Nov 02 '15 at 05:48
  • Yes. http://stackoverflow.com/questions/2571049/how-to-sort-in-place-using-the-merge-sort-algorithm has good discussion, and links to papers describing in-place merge sort algorithms. The best one, AFAIK, is to have two arrays of the same size, where one becomes the working area to avoid excess swappage. – Amadan Nov 02 '15 at 05:51

As commented by Amadan, it would be best for merge sort to do a one-time allocation of a working array the same size as the array to be sorted. Top-down merge sort uses recursion to generate the indices used by merge, while bottom-up skips the recursion and uses iteration to generate the indices. Most of the time will be spent doing the actual merging of sub-arrays, so top-down's extra overhead on larger arrays (1 million elements or more) is only about 5%.

Example C++ code for a somewhat optimized bottom-up merge sort:

#include <cstddef>                          // size_t
#include <utility>                          // std::swap

// forward declarations
void BottomUpMergeSort(int a[], int b[], size_t n);
size_t GetPassCount(size_t n);
void BottomUpCopy(int a[], int b[], size_t ll, size_t rr);
void BottomUpMerge(int a[], int b[], size_t ll, size_t rr, size_t ee);

void MergeSort(int a[], size_t n)           // entry function
{
    if(n < 2)                               // if size < 2 return
        return;
    int *b = new int[n];
    BottomUpMergeSort(a, b, n);
    delete[] b;
}

size_t GetPassCount(size_t n)               // return # passes
{
    size_t i = 0;
    for(size_t s = 1; s < n; s <<= 1)
        i += 1;
    return(i);
}

void BottomUpMergeSort(int a[], int b[], size_t n)
{
    size_t s = 1;                           // run size
    if(GetPassCount(n) & 1){                // if odd number of passes
        for(s = 1; s < n; s += 2)           // swap in place for 1st pass
            if(a[s] < a[s-1])
                std::swap(a[s], a[s-1]);
        s = 2;
    }
    while(s < n){                           // while not done
        size_t ee = 0;                      // reset end index
        while(ee < n){                      // merge pairs of runs
            size_t ll = ee;                 // ll = start of left  run
            size_t rr = ll+s;               // rr = start of right run
            if(rr >= n){                    // if only left run
                rr = n;
                BottomUpCopy(a, b, ll, rr); //   copy left run
                break;                      //   end of pass
            }
            ee = rr+s;                      // ee = end of right run
            if(ee > n)
                ee = n;
            // merge a pair of runs
            BottomUpMerge(a, b, ll, rr, ee);
        }
        std::swap(a, b);                    // swap pointers so a refers to the merged data
        s <<= 1;                            // double the run size
    }
}

void BottomUpCopy(int a[], int b[], size_t ll, size_t rr)
{
    while(ll < rr){                         // copy left run
        b[ll] = a[ll];
        ll++;
    }
}

void BottomUpMerge(int a[], int b[], size_t ll, size_t rr, size_t ee)
{
    size_t o = ll;                          // b[]       index
    size_t l = ll;                          // a[] left  index
    size_t r = rr;                          // a[] right index
    while(1){                               // merge data
        if(a[l] <= a[r]){                   // if a[l] <= a[r]
            b[o++] = a[l++];                //   copy a[l]
            if(l < rr)                      //   if not end of left run
                continue;                   //     continue (back to while)
            while(r < ee)                   //   else copy rest of right run
                b[o++] = a[r++];
            break;                          //     and return
        } else {                            // else a[l] > a[r]
            b[o++] = a[r++];                //   copy a[r]
            if(r < ee)                      //   if not end of right run
                continue;                   //     continue (back to while)
            while(l < rr)                   //   else copy rest of left run
                b[o++] = a[l++];
            break;                          //     and return
        }
    }
}
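
For comparison, a rough, untested JavaScript sketch of the same idea in the question's setting: one work array allocated up front, iterative bottom-up passes, and no slice or concat. The mergeSort name and plain number-array input are just assumptions for the sketch, and pass parity is handled by copying back at the end instead of the swap trick above.

function mergeSort(a) {
    var n = a.length;
    if (n < 2) return a;
    var b = new Array(n);                           // single work array, allocated once
    var src = a, dst = b;
    for (var width = 1; width < n; width *= 2) {
        for (var lo = 0; lo < n; lo += 2 * width) {
            var mid = Math.min(lo + width, n);      // end of left run
            var hi = Math.min(lo + 2 * width, n);   // end of right run
            var l = lo, r = mid;
            for (var o = lo; o < hi; o++)           // merge src[lo..mid) and src[mid..hi) into dst
                dst[o] = (r >= hi || (l < mid && src[l] <= src[r])) ? src[l++] : src[r++];
        }
        var t = src; src = dst; dst = t;            // swap roles for the next pass
    }
    if (src !== a)                                  // odd number of passes: copy back into a
        for (var i = 0; i < n; i++) a[i] = src[i];
    return a;
}
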
rcgldr