
SO Posts

When to use merge sort and when to use quick sort?

Quick Sort Vs Merge Sort

Wikipedia

http://en.wikipedia.org/wiki/Merge_sort

http://en.wikipedia.org/wiki/Quicksort

quick_sort is supposed to have a worst case of O(n^2), but merge_sort is supposed to have no such worst case and always be O(n log n). I thought it was dependent upon the ordering of the data set - reverse order, forward order, or random - but when I run a test... quick_sort is always faster. The code I used is below:

/*
Needs a resize function added
*/
#include <iostream>   // cout, endl
#include <cstdlib>    // rand, srand
#include <ctime>      // time, clock, CLOCKS_PER_SEC
using namespace std;
#include "c_arclib.cpp"
template <class T> class dynamic_array
  {
  private:
    T* array;
    T* scratch;
  public:
    int size;
    dynamic_array(int sizein)
      {
      size=sizein;
      array = new T[size]();
      }
    void print_array()
      {
      for (int i = 0; i < size; i++) cout << array[i] << endl;
      }
    void merge_recurse(int left, int right)
      {
      if(right == left + 1)
        {
        return;
        }
      else
        {
        int i = 0;
        int length = right - left;
        int midpoint_distance = length/2;
        int l = left, r = left + midpoint_distance;
        merge_recurse(left, left + midpoint_distance);
        merge_recurse(left + midpoint_distance, right);
        for(i = 0; i < length; i++)
          {
          if((l < (left + midpoint_distance)) && (r == right || array[l] <= array[r])) // <= (not >) so the merge is ascending and stable
            {
            scratch[i] = array[l];
            l++;
            }
          else
            {
            scratch[i] = array[r];
            r++;
            }
          }
        for(i = left; i < right; i++)
          {
          array[i] = scratch[i - left];
          }
        }
      }
    int merge_sort()
      {
      scratch = new T[size]();   // note: plain new throws std::bad_alloc on failure rather than returning NULL
      if(scratch != NULL)
        {
        merge_recurse(0, size);
        delete[] scratch;        // release the temporary buffer
        return 1;
        }
      else
        {
        return 0;
        }
      }
    void quick_recurse(int left, int right) 
      {  
      int l = left, r = right, tmp;
      int pivot = array[(left + right) / 2];
      while (l <= r)
        {
        while (array[l] < pivot)l++;
        while (array[r] > pivot)r--;
        if (l <= r) 
          {
          tmp = array[l];
          array[l] = array[r];
          array[r] = tmp;
          l++;
          r--;
          }
        }
      if (left < r)quick_recurse(left, r);
      if (l < right)quick_recurse(l, right);
      }  
    void quick_sort()
      {
      quick_recurse(0, size - 1);   // right is an inclusive index; passing size reads past the end
      }
    void rand_to_array()
      {
      srand(time(NULL));
      T* k;   // T*, not int*, so the template works for any element type
      for (k = array; k != array + size; ++k)                                             
        { 
        *k=rand();                                      
        } 
      }
    void order_to_array()
      {
      T* k;
      int i = 0;
      for (k = array; k != array + size; ++k)                                             
        { 
        *k=i;
        ++i;        
        } 
      }
    void rorder_to_array()
      {
      T* k;
      int i = size;
      for (k = array; k != array + size; ++k)                                             
        { 
        *k=i;
        --i;        
        } 
      }
  };
int main()
  {
  dynamic_array<int> d1(1000000);
  d1.order_to_array();
  clock_t time_start=clock();
  d1.merge_sort(); 
  clock_t time_end=clock();
  double result = (double)(time_end - time_start) / CLOCKS_PER_SEC; 
  cout << result;
  }
  • And that's why the time complexity of an algorithm is calculated on the number of operations and not on the time... :) – Aurelio De Rosa Nov 02 '11 at 01:26
  • Use `std::vector`, not a custom implementation. – GManNickG Nov 02 '11 at 01:31
  • What is your point? Are you looking for data sets that make quick sort perform slower than merge sort? – President James K. Polk Nov 02 '11 at 01:33
  • Beyond algorithmic complexity, constant factors matter a lot. In CPython, [timsort](http://hg.python.org/cpython/file/b5640e74aa5c/Objects/listsort.txt) has been carefully tuned to be faster than a simple mergesort (in terms of absolute time) at most typical Python workloads, but it's still O(n log n) in the general case. – ephemient Nov 02 '11 at 02:01

3 Answers


The worst case for quick sort is when the pivot element is the largest or smallest element in the array on every recursion. In that case you will have to do n-1 recursions (one of the subarrays you split off always has only one element), which gives you O(n^2) overall.

You can reproduce the worst case for quick sort if you use an already sorted array and pick the first or last element as pivot element.

BrokenGlass
  • Wow..I will have to try hard to see this n squared effect –  Nov 02 '11 at 01:36
  • Also remember that this doesn't capture the constant factor `c` - for a smaller array this might actually mask the effect since the inner loop for quick sort is faster/has less statements in most implementations than merge sort. – BrokenGlass Nov 02 '11 at 01:39
  • @ChrisAaker: It's tough to judge: sort algorithms, especially quicksort, are hard to get completely right / optimize - there have been many flawed implementations in the past. It's much safer to stick with a library-provided one. – BrokenGlass Nov 02 '11 at 01:46
  • There are many techniques invented over the years to prevent Quicksort from degrading to the worst case. – Mark Ransom Nov 02 '11 at 03:09
  • @ChrisAaker, you won't have to try hard to get the n^2 effect if you are working in the context of an application that regularly sorts a nearly sorted array. Java uses mergesort + insertion sort, but modifies the merge to be a no-op if the largest element in the lower array is smaller than the smallest element in the upper array (ie, it is already in order). Using insertion sort means the sort can be O(nlogn) and avoiding the merge operation (where possible) reduces that C factor that consumes time. – Tim Bender Nov 02 '11 at 03:10
  • @ChrisAaker, it's not hard to create a worst-case dataset for a __naive__ _quicksort_ implementation. This is why, if you are using C++, you should consider the library `std::sort` as your first choice for ordering an array or vector. It's typically designed to avoid degenerate behavior. The MSVC++ `std::sort` is a combination of _quicksort_ with median-of-nine partitioning, a _heap sort_ fallback when recursion exceeds some limit, and _insertion sort_ for small partitions. – Blastfurnace Nov 02 '11 at 18:49

Merge sort works very well for data that won't fit into memory, because each pass is linear and can be read/written to disk. Quick sort isn't even an option in that case, although the two may be combined - quick sort blocks that fit into memory, and merge sort those blocks until done.

Mark Ransom

Consider the container type as well - mergesort will work much better with a linked list, because you can split the list into equal parts by just traversing it and assigning nodes to alternate sublists; rearranging things around a pivot for quicksort is considerably more involved.

Karl Knechtel
  • I would say quicksorting lists is actually easier than mergesorting them -- at each stage, just build 2 separate lists (all elements below pivot and all elements above pivot), recursively quicksort each one, then join the resulting 2 sorted lists together with the pivot in between. – j_random_hacker Nov 03 '11 at 12:24