5

Why are we always using quicksort? Or any specific sorting algorithm?

I tried an experiment on my PC using quicksort, merge sort, heap sort, and flash sort.

Results:

| Sorting algorithm | Time (nanoseconds) | Time (minutes) |
| ----------------- | ------------------ | ---------------- |
| Quick sort        | 135057597441       | 2.25095995735    |
| Flash sort        | 137704213630       | 2.29507022716667 |
| Merge sort        | 138317794813       | 2.30529658021667 |
| Heap sort         | 148662032992       | 2.47770054986667 |

I measured the time using Java's built-in function:

long startTime = System.nanoTime();
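For context, here is a minimal sketch of the kind of harness this implies; the array size, the random data generation, and the call to `Arrays.sort` are assumptions, since the question does not show the exact code used:

```java
import java.util.Arrays;
import java.util.Random;

public class SortBenchmark {
    public static void main(String[] args) {
        // 20,000,000 random integers, as described in the question
        int n = 20_000_000;
        int[] data = new Random().ints(n).toArray();

        long startTime = System.nanoTime();
        Arrays.sort(data);                          // swap in quick/merge/heap/flash sort here
        long elapsed = System.nanoTime() - startTime;

        // 60e9 nanoseconds per minute
        System.out.printf("time: %d ns -> %.6f min%n", elapsed, elapsed / 60e9);
    }
}
```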

The given times are in nanoseconds; converted to seconds, there is hardly any difference between them for 20,000,000 random integers (and the maximum array size in Java is 2,147,483,647). If we use an in-place algorithm, the difference may grow to 1 or 2 minutes up to the maximum array size.

If the difference is so small, why should we care?

asd
  • 215
  • 3
  • 9
  • 1
    well, for starters there is the difference about [sort stability](http://stackoverflow.com/questions/1517793/stability-in-sorting-algorithms). – Bartlomiej Lewandowski Jan 27 '14 at 20:15
  • 7
    That is *one* benchmark of *one* implementation. Also, "we" don't "always" use Quicksort; for example the standard sorting algorithm in Python is Timsort, a highly modified mergesort. –  Jan 27 '14 at 20:20
  • 3
    Mergesort [variants] seem much more common than quicksort in standard libraries/APIs. I left quicksort in school - and much prefer the consistent guarantees and stability of a mergesort (hint: what is the *worst* case for mergesort vs quicksort?). Also, check out comb-sort. – user2864740 Jan 27 '14 at 20:47
  • what was the range of the integers? – Karoly Horvath Jan 27 '14 at 21:14

4 Answers

8

All of the algorithms presented have similar average-case bounds of O(n lg n), which is the "best" a comparison sort can do.

Since they share the same average bounds, the expected performance of these algorithms over random data should be similar - which is what the findings show. However, the devil is in the details. Here is a very quick summary; follow the links for further details.

Quicksort is generally not stable (though there are stable variations). While quicksort has an average bound of O(n lg n), it has a worst-case bound of O(n * n); there are ways to mitigate this, such as randomized or median-of-three pivot selection. Quicksort, like heapsort, sorts in-place.
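As an illustration only (not code from the answer itself), here is a minimal in-place quicksort sketch in Java; the random pivot is one way to make the O(n * n) worst case unlikely:

```java
import java.util.Random;

public class QuickSort {
    private static final Random RND = new Random();

    public static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        // Random pivot keeps already-sorted input from degrading to O(n^2)
        swap(a, lo + RND.nextInt(hi - lo + 1), hi);
        int pivot = a[hi];
        int i = lo;                       // boundary of the "< pivot" region
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) swap(a, i++, j);
        }
        swap(a, i, hi);                   // put the pivot in its final position
        sort(a, lo, i - 1);               // recurse on the two partitions
        sort(a, i + 1, hi);
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```

Everything happens inside the input array plus a few indices, which is the in-place property mentioned above.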

Merge-sort is a stable sort. It has a worst-case bound of O(n lg n), which means it has predictable performance. Basic merge-sort requires O(n) extra space, so it is generally not an in-place sort (although there is an in-place variant, and the extra memory for a linked-list implementation is constant).
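Again purely as an illustration, a top-down merge-sort sketch: the `temp` buffer is the O(n) extra space, and the `<=` in the merge is what makes the sort stable:

```java
public class MergeSort {
    public static void sort(int[] a) {
        int[] temp = new int[a.length];   // the O(n) auxiliary buffer
        sort(a, temp, 0, a.length - 1);
    }

    private static void sort(int[] a, int[] temp, int lo, int hi) {
        if (lo >= hi) return;
        int mid = lo + (hi - lo) / 2;
        sort(a, temp, lo, mid);
        sort(a, temp, mid + 1, hi);
        merge(a, temp, lo, mid, hi);
    }

    private static void merge(int[] a, int[] temp, int lo, int mid, int hi) {
        System.arraycopy(a, lo, temp, lo, hi - lo + 1);
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if (i > mid)                 a[k] = temp[j++];
            else if (j > hi)             a[k] = temp[i++];
            else if (temp[i] <= temp[j]) a[k] = temp[i++]; // <= keeps equal keys in order: stable
            else                         a[k] = temp[j++];
        }
    }
}
```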

Heapsort is not stable; it also has a worst-case bound of O(n lg n), but has the benefit of a constant extra-memory bound and being in-place. It has worse cache behavior and parallelism than merge-sort.
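One more illustrative sketch (not from the answer): heapsort needs only the array itself plus a few indices, which is the constant-memory, in-place property just described:

```java
public class HeapSort {
    public static void sort(int[] a) {
        int n = a.length;
        // Build a max-heap bottom-up
        for (int i = n / 2 - 1; i >= 0; i--) siftDown(a, i, n);
        // Repeatedly move the current maximum to the end and shrink the heap
        for (int end = n - 1; end > 0; end--) {
            swap(a, 0, end);
            siftDown(a, 0, end);
        }
    }

    private static void siftDown(int[] a, int i, int size) {
        while (true) {
            int left = 2 * i + 1, right = left + 1, largest = i;
            if (left < size && a[left] > a[largest]) largest = left;
            if (right < size && a[right] > a[largest]) largest = right;
            if (largest == i) return;
            swap(a, i, largest);
            i = largest;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```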

Exactly which one is "best" depends upon the use-case, data, and exact implementation/variant.


Merge-sort (or a hybrid such as Timsort) is the "default" sort implementation in many libraries/languages. A common Quicksort-based hybrid, Introsort, is used in several C++ implementations. Vanilla/plain Quicksort implementations, should they be provided, are usually secondary implementations.

Merge-sort: a stable sort with consistent performance and acceptable memory bounds.

Quicksort/heapsort: trivially work in-place and [effectively] don't require additional memory.

user2864740
  • 60,010
  • 15
  • 145
  • 220
2

We rarely need to sort integer data. One of the biggest overheads in a sort is the time it takes to make comparisons, and quicksort requires far fewer comparisons than, say, a bubble sort. If you're sorting strings this is much more significant. As a real-world example, some years ago I wrote a sort/merge that took 40 minutes with a bubble sort and 17 with a quicksort. (It was a Z80 CPU a long time ago; I'd expect much better performance now.)
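A simple way to see the comparison-count point (this is an illustrative experiment, not the answerer's original code) is to count comparisons with a plain counter for a quadratic sort and a counting `Comparator` for the library sort:

```java
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

public class ComparisonCount {
    public static void main(String[] args) {
        Integer[] data = new Random().ints(10_000).boxed().toArray(Integer[]::new);

        // Bubble sort: roughly n^2 / 2 comparisons
        long bubbleCmps = 0;
        Integer[] a = data.clone();
        for (int i = 0; i < a.length - 1; i++) {
            for (int j = 0; j < a.length - 1 - i; j++) {
                bubbleCmps++;
                if (a[j] > a[j + 1]) { Integer t = a[j]; a[j] = a[j + 1]; a[j + 1] = t; }
            }
        }

        // Library object sort (a merge-sort variant): about n lg n comparisons
        AtomicLong libCmps = new AtomicLong();
        Integer[] b = data.clone();
        Arrays.sort(b, (x, y) -> { libCmps.incrementAndGet(); return x.compareTo(y); });

        System.out.println("bubble sort comparisons:  " + bubbleCmps);
        System.out.println("library sort comparisons: " + libCmps);
    }
}
```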

  • 4
    Increase the data-size and a bubble vs. a sort with better bounds still matters. Thus the very basis of this answer is misleading. Also this question asks about different sort algorithms *with similar performance bounds* (and why/when one is preferable over the other even though they all appear "the same speed"), and not a degenerate sorting algorithm. – user2864740 Jan 27 '14 at 20:49
  • Obama also thinks bubble sort is degenerate: https://www.youtube.com/watch?v=koMpGeZpu4Q – ThinkBonobo Apr 21 '21 at 19:50
1

Your conclusion is correct: most people who do care about this, in most situations, are wasting their time. Differences between these algorithms in terms of time and memory complexity become significant only in particular scenarios where:

  • you have huge number of elements to sort

  • performance is really critical (for example: real-time systems)

  • resources are really limited (for example: embedded systems)

(please note the really)

Also, there is the concern of stability, which may matter more often. Most standard libraries provide stable sort algorithms (for example: OrderBy in C#, std::stable_sort in C++, sort in Python, the object-sorting methods in Java).
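To make the stability point concrete, here is a small illustrative example in Java (the `Person` record and the data are made up for the demo): with a stable sort, elements with equal keys keep their original relative order:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class StableSortDemo {
    record Person(String name, int age) {}

    public static void main(String[] args) {
        List<Person> people = new ArrayList<>(List.of(
                new Person("Alice", 30),
                new Person("Bob", 25),
                new Person("Carol", 30),
                new Person("Dave", 25)));

        // List.sort on objects uses a stable merge sort (Timsort),
        // so Bob stays before Dave and Alice stays before Carol.
        people.sort(Comparator.comparingInt(Person::age));

        people.forEach(p -> System.out.println(p.name() + " " + p.age()));
        // Output: Bob 25, Dave 25, Alice 30, Carol 30
    }
}
```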

BartoszKP
  • 34,786
  • 15
  • 102
  • 130
  • there could also be a huge difference if data set is mostly sorted. – Karoly Horvath Jan 27 '14 at 21:13
  • @KarolyHorvath If none of the conditions I gave is met it would be also completely irrelevant. In most applications there are usually hundreds of other things to improve/optimize, and sorting an n-th element list when n is small is last of them, regardless of if it's mostly sorted or not. And even more - when you have large number of records you typically use a DB, and you sort the data via SQL. Still, you optimize the number of roundtrips to the DB then, not the sorting algorithm... – BartoszKP Jan 27 '14 at 21:17
  • ok, then here's a better one: when your system is memory constrained (eg: embedded system) - you probably want an in place sort. – Karoly Horvath Jan 27 '14 at 21:22
  • @KarolyHorvath Good point. I had one for time complexity, but I omitted memory complexity. I've added this to the list. – BartoszKP Jan 27 '14 at 21:25
0

Correctness. While switching between sort algorithms might offer speed-ups in some specific scenarios, the cost of proving that an algorithm works can be quite high.

For instance, TimSort, a popular sorting algorithm used by Android, Java, and Python, had an implementation bug that went unnoticed for years. This bug could cause a crash and was easily induced by the user.

It took a dedicated team "looking for a challenge" to isolate and solve the issue.

For this reason, any time a standard implementation of a data structure or algorithm is available, I will use that standard implementation. The time saved by using a smarter implementation is rarely worth uncertainty about the implementation's security and correctness.

Richard
  • 56,349
  • 34
  • 180
  • 251