23

Why? Is it faster or more efficient?

For systems with one core, we can use quicksort. What should we use on systems with two cores, four cores, or eight cores?

Matthias Braun
  • 32,039
  • 22
  • 142
  • 171
mr. Vachovsky
  • 1,128
  • 2
  • 13
  • 23
  • 15
    Arrays.sort() only uses quicksort on primitive types. For Object[] or collections (Collections.sort()) merge sort is used. – Björn Nov 29 '10 at 15:14
  • 4
    ... because it's stable, whereas quicksort isn't. For primitive types it doesn't matter whether a sort is stable or not, because there are no values which compare unequal but have the same sort order. Floating point types raise some complication there, but not enough to require merge sort. – Steve Jessop Nov 29 '10 at 15:18
  • 4
    besides the reasons below, quicksort, being a divide & conquer algorithm, also exhibits good cache behaviour. – lijie Nov 29 '10 at 15:20
  • If you see the Arrays.java class, you can see that for a small array (length < 7) they use Insertion Sort, otherwise Quick Sort. – Naren Feb 23 '15 at 22:41
  • 2
    [They are similar questions, could be reference][1] [1]: http://stackoverflow.com/questions/3707190/why-java-arrays-use-two-different-sort-algorithms-for-different-types – Erik Johnsson Mar 09 '15 at 21:59

12 Answers12

36

Quicksort has the advantage of being completely in place, so it does not require any additional storage, while mergesort (which is actually used by Arrays.sort() for object arrays) and other (all?) guaranteed O(n*log n) algorithm require at least one full copy of the array. For programs that sort very large primitive arrays, that means potentially doubling the overall memory usage.

Sean Patrick Floyd
  • 292,901
  • 67
  • 465
  • 588
Michael Borgwardt
  • 342,105
  • 78
  • 482
  • 720
  • +1, for both the memory point, and the primitive vs object difference – Bozho Nov 29 '10 at 15:24
  • 6
    Heapsort is an in-place sort with worst-case O(n log n). (As are variants and hybrids like smoothsort and introsort.) – Gareth Rees Nov 29 '10 at 16:26
  • Isn't only the copy of half of the array required in mergesort? – Maciej Hehl Nov 29 '10 at 18:12
  • Next, you might want to see https://stackoverflow.com/questions/3707190/why-does-javas-arrays-sort-method-use-two-different-sorting-algorithms-for-diff – Jack Jun 14 '20 at 05:51
20

The answer is in Jon L. Bentley and M. Douglas McIlroy’s “Engineering a Sort Function”, which the sort function cites.

Shopping around for a better qsort, we found that a qsort written at Berkeley in 1983 would consume quadratic time on arrays that contain a few elements repeated many times—in particular arrays of random zeros and ones. In fact, among a dozen different Unix libraries we found no qsort that could not easily be driven to quadratic behavior; all were derived from the Seventh Edition or from the 1983 Berkeley function.…

Unable to find a good enough qsort, we set out to build a better one. The algorithm should avoid extreme slowdowns on reasonable inputs, and should be fast on ‘random’ inputs. It should also be efficient in data space and code space. The sort need not be stable; its specification does not promise to preserve the order of equal elements.

The alternatives were heapsort and mergesort, since Java was created in the early 1990s. Mergesort is less desirable because it requires extra storage space. Heapsort has a better worst-case performance (O(n log n) compared to O(n^2)), but performs more slowly in practice. Thus, if you can control the worst case performance via good heuristics, a tuned quicksort is the way to go.

Java 7 is switching to Timsort, which was invented in 1993 (implemented in Python in 2002) and has a worst-case performance of O(n log n) and is a stable sort.

Community
  • 1
  • 1
Josh Lee
  • 171,072
  • 38
  • 269
  • 275
14

Quicksort has O(n log n) average and O(n^2) worst case performance, that is the best "average case" a sort algorithm can be, there are other sort algorithms that have this performance, but quicksort tends to perform better than most.

See: http://en.wikipedia.org/wiki/Quicksort

Orbling
  • 20,413
  • 3
  • 53
  • 64
  • 14
    average case O(n log n) is _not_ the best performance a sort can have. worst case O(n log n) _is_ the best performance a _comparison-based_ sort can have. – lijie Nov 29 '10 at 15:18
  • 1
    That is only the best that comparison based sorts can be. You can sort faster if you can avoid comparison based sorting. – Nick Larsen Nov 29 '10 at 15:18
  • 1
    @lijie Sorry that was what I meant to say, not enough sleep LOL. I'll amend. – Orbling Nov 29 '10 at 15:23
  • It's okie, but the correction is not correct. quicksort does _not_ have O(nlogn) worst case performance; its worst case performance is O(n^2). And Arrays.sort is a comparison based sort, but when statements like "the best possible worst case sort performance is O(nlogn)" are made then it causes people to forget that there are other sorting paradigms available. – lijie Nov 29 '10 at 15:29
  • Well comparison sorts are the type usually used, a radix or bucket sort is a different kettle of fish. – Orbling Nov 29 '10 at 15:36
10

It is a tuned quicksort. If you are really interested you can read the material mentioned in the documentation.

The sorting algorithm is a tuned quicksort, adapted from Jon L. Bentley and M. Douglas McIlroy's "Engineering a Sort Function", Software-Practice and Experience, Vol. 23(11) P. 1249-1265 (November 1993).

And here is a bit of an explanation - the tuned version gives n*log(n) on many data sets:

This algorithm offers n*log(n) performance on many data sets that cause other quicksorts to degrade to quadratic performance

Bozho
  • 588,226
  • 146
  • 1,060
  • 1,140
3

Arrays.sort() uses multiple sorting algorithms depending on the size and elements in the array.

  • Insertion sort for small arrays
  • Merge sort for mostly sorted arrays
  • A highly tuned and adaptable dual-pivot & single pivot quicksort for everything else

So in practice we see that quicksort is very fast for large arrays of primitives but has some pitfalls when it needs to adapt to partially sorted arrays, when comparisons between objects are slow, for stable sorting and more.

David McManamon
  • 385
  • 2
  • 10
3

Compared with Quicksort, Mergesort has less number of comparisons but larger number of moving elements.

In Java, an element comparison is expensive but moving elements is cheap. Therefore, Mergesort is used in the standard Java library for generic sorting

In C++, copying objects can be expensive while comparing objects often is relatively cheap. Therefore, quicksort is the sorting routine commonly used in C++ libraries.

ref: http://www.cs.txstate.edu/~rp44/cs3358_092/Lectures/qsort.ppt

Cem
  • 504
  • 5
  • 4
2

Since its been a while since last answer on this thread, here is some updates...

It depend on the complexity and its relevancy to size of array plus probability when java researched these algorithms and just decided depending on Measurements and Benchmarks.

According to JAVA JDK 1.8 DOCS its self explanatory where it chooses the algorithm not only one but up to four to choose from according to some thresholds...

/**
     * If the length of an array to be sorted is less than this
     * constant, Quicksort is used in preference to merge sort.
     */
    private static final int QUICKSORT_THRESHOLD = 286;

    /**
     * If the length of an array to be sorted is less than this
     * constant, insertion sort is used in preference to Quicksort.
     */
    private static final int INSERTION_SORT_THRESHOLD = 47;

    /**
     * If the length of a byte array to be sorted is greater than this
     * constant, counting sort is used in preference to insertion sort.
     */
    private static final int COUNTING_SORT_THRESHOLD_FOR_BYTE = 29;

    /**
     * If the length of a short or char array to be sorted is greater
     * than this constant, counting sort is used in preference to Quicksort.
     */
    private static final int COUNTING_SORT_THRESHOLD_FOR_SHORT_OR_CHAR = 3200;

Reference Java DOC JDK 8

It event evolved to use parallel Sorting Sorting in Java

Java 8 comes with a new API – parallelSort – with a similar signature to the Arrays.sort() API:

@Test
public void givenIntArray_whenUsingParallelSort_thenArraySorted() {
    Arrays.parallelSort(toSort);

    assertTrue(Arrays.equals(toSort, sortedInts));
}

Behind the scenes of parallelSort(), it breaks the array into different sub-arrays (as per granularity in the algorithm of parallelSort). Each sub-array is sorted with Arrays.sort() in different threads so that sort can be executed in parallel fashion and are merged finally as a sorted array.

Note that the ForJoin common pool is used for executing these parallel tasks and then merging the results.

The result of the Arrays.parallelSort is going to be same as Array.sort of course, it’s just a matter of leveraging multi-threading.

Finally, there are similar variants of API Arrays.sort in Arrays.parallelSort as well:

Arrays.parallelSort (int [] a, int fromIndex, int toIndex);

Summary: So as the Java API evolves with the HardWare and software in general there is more use for the multi threading and tuning here and there on the thresholds and algorithims.

shareef
  • 9,255
  • 13
  • 58
  • 89
1

First of all Arrays.sort doesn't only use quick sort , It uses multiple algorithms java1.6 onwards

See below code from Arrays class

/** * Sorts the specified array into ascending numerical order. * *

Implementation note: The sorting algorithm is a Dual-Pivot Quicksort * by Vladimir Yaroslavskiy, Jon Bentley, and Joshua Bloch. This algorithm * offers O(n log(n)) performance on many data sets that cause other * quicksorts to degrade to quadratic performance, and is typically * faster than traditional (one-pivot) Quicksort implementations. * * @param a the array to be sorted */ public static void sort(int[] a) { DualPivotQuicksort.sort(a); }

DualPivotQuicksort.sort(a); // This uses 5 algorithms internally depending upon dataset size 
do checkout the source code of Arrays class.

Before java 1.6 I think it was using three algorithm quick sort for primitive types such as int and mergesort for objects and when quick sort out performs it start heap sort, See here for more details http://cafe.elharo.com/programming/java-programming/why-java-util-arrays-uses-two-sorting-algorithms

pyshcoguy
  • 11
  • 2
0

Quicksort is fastest on average O(n log(n)), so Sun probably used that as a good metric.

darioo
  • 46,442
  • 10
  • 75
  • 103
  • Lots of sort algorithms are O(n log n). – Paul Tomblin Nov 29 '10 at 15:11
  • @Paul: true, but since quicksort has been around for about 50 years, it's well understood and that might have been one reason for using it. – darioo Nov 29 '10 at 15:13
  • 1
    by that reasoning i'm sure insertion sort should be preferred (it probably has been around for more years than quicksort and even more well understood) – lijie Nov 29 '10 at 15:16
  • @lijie: except its complexity is `O(n^2)` on average. – darioo Nov 29 '10 at 15:20
  • yes that wasn't really a serious comment. It was just to point out that the reason given (it is old and understood) is probably not an important consideration at all... in fact among the more popularly known sorts, i think quicksort is relatively "recent"... – lijie Nov 29 '10 at 15:25
0

QuickSort is a common sorting algorithm. It's reasonably fast, except when the data to be sorted is already in inverse order. It's also efficient in space.

Paul Tomblin
  • 179,021
  • 58
  • 319
  • 408
  • 1
    Surely `Arrays.sort` copes with reverse order without going O(n^2). So to the extent that it "is quicksort" (the premise of the question), "quicksort" doesn't mean, "the dumbest quicksort you could possibly write" ;-) – Steve Jessop Nov 29 '10 at 15:15
  • Well, yeah, it's a "Tuned Quicksort", so it's not really QuickSort. – Paul Tomblin Nov 29 '10 at 15:16
  • 1
    there is no predefined worst case for quicksort, if one does not specify the pivot selection. so the claim about it performing worse when the data is reverse ordered is not right. – lijie Nov 29 '10 at 15:17
  • @lijie: that's just a name game, though - does "quicksort" mean, "the algorithm presented by Hoare", or does it mean, "any algorithm which looks a bit like the algorithm presented by Hoare"? I think Paul refers to this difference in his comment above, although it's incorrect to say "QuickSort is a common sorting algorithm" while also taking "QuickSort" to mean "original QuickSort with the pivot as the left-most element". – Steve Jessop Nov 29 '10 at 17:16
  • @Steve: well, yes i guess there are way too many variants. but the original one performs essentially as badly with a sorted list as well as a reverse-sorted list. but unless pivot selection is smart (e.g. the one used for a linear quickselect), avoiding O(n^2) is hard. – lijie Nov 30 '10 at 04:40
  • @lijie: yes, to avoid O(n^2) you need to ensure that the pivot has a minimum *fraction* of the elements either side of it. There's no way other than smart pivot selection to avoid quadratic worst case whilst sticking with the basic quicksort partitioning mechanism, although you can also make it unfeasibly unlikely with a good random pivot selection. – Steve Jessop Nov 30 '10 at 10:21
0

It depends on what you want to do. The problem with a normal quicksort is, that it can sometimes be in O(n²). So normaly you could use heap sort, but most times quick sort is faster.

However the Arrays.sort(...) implementation uses a "tuned tuned quicksort, adapted from Jon L. Bentley and M. Douglas McIlroy [...]" (according to the JavaDoc documentation). This algorithm has has some build in optimizations, that enables it to work on O(n*log(n)), where a normal quicksort would use O(n²).

Also the Arrays.sort algorithm is tested over and over again and you can be sure that it works and is bugfree (although this can't be guaranteed.)

iuiz

iuiz
  • 957
  • 1
  • 10
  • 23
  • The docs, say, "This algorithm offers n*log(n) performance on many data sets", not all data sets. Suggests that it isn't O(n log n). What's `b`? Introsort would give that guarantee, but wasn't invented until 1997, 4 years after the paper cited in the Java docs. – Steve Jessop Nov 29 '10 at 15:22
  • i think he wanted to write n * log ( n ) :) – mr. Vachovsky Nov 29 '10 at 18:31
0

Arrays.sort() does not use quick-sort. Java 7 used TimSort which is a combination of Merge Sort and Insertion Sort. Java 8 uses parallel-sort when there are more number of elements, and uses multiple threads for sorting. Else it uses TimSort.

Thus the worst case time complexity is always O(nlogn)

Jalaj Varshney
  • 131
  • 1
  • 9