110

I am using JDK-8 (x64). For Arrays.sort (primitives) I found the following in the Java documentation:

The sorting algorithm is a Dual-Pivot Quicksort by Vladimir Yaroslavskiy, Jon Bentley, and Joshua Bloch.`

For Collections.sort (objects) I found this "Timsort":

This implementation is a stable, adaptive, iterative mergesort ... This implementation dumps the specified list into an array, sorts the array, and iterates over the list resetting each element from the corresponding position in the array.

If Collections.sort uses an array, why doesn't it just call Arrays.sort or use dual-pivot QuickSort? Why use Mergesort?

rogerdpack
  • 62,887
  • 36
  • 269
  • 388
Quest Monger
  • 8,252
  • 11
  • 37
  • 43
  • 10
    That's the javadoc for arrays of primitives - arrays of Objects are sorted using meregsort. – assylias Sep 01 '15 at 14:36
  • 2
    mergesort gives u nlogn always while quicksort may sometime give nlogn2 geneally arrays size is not that big but collections easily goes upto millions of entries so taking a risk of nlogn2 is not worth P.S. nlogn2 i meant sqaure of n – Kumar Saurabh Sep 01 '15 at 14:37
  • O(n^2) for quicksort is extreme worst-case. In practice it is faster – James Wierzba Sep 01 '15 at 14:38
  • but u cant ignore those caese while making an api – Kumar Saurabh Sep 01 '15 at 14:38
  • Sure you can, when the average case performs better than it's counterparts: merge sort and heap sort (which it does). I guess if you are implementing some extremely time-critical system, it may not be good. It depends on the context I suppose – James Wierzba Sep 01 '15 at 14:40
  • @KumarSaurabh why should an array have less entries than a collecion? Both can have max int values... – Puce Sep 01 '15 at 14:41
  • i didnt say they must i said in general practice. – Kumar Saurabh Sep 01 '15 at 14:59
  • 2
    [This link](http://stackoverflow.com/questions/15154158/why-collections-sort-uses-merge-sort-instead-of-quicksort) is very related. – qartal Mar 20 '16 at 21:54

5 Answers5

124

The API guarantees a stable sorting which Quicksort doesn’t offer. However, when sorting primitive values by their natural order you won’t notice a difference as primitive values have no identity. Therefore, Quicksort can used for primitive arrays and will be used when it is considered more efficient¹.

For objects you may notice, when objects with different identity which are deemed equal according to their equals implementation or the provided Comparator change their order. Therefore, Quicksort is not an option. So a variant of MergeSort is used, the current Java versions use TimSort. This applies to both, Arrays.sort and Collections.sort, though with Java 8, the List itself may override the sort algorithms.


¹ The efficiency advantage of Quicksort is needing less memory when done in-place. But it has a dramatic worst case performance and can’t exploit runs of pre-sorted data in an array, which TimSort does.

Therefore, the sorting algorithms were reworked from version to version, while staying in the now-misleadingly named class DualPivotQuicksort. Also, the documentation didn’t catch up, which shows, that it is a bad idea in general, to name an internally used algorithm in a specification, when not necessary.

The current situation (including Java 8 to Java 11) is as follows:

  • Generally, the sorting methods for primitive arrays will use Quicksort only under certain circumstances. For larger arrays, they will try to identify runs of pre-sorted data first, like TimSort does, and will merge them when the number of runs does not exceed a certain threshold. Otherwise they will fall back to Quicksort, but with an implementation that will fall back to Insertion sort for small ranges, which does not only affect small arrays, but also quick sort’s recursion.
  • sort(char[],…) and sort(short[],…) add another special case, to use Counting sort for arrays whose length exceeds a certain threshold
  • Likewise, sort(byte[],…) will use Counting sort, but with a much smaller threshold, which creates the biggest contrast to the documentation, as sort(byte[],…) never uses Quicksort. It only uses Insertion sort for small arrays and Counting sort otherwise.
Holger
  • 285,553
  • 42
  • 434
  • 765
  • 1
    Hmm, interestingly the Collections.sort Javadoc states: "This sort is guaranteed to be stable", but since it delegates to List.sort, which can be overridden by list implementations, stable sorting cannot really be gauranteed by Collections.sort for all list implementations. Or do I miss something? And List.sort doesn't require the sorting alogirthm to be stable. – Puce Sep 01 '15 at 15:03
  • 11
    @Puce: that simply means that the responsibility for that guaranty now lies in the hands of those who implement the overriding `List.sort` method. `Collections.sort` could never guaranty correct working for every `List` implementation as it can’t guaranty, e.g. that the `List` doesn’t spuriously changes its contents. It all boils down to that the guaranty of `Collections.sort` only applies to correct `List` implementations (and correct `Comparator` or `equals` implementations). – Holger Sep 01 '15 at 15:14
  • 1
    @Puce: But you are right, the Javadoc isn’t equally explicit about this constraint in both methods But at least the most recent documentation states that `Collections.sort` will delegate to `List.sort`. – Holger Sep 01 '15 at 15:18
  • @Puce: there are tons of examples of this, where important properties are not part of the type but rather only mentioned in the documentation (and thus not checked by the compiler). Java's type system is simply too weak to express any interesting properties. (It's not much different from a dynamically typed language in this regard, there, too, properties are defined in the documentation and it is up to the programmer to make sure they are not violated.) It goes even further, actually: did you notice that `Collections.sort` doesn't even mention in its type signature that the output is sorted? – Jörg W Mittag Sep 01 '15 at 21:58
  • 1
    In a language with a more expressive type system, the return type of `Collections.sort` would be something like "a collection of the same type and length as the input with the properties that 1) every element present in the input is also present in the output, 2) for every pair of elements from the output, the left one is no greater than the right one, 3) for every pair of equal elements from the output, the left one's index in the input is smaller than the right one's" or something like that. – Jörg W Mittag Sep 01 '15 at 22:01
  • The most obvious examples of the weakness of Java's type system are `Serializable` and `Cloneable`, where the *entire* contract is *solely* in the documentation, the interfaces are completely identical (because they are both completely *empty*) and thus *should* mean the same thing, but don't. – Jörg W Mittag Sep 01 '15 at 22:03
  • @JörgWMittag My point is that "stable sorting algorithm" is **not** required by the contract of List.sort, but Collections.sort, which depends on List.sort, guarantees it!?? – Puce Sep 01 '15 at 23:10
  • @JörgWMittag I think it looks like a documentation bug, which got missed when they introduced the default sort implementation on the List interface. Either List.sort should state that a stable sorting algorithm is required or Collections.sort cannot generally guarantee stable sorting for any List implementation. – Puce Sep 01 '15 at 23:17
  • There are some who call this sort... Tim. – Etheryte Sep 02 '15 at 08:19
  • I have a vague memory that I've read about another reason for the two different algorithms which I'd like to add: For objects, where a compare method is used, compares are more expansive than swaps. For primitives they are equally expensive. Therefore merge sort -- which minimizes the number of compares -- is better for objects, while quick sort -- which minimizes the number of swaps -- is better for primitives. – Lii Sep 07 '15 at 07:31
20

I don't know about the documentation, but the implementation of java.util.Collections#sort in Java 8 (HotSpot) goes like this:

@SuppressWarnings({"unchecked", "rawtypes"})
public static <T> void sort(List<T> list, Comparator<? super T> c) {
    list.sort(c);
}

And List#sort has this implementation:

@SuppressWarnings({"unchecked", "rawtypes"})
default void sort(Comparator<? super E> c) {
    Object[] a = this.toArray();
    Arrays.sort(a, (Comparator) c);
    ListIterator<E> i = this.listIterator();
    for (Object e : a) {
        i.next();
        i.set((E) e);
    }
}

So, in the end, Collections#sort uses Arrays#sort (of object elements) behind the scenes. This implementation uses merge sort or tim sort.

Luiggi Mendoza
  • 85,076
  • 16
  • 154
  • 332
17

According to the Javadoc, only primitive arrays are sorted using Quicksort. Object arrays are sorted with a Mergesort as well.

So Collections.sort seems to use the same sorting algorithm as Arrays.sort for Objects.

Another question would be why a different sort algorithm is used for primitive arrays than for Object arrays?

Puce
  • 37,247
  • 13
  • 80
  • 152
2

As stated across many of the answers.

The Quicksort is used by Arrays.sort for sorting primitive collections because stability isn't required (you won't know or care if two identical ints were swapped in the sort)

MergeSort or more specifically Timsort is used by Arrays.sort for sorting collections of objects. Stability is required. Quicksort does not provide for stability, Timsort does.

Collections.sort delegates to Arrays.sort which is why you see the javadoc referencing the MergeSort.

cogitoboy
  • 114
  • 1
  • 4
1

Quick Sort has two major drawbacks when it comes to merge sort:

  • It's not stable while it comes to non primitive.
  • It doesn't guarantee n log n performance.

Stability is a non-issue for primitive types, as there is no notion of identity as distinct from (value) equality.

Stability is a big deal when sorting arbitrary objects. It's a nice side benefit that Merge Sort guarantees n log n (time) performance no matter what the input. That's why merge sort is selected to provide a stable sort (Merge Sort) to sort object references.

Krutik
  • 1,107
  • 13
  • 18