9

Java 8 introduces a parallel algorithm for multi-threaded sorting of arrays, in the form of the overloaded Arrays.sort() methods.

Why does it not also provide a Collections.parallelSort(), for multi-threaded sorting of a List?

chiastic-security
  • 20,430
  • 4
  • 39
  • 67
  • 4
    `list.parallelStream().sorted().collect(...)`? – Oleg Estekhin Dec 23 '14 at 18:40
  • The nature of `List`'s allows them to be sorted differently: It is easy to cut and insert elements or whole ascending "runs" from anywhere to anywhere within a list. Insertion or deletion in an array would require moving over every element after it. So in parallel sort for arrays, elements are sorted in their local cluster and then merged globally, but there is no reason to constrain a `List` sort to the same mechanism. – Iwillnotexist Idonotexist Dec 23 '14 at 18:54
  • 5
    Actually this isn't a duplicate; it's a rationale question, not a how-to question. Arrays admit easy parallel sorting because they can be easily partitioned into chunks that threads can work on independently with known performance characteristics for data access, so we know that in-place parallel sorting is actually practical. For List, we have no such guarantees that in-place parallel sorting will be practical. – Brian Goetz Dec 23 '14 at 20:45
  • @BrianGoetz yes, I agree. I have voted to reopen. – chiastic-security Dec 23 '14 at 20:49
  • @BrianGoetz I had considered this but discounted it. For arrays, yes, partitioning is O(1); But for lists this is still O(n), much less than O(n log n). And just as `Arrays.parallelSort()` single-threads below a size threshold, so could a `List` version. – Iwillnotexist Idonotexist Dec 23 '14 at 22:10
  • @BrianGoetz: Isn't that what the `RandomAccess` marker interface is for? Or do you need more than it promises for efficient sorting? – Jeffrey Bosboom Dec 28 '14 at 06:03
  • 1
    @JeffreyBosboom For data structures other than arrays, in-place sorting is fraught with so many "ifs" and "maybes" that it would be hard to ensure good parallelism. For arbitrary lists, you can use `list.stream().sorted().collect(...)`. I understand how it would be attractive to have a magic "parallel sort for lists" but on the other hand no one would be happy with an implementation that gets a poor parallel speedup. – Brian Goetz Dec 28 '14 at 06:10

1 Answers1

3

A List does not necessarily allow efficient implementation of the same parallel sorting algorithms that an array does. You may be able to directly apply it to an ArrayList, but most likely not to a LinkedList, due to its lack of efficient random access. There are efficient multi-threaded sorting algorithms for that kind of list, but they are different from a random-access list.

And, in fact, the thread-safe implementation of the List interface may not support efficient external multi-threaded sorting at all, due to synchronization. Providing a generic sorting algorithm for those would be impossible, and in fact a parallel algorithm might be slower on them than a sequential algorithm.

Wormbo
  • 4,978
  • 2
  • 21
  • 41
  • Since List is an interface, and LinkedList a child of it can not support parallel sorting due to this answer, while ArrayList can support, and due this issue List does not have parallel sorting without using stream. And so we need to do list.parallelStream().storted().collect(Collectors.toList()); – Shridutt Kothari Feb 21 '20 at 11:49