1

Most examples I find of Merge Sort run in a single thread. This defeats some of the advantage of using the Merge Sort algorithm in the first place. Can someone show the proper way to write a Merge Sort algorithm in java using multi-threading.

The solution should use features from the latest version of java as applicable. Many Solutions already on Stackoverflow use plain Threads. I'm looking for a solution demonstrating a ForkJoin with a RecursiveTask, which would seem to be the primary use case the RecursiveTask class was intended for.

Emphasis should be on showing an algorithm with superior performance characteristics including both time and space complexity where possible.

NOTE: Neither of the proposed duplicate questions apply since neither provide a solution using Recursive Task which was specifically what this question was asking for.

  • 1
    Here is an answer using RecursiveAction (if you cannot use Java 8): https://stackoverflow.com/a/2210249/14955 – Thilo Apr 26 '18 at 06:40
  • @Thilo Thanks, I wanted a demonstration of RecursiveTask but that link may be helpful to others. – Jeffrey Phillips Freeman Apr 26 '18 at 06:41
  • 3
    About this statement in the question: "*This defeats the advantage of using the Merge Sort algorithm in the first place*" This is not the case at all. The "purpose" of merge sort is not to be a multi-threaded algorithm. It is a great sorting algorithm also in a single thread environment. – Lii Apr 26 '18 at 07:05
  • 1
    @Lii Thanks edited the question to correct for that. – Jeffrey Phillips Freeman Apr 26 '18 at 07:21
  • @mark-rotteveel this is not a duplicate, those links do not satisfy the criteria posted in the question. – Jeffrey Phillips Freeman Apr 26 '18 at 18:12
  • @Thilo Some help removing the duplicate flag maybe? Neither of those links provide a RecursiveTask solution as the question requested. – Jeffrey Phillips Freeman Apr 26 '18 at 18:25
  • @JeffreyPhillipsFreeman You yourself voted to close those other questions as duplicate of your own, so in your opinion they were duplicates of your question. I only reversed the direction of these votes to the older ones. You cannot have your cake and it eat it too... – Mark Rotteveel Apr 27 '18 at 09:14
  • @MarkRotteveel Don't be so dramatic. As I explained the duplicate flag only made sense in one direction. This question is more specific than the other. You are being absurd. – Jeffrey Phillips Freeman Apr 27 '18 at 20:41
  • I'm not the one here being absurd. If the other questions were duplicates of yours, then so is yours a duplicate of them. If that isn't the case, then you shouldn't have voted to close them as duplicate. I have just taken that self-admitted duplication, and favored the more established questions as target, as that is the common rule when applying duplicate votes. If you now change the argument that your question is "more specific", then you shouldn't have voted to close as duplicate those more general questions. – Mark Rotteveel Apr 27 '18 at 20:46
  • @MarkRotteveel Obviously I'm new here and it was the first time I was using the flag system. My understanding, and perhaps incorrect, is that it is a one-way flag indicating that some question supplies an answer to the current one. If it is not the appropriate flag when one question is simply more specific than the other, so be it, I therefore made the wrong choice. The point of voting is I would have expected others with more expiernce to catch that rather than to be on some high-horse about "teaching a lesson". It all just seems like a very childish response. – Jeffrey Phillips Freeman Apr 27 '18 at 21:16
  • I wouldn't call a 2+ years old account new, but ok. If you want to link between questions, then consider posting a comment to that effect, don't vote to close as duplicate unless you really consider them duplicates. In any case, you might want to consider flagging your answer for moderator attention and have it moved to [Multithreaded quicksort or mergesort](https://stackoverflow.com/questions/2210185/multithreaded-quicksort-or-mergesort) (which already has an answer that uses fork/join + a recursive action). – Mark Rotteveel Apr 27 '18 at 21:24
  • @MarkRotteveel for those 2 years my account was mostly idle. But your right a comment may have been better. I figured a process of voting would have figured that out for itself. I dont particularly care enough to bring in a moderator. Just seems very childish on your part. – Jeffrey Phillips Freeman Apr 27 '18 at 22:16

1 Answers1

4

The most convenient multi-threading paradigm for a Merge Sort is the fork-join paradigm. This is provided from Java 8 and later. The following code demonstrates a Merge Sort using a fork-join.

import java.util.*;
import java.util.concurrent.*;

public class MergeSort<N extends Comparable<N>> extends RecursiveTask<List<N>> {
    private List<N> elements;

    public MergeSort(List<N> elements) {
        this.elements = new ArrayList<>(elements);
    }

    @Override
    protected List<N> compute() {
        if(this.elements.size() <= 1)
            return this.elements;
        else {
            final int pivot = this.elements.size() / 2;
            MergeSort<N> leftTask = new MergeSort<N>(this.elements.subList(0, pivot));
            MergeSort<N> rightTask = new MergeSort<N>(this.elements.subList(pivot, this.elements.size()));

            leftTask.fork();
            rightTask.fork();

            List<N> left = leftTask.join();
            List<N> right = rightTask.join();

            return merge(left, right);
        }
    }

    private List<N> merge(List<N> left, List<N> right) {
        List<N> sorted = new ArrayList<>();
        while(!left.isEmpty() || !right.isEmpty()) {
            if(left.isEmpty())
                sorted.add(right.remove(0));
            else if(right.isEmpty())
                sorted.add(left.remove(0));
            else {
                if( left.get(0).compareTo(right.get(0)) < 0 )
                    sorted.add(left.remove(0));
                else
                    sorted.add(right.remove(0));
            }
        }

        return sorted;
    }

    public static void main(String[] args) {
        ForkJoinPool forkJoinPool = ForkJoinPool.commonPool();
        List<Integer> result = forkJoinPool.invoke(new MergeSort<Integer>(Arrays.asList(7,2,9,10,1)));
        System.out.println("result: " + result);
    }
}

While much less straight forward the following varient of the code eliminates the excessive copying of the ArrayList. The initial unsorted list is only created once, and the calls to sublist do not need to perform any copying themselves. Before we would copy the array list each time the algorithm forked. Also, now, when merging lists instead of creating a new list and copying values in it each time we reuse the left list and insert our values into there. By avoiding the extra copy step we improve performance. We use a LinkedList here because inserts are rather cheap compared to an ArrayList. We also eliminate the call to remove, which can be expensive on an ArrayList as well.

import java.util.*;
import java.util.concurrent.*;

public class MergeSort<N extends Comparable<N>> extends RecursiveTask<List<N>> {
    private List<N> elements;

    public MergeSort(List<N> elements) {
        this.elements = elements;
    }

    @Override
    protected List<N> compute() {
        if(this.elements.size() <= 1)
            return new LinkedList<>(this.elements);
        else {
            final int pivot = this.elements.size() / 2;
            MergeSort<N> leftTask = new MergeSort<N>(this.elements.subList(0, pivot));
            MergeSort<N> rightTask = new MergeSort<N>(this.elements.subList(pivot, this.elements.size()));

            leftTask.fork();
            rightTask.fork();

            List<N> left = leftTask.join();
            List<N> right = rightTask.join();

            return merge(left, right);
        }
    }

    private List<N> merge(List<N> left, List<N> right) {
        int leftIndex = 0;
        int rightIndex = 0;
        while(leftIndex < left.size() || rightIndex < right.size()) {
            if(leftIndex >= left.size())
                left.add(leftIndex++, right.get(rightIndex++));
            else if(rightIndex >= right.size())
                return left;
            else {
                if( left.get(leftIndex).compareTo(right.get(rightIndex)) < 0 )
                    leftIndex++;
                else
                    left.add(leftIndex++, right.get(rightIndex++));
            }
        }

        return left;
    }

    public static void main(String[] args) {
        ForkJoinPool forkJoinPool = ForkJoinPool.commonPool();
        List<Integer> result = forkJoinPool.invoke(new MergeSort<Integer>(Arrays.asList(7,2,9,-7,777777,10,1)));
        System.out.println("result: " + result);
    }
}

We can also improve the code one step further by using iterators instead of calling get directly when performing the merge. The reason for this is that get on a LinkedList by index has poor time performance (linear) so by using an iterator we eliminate slow-down caused by internally iterating the linked list on each get. The call to next on an iterator is constant time as opposed to linear time for the call to get. The following code is modified to use iterators instead.

import java.util.*;
import java.util.concurrent.*;

public class MergeSort<N extends Comparable<N>> extends RecursiveTask<List<N>> {
    private List<N> elements;

    public MergeSort(List<N> elements) {
        this.elements = elements;
    }

    @Override
    protected List<N> compute() {
        if(this.elements.size() <= 1)
            return new LinkedList<>(this.elements);
        else {
            final int pivot = this.elements.size() / 2;
            MergeSort<N> leftTask = new MergeSort<N>(this.elements.subList(0, pivot));
            MergeSort<N> rightTask = new MergeSort<N>(this.elements.subList(pivot, this.elements.size()));

            leftTask.fork();
            rightTask.fork();

            List<N> left = leftTask.join();
            List<N> right = rightTask.join();

            return merge(left, right);
        }
    }

    private List<N> merge(List<N> left, List<N> right) {
        ListIterator<N> leftIter = left.listIterator();
        ListIterator<N> rightIter = right.listIterator();
        while(leftIter.hasNext() || rightIter.hasNext()) {
            if(!leftIter.hasNext()) {
                leftIter.add(rightIter.next());
                rightIter.remove();
            }
            else if(!rightIter.hasNext())
                return left;
            else {
                N rightElement = rightIter.next();
                if( leftIter.next().compareTo(rightElement) < 0 )
                    rightIter.previous();
                else {
                    leftIter.previous();
                    leftIter.add(rightElement);
                }
            }
        }

        return left;
    }

    public static void main(String[] args) {
        ForkJoinPool forkJoinPool = ForkJoinPool.commonPool();
        List<Integer> result = forkJoinPool.invoke(new MergeSort<Integer>(Arrays.asList(7,2,9,-7,777777,10,1)));
        System.out.println("result: " + result);
    }
}

Finally the most complex versions of the code, this iteration uses an entirely in-place operation. Only the initial ArrayList is created and no additional collections are ever created. As such the logic is particularly difficult to follow (so i saved it for last). But should be as close to an ideal implementation as we can get.

import java.util.*;
import java.util.concurrent.*;

public class MergeSort<N extends Comparable<N>> extends RecursiveTask<List<N>> {
    private List<N> elements;

    public MergeSort(List<N> elements) {
        this.elements = elements;
    }

    @Override
    protected List<N> compute() {
        if(this.elements.size() <= 1)
            return this.elements;
        else {
            final int pivot = this.elements.size() / 2;
            MergeSort<N> leftTask = new MergeSort<N>(this.elements.subList(0, pivot));
            MergeSort<N> rightTask = new MergeSort<N>(this.elements.subList(pivot, this.elements.size()));

            leftTask.fork();
            rightTask.fork();

            List<N> left = leftTask.join();
            List<N> right = rightTask.join();

            merge(left, right);
            return this.elements;
        }
    }

    private void merge(List<N> left, List<N> right) {
        int leftIndex = 0;
        int rightIndex = 0;
        while(leftIndex < left.size() ) {
            if(rightIndex == 0) {
                if( left.get(leftIndex).compareTo(right.get(rightIndex)) > 0 ) {
                    swap(left, leftIndex++, right, rightIndex++);
                } else {
                    leftIndex++;
                }
            } else {
                if(rightIndex >= right.size()) {
                    if(right.get(0).compareTo(left.get(left.size() - 1)) < 0 )
                        merge(left, right);
                    else
                        return;
                }
                else if( right.get(0).compareTo(right.get(rightIndex)) < 0 ) {
                    swap(left, leftIndex++, right, 0);
                } else {
                    swap(left, leftIndex++, right, rightIndex++);
                }
            }
        }

        if(rightIndex < right.size() && rightIndex != 0)
            merge(right.subList(0, rightIndex), right.subList(rightIndex, right.size()));
    }

    private void swap(List<N> left, int leftIndex, List<N> right, int rightIndex) {
        //N leftElement = left.get(leftIndex);
        left.set(leftIndex, right.set(rightIndex, left.get(leftIndex)));
    }

    public static void main(String[] args) {
        ForkJoinPool forkJoinPool = ForkJoinPool.commonPool();
        List<Integer> result = forkJoinPool.invoke(new MergeSort<Integer>(new ArrayList<>(Arrays.asList(5,9,8,7,6,1,2,3,4))));
        System.out.println("result: " + result);
    }
}