I have to use N(i=1..N) threads to sort an array of M numbers,each thread start to sort the array from position (N*i)%m. Could anyone help me?
-
If each thread only looks at one part of the array, it is impossible to properly sort the entire array. – jjnguy Oct 28 '10 at 20:34
-
"...each thread **start** to sort the array from..." – aioobe Oct 28 '10 at 20:39
-
@aioobe, I assume that means that the thread will start at `(N*i)%M` and end at `(N*i)%M + M/N)`. – jjnguy Oct 28 '10 at 20:40
-
Anyway... I'm sure the OP is interested in a completely sorted array. – aioobe Oct 28 '10 at 20:43
3 Answers
What you will want to do is use a divide and conquer sorting method like quick sort.
What you will want to do is partition the array, and then pass the two halves of the array off to another thread to do the processing.
Say you have the number:
11 43 24 56 12 65 90 12 53 23
In one thread, you will partition the numbers:
12 24 11 23 12 | 65 90 53 56 43
Then you can perform quick sort on each half of the array in a different thread.
Allow me to provide some code:
public void multiThreadSort(int threads, int[] arr, int start, int stop) {
if (threads > 1) {
int midpoint = partition(arr, start, stop);
new Thread(){public void run() {
multiThreadSort(threads - 1, arr, start, midpoint);
}}.start();
new Thread(){public void run() {
multiThreadSort(threads - 1, arr, midpoint, stop);
}}.start();
}
else
Arrays.sort(arr, start, stop);
}
public int partition(int[] arr, int start, int stop);
Then call it like so:
multiThreadSort(N, arr, 0, arr.length());

- 136,852
- 53
- 295
- 323
-
3I like that this answer actually answers the question, not try and tell the asker that its a bad idea (which for small cases, it is). – Anthony Oct 28 '10 at 20:05
-
@Shy, yeah, in small cases it's a bad idea. But for large datasets and educational purposes, it's a good idea. – jjnguy Oct 28 '10 at 20:08
-
Hmm.. What do you do when you have no more threads to "delegate" to? Do you start sorting the remaining chunks with a single-threaded algorithm? – aioobe Oct 28 '10 at 20:15
-
@aioobe, yup. You just sort the remaining chunks on that thread. And, because everything is already partitioned, you don't have to merge. – jjnguy Oct 28 '10 at 20:17
-
While this would indeed do a threaded sort of the array, it neither limits the sort to N threads, nor have threads start to sort at position (N*i)%m. The former could be fixed by utilize a fixedThreadPool(N), the latter cannot be fixed using quicksort. – Luke Hutteman Oct 28 '10 at 20:19
-
To use the specified partitioning, you would need to use mergesort instead, starting N threads to sort their individual partitions, and then merging the results of these sorts. – Luke Hutteman Oct 28 '10 at 20:20
-
Ah, ok, I see. Thats nice. But it utilizes the threads poorly during the partitioning phase I suppose? Since only one thread would be working... but perhaps it pays off in the end. – aioobe Oct 28 '10 at 20:21
-
@Luke, on an input of size N^2 you can keep spawning threads, until you have created N seperate threads, and then you will just do a single threaded sort. This will end up making each thread sort starting from some position `(N*i)%m`. – jjnguy Oct 28 '10 at 20:22
-
@jinguy yes, but this is a very specific case as far as the input-size is concerned. Also, partitioning in quicksort does not typically happen at the exact mid-point (which causes quicksort to have a worst-case performance of O(N^2)) – Luke Hutteman Oct 28 '10 at 21:04
-
@Luke, you have convinced me that my answer is perhaps not the perfect solution the the OPs problem. But, I will hold that is is the best sorting approach for multiple threads. – jjnguy Oct 28 '10 at 21:06
-
agreed; quicksort in general is faster than mergesort, and also has the additional advantage that it can be done in-place. While it does not fit the OP's problem perfectly, it would certainly be the approach I would take as well to thread a sorting algorithm. – Luke Hutteman Oct 29 '10 at 14:33
-
Noob here. i do not understand how partition(int[] arr, int start, int stop) method actually works. in given example all elements in first partition is less then the any element's value in second partition. how does that sorting happens? – Ankit Jan 09 '16 at 05:00
You could let each thread sort its own portion of the array using Arrays.sort(arr, fromIndex, toIndex)
and after all threads are done, simply use a merge sort to merge the different results. (Even this step could be parallelized by merging multiple portions at a time.)
Here is a related question / good answer.
-
If you are multi-threading for performance, performing full sorts on each part, and then merging them isn't the best option. – jjnguy Oct 28 '10 at 19:58
-
Well, there is a better option than fully sorting and then merging. A distributed quick sort will perform better. (on large data sets) – jjnguy Oct 28 '10 at 20:11
-
1
-
But, if each thread can only look at one small part of the array, it is impossible to properly sort the entire thing. – jjnguy Oct 28 '10 at 20:36
-
@jinguy: mergesort uses a divide-and-conquer approach just like quicksort; the difference is that were quicksort uses an O(N) step to partition the array prior to spawning the left- and right- sorts, mergesort sorts both first and then uses an O(N) step to merge the sorted arrays. – Luke Hutteman Oct 28 '10 at 21:01
-
@Luke, I understand that. What I'm not sure about in general is: How is the array going to be fully sorted if each thread can only look at a part of the array? – jjnguy Oct 28 '10 at 21:05
Another question is why do you want to do this? Thread creation and destruction is rather heavy, sorting (if you can keep your set in-memory) is rather quick. Doing this make sense only if you have the threads pre-existing in a thread pool for some other reason, if the time to sort M is much greater than the time to create (and probably to destroy) a thread, or if this is an academic exercise to learn about thread manipulation (in which case you shouldn't be asking here). If the time to sort M is greater than the time to create a thread you are probably going to have part of your array in virtual memory and disk paging effects will dominate your performance.
Threads are very useful, even essential, for some algorithms. Sorting is not a good application. Except, as mentioned, as a learning exercise to get experience writing software where your threads play well together.

- 738
- 3
- 8
-
2
-
"Thread creation and destruction is rather heavy" you have a reference on that? I believe it is synchronization between threads that is expensive. – aioobe Oct 28 '10 at 20:03
-
@aioobe, if you don't need to synchronize the threads, this isn't a problem. – jjnguy Oct 28 '10 at 20:12
-
Creating a thread requires creation of a Thread object (which requires memory allocation and cycles to check its liveness), creation of an underlying OS thread (impact varies by OS), and adding the thread into the scheduler. None of these processor cycles are required when using a reasonable data structure to implement an in-process quicksort. And @jjnguy, I did say that large arrays could benefit when the time to sort is larger than the time to create a thread. Until you come up with some actual benchmark data I will stand by my assertion that multithreading and sorting don't go together. – verisimilidude Oct 28 '10 at 20:41