0

I'm trying to implement the merge sorting using only the wait/notify synchronization. I'm aware of more high-level constructions like Fork/Join, Executors. etc. But I need to use work/notify here. Based on this https://courses.cs.washington.edu/courses/cse373/13wi/lectures/03-13/ I refactored the method parallelMergeSort() with synchronized blocks:

public void parallelMergeSort() {
  synchronized (values) {
     if (threadCount <= 1) {
        mergeSort(values);
        values.notify();
     } else if (values.length >= 2) {
        // split array in half
       int[] left = Arrays.copyOfRange(values, 0, values.length / 2);
       int[] right = Arrays.copyOfRange(values, values.length / 2, values.length);
       synchronized(left) {
         synchronized (right) {
            // sort the halves
            // mergeSort(left);
            // mergeSort(right);
           Thread lThread = new Thread(new ParallelMergeSort(left, threadCount / 2));
           Thread rThread = new Thread(new ParallelMergeSort(right, threadCount / 2));
           lThread.start();
           rThread.start();
           /*try {
             lThread.join();
             rThread.join();
           } catch (InterruptedException ie) {}*/
           try {
             left.wait();
             right.wait();
           } catch (InterruptedException e) {
             e.printStackTrace();
           }
           // merge them back together
           merge(left, right, values);
        }
      }
      values.notify();
    }
  }
}

values is an input array here.

As a result I see the performance of sorting is down and it's slower even than a single-thread sorting. I guess that a bottleneck is within two synchronization blocks of left and right parts of an array. Does someone know how to refactor it in order to make it faster than the single-thread sorting?

kolya_metallist
  • 589
  • 9
  • 20
  • Why are you synchronizing of the left and right Array? The idea is that the Threads each sort another part of the Array so there should be no need for synchronizing. – Frank Andres Apr 07 '15 at 13:16
  • @Fank but I need to wait these threads before merging right and left parts, and wait() should be within synchronization section. – kolya_metallist Apr 07 '15 at 13:22
  • @kolya Well I don't think wait() is particularly useful here. Wait is used when 1 thread wants to access a resource that might be locked by another thread parallel algorithms are trying to avoid this by working on different parts of the data simultaniously. What you want is probably join. Wait/notify mechanism is useful when needing a Mutex or Semaphor like for a supplier/consumer problem. – Frank Andres Apr 07 '15 at 13:34
  • 2
    Your use of `left.wait()` and `right.wait()` is wrong. `o.wait()` and `o.notify()` are primitive operations that are meant to be used in a very specific way. If you don't use them as intended, then you program will be vulnerable to the so-called "lost notification" bug: http://stackoverflow.com/questions/27554127/avoid-waiting-for-a-terminated-thread/27556424#27556424 – Solomon Slow Apr 07 '15 at 13:39
  • @james large @Frank - yes, I know that `wait()` and `notify()` are primitive operations and it's better to use `join()` here as it is implemented in the original source. I just wanted to clarify if there is an opportunity to solve this task with these primitives since the search by "mergesort" & "wait/notify" doesn't return relevant results. – kolya_metallist Apr 07 '15 at 13:51
  • You misunderstand what I am saying. Multithreading, by definition, is unpredictable. When your algorithm calls `lThread.start()`, it is _possible_ for the new thread to finish its work and call `values.notify()` before the parent thread calls `left.wait()`. If that happens, then the parent thread will wait forever because it missed the notification. Possible, but not guaranteed. You could test your program a thousand times on your own PC, and it could work every time, and then you give it to your instructor, and when your instructor runs it on a different PC it could hang. – Solomon Slow Apr 07 '15 at 14:11
  • I'm not sure what your instructor is trying to teach, but (s)he is teaching you _bad_ software engineering by asking you to use `wait()` and `notify()` to solve this problem. The problem calls for a higher level of abstraction. If I were given the assignment, I would use `wait()` and `notify()` to implement a higher-level synchronization class---maybe implement my own `Semaphore`---and then I would use that class to solve the problem. – Solomon Slow Apr 07 '15 at 14:18
  • @james large. thanks, now it's clear what you meant under the "lost notification". So, we can obtain more serious problems here than slower execution time. – kolya_metallist Apr 07 '15 at 15:13

2 Answers2

1

The problem lies in your nested synchronized blocks:

synchronized(left) {
   synchronized (right) {
       Thread lThread = new Thread(…);
       Thread rThread = new Thread(…);
       lThread.start();
       rThread.start();
       try {
         left.wait();
         right.wait();
       }
       …

You are holding both locks when you start the new threads which in turn try to acquire these locks. Therefore your new threads are blocked until the initiating thread releases these locks. This happens implicitly when the thread calls wait() but … you can only wait for one condition at a time!

So when the initiating thread calls left.wait(), it releases the lock of the left instance and the sub-thread spawned for processing the left part can proceed but the initiating thread still holds the right lock while waiting on left. Once the sub-thread has finished processing left it will call notify, followed by releasing the lock of left which allows wait() to re-acquire it and return.

Then the initiating thread can call right.wait() which will release the right lock and allows the other sub-thread to start its work, hence the equal to sequential performance. For each spawning of sub-threads the sub-threads are enforced to run one after another due to the locks held by the initiating thread.

One way to fix this would be launching the threads first and acquiring the locks afterwards and only the one lock you are about to wait rather then nesting the synchronized blocks. This is still subject to the unspecified timing (now, a sub-thread may have finished its work and called notify even before you enter the synchronized(x) { x.wait(); } block) and the so-called spurious wakeups. Simply said, you need a verifiable condition which is checked before and after calling wait() as explained in the documentation of wait():

As in the one argument version, interrupts and spurious wakeups are possible, and this method should always be used in a loop:

synchronized (obj) {
    while (<condition does not hold>)
        obj.wait();
    ... // Perform action appropriate to condition
}

The condition might be a boolean flag set to true by the sub-thread right before calling notify() to signal that the work is done.

Note that this all is what you get for free when using Thread.join(). The synchronization happens within the join() method and these two invocations cannot overlap. Further, the implementation uses a verifiable condition (the alive state of the thread) to ensure to call wait() only when necessary and to protect itself against “spurious wakeups”.

Holger
  • 285,553
  • 42
  • 434
  • 765
  • I've changed the sync blocks in the following way: `synchronized (lThread) { while (true) { if (!(lThread.isAlive())) break; lThread.wait(0L); } } synchronized (rThread) { while (true) { if (!(rThread.isAlive())) break; rThread.wait(0L); } }` Added `synchronized` to the whole method and changed notification to `this.notify()` – kolya_metallist Apr 09 '15 at 15:44
  • Is `this` the `Thread` instance you are waiting on? In your question you use delegation. By the way, `while(true) { if(!x) break; … }` can always be replaced with `while(x) { … }` – Holger Apr 10 '15 at 08:23
  • yes, thread. I just looked at the source code of `join()` and reused JDK logic of synchronization. – kolya_metallist Apr 10 '15 at 12:18
0

You will need to sort millions of values to see the effect of the parallelism, if you ever do, because you are copying arrays all over the place and that puts most of the strain on the system on memory access and garbage collection, not on the sorting.

To properly parallelise sorting you would need to do it in-place - which makes a merge sort very unlikely to be a good candidate because it must create a new array for the target.

If all you are doing is experimenting then use a compare/compute intensive algorithm like bubble-sort.

Note that if this has been set as an assignment then your lecturer is probably expecting you to respond with you can't because merge-sort is a bad candidate for parallelism.

OldCurmudgeon
  • 64,482
  • 16
  • 119
  • 213
  • I've tried up to 32m of values like they did there https://courses.cs.washington.edu/courses/cse373/13wi/lectures/03-13/MergeSort.java And it worked good as expected before my changes with wait/notify. – kolya_metallist Apr 07 '15 at 13:35
  • @kolya_metallist - Also remember to limit the number of threads to just below the number of cores you are running on. – OldCurmudgeon Apr 07 '15 at 13:39
  • I'm using `Runtime.getRuntime().availableProcessors();` It returns the 4, I also used 8, It showed the same result. – kolya_metallist Apr 07 '15 at 13:43
  • @kolya_metallist - Exactly - it is very unlikely that you will see any advantage to parallelism with merge-sort because most of the time you are spending copying arrays, the actual sorting/comparing is only a tiny fraction of the overall processing. Merge-sort was designed for sorting huge amounts of data from/to disk (actually tapes at the time but you may not be old enough to know what they are). – OldCurmudgeon Apr 07 '15 at 13:46
  • @kolya_metallist - Take a look [here](http://stackoverflow.com/a/15657134/823393) for a discussion of how you could implement an in-place merge-sort. This would greatly benefit from a parallel approach because it does not move memory around (other than swapping). – OldCurmudgeon Apr 07 '15 at 14:08
  • @OldCurmudgeon Do you think that it'd be more efficient than pre-allocating a working area in advance (see the end of the answer you linked)? Why? IMHO swapping means more memory accesses then just merging into the second array. – maaartinus Apr 07 '15 at 20:24