4

I am multiplying two matrices using two threads (however, the program is written to scale up as well, so I could possibly use three, four, etc threads instead). Each thread calculates/does the work for one row (or column) of the final matrix. If one thread is doing work on a row, the other one(s) should not work on that row. It/they should move on to the next available row.

First of all, I am not certain if the way I implemented the problem is correct. If you can see a better way, please let me know.

Secondly, the way I have done it, every time I test it (with different size matrices--even huge ones), only one thread does the work. That is, each time, the same thread is getting access to the synchronized block of the run() method. The other threads are entering the run() method, but why is only one thread always gaining the lock and doing all of the work?

This is my run method:

 public void run() {
    System.out.println(Thread.currentThread().getName());
    while (i < number of columns in final matrix) {
        synchronized (this) {
            if (i < number of columns in final matrix) {
                for (int j = 0; j < Main.B[0].length; j++) { 
                    for (int k = 0; k < Main.A[0].length; k++) { 
                        Main.C[i][j] += Main.A[i][k] * Main.B[k][j];
                    }
                }
                i++;
            }
        }
    }
} 

This is the code in my driver class that creates the threads and starts the program:

MyRunnable r = new MyRunnable();
Thread thread1 = new Thread(r);
Thread thread2 = new Thread(r);
thread1.start();
thread2.start();

try {
    thread1.join();
    thread2.join();
    } catch (InterruptedException ie) {
        System.out.println("\nThe following error occurred: " + ie);
        }
    }

I guess my question is two-fold--is my approach correct for the problem at hand? If so, (and if not), why is one thread always grabbing the lock and doing all of the work? I have checked the program with up to 6 threads on 20x20 matrices and always only one thread is doing the work.

whistler
  • 876
  • 2
  • 15
  • 31
  • I have always done that: started threads and then immediately waited for them to finish their work. This is the way I have seen it done in textbooks...do you have a suggestion otherwise? – whistler May 05 '12 at 05:55
  • Regarding the comment on four separate locks, it should not be that way. Only one instance of MyRunnable was created (r). I passed each thread that single instance in the thread constructor, so there should only be one lock for all of the threads. If this is not correct, please let me know. – whistler May 05 '12 at 05:56
  • 1
    @whistler you are right re. the locks. The problem is you lock the whole region with `synchronized(this) {`. This means only one thread at a time can enter the code which also means only one thread at a time can calculate – Ulrich Dangel May 05 '12 at 05:58
  • I'm no Java expert either :) but I'm pretty certain (99%) that is not the case with join(). – whistler May 05 '12 at 06:00
  • @mru Thanks. But, all of the calculations are not in the synchronized block. I'm only locking the calculation of ONE row of the final matrix (those two for loops are calculating one row of the product). When one of the threads grabs the lock and finishes the calculation in that block, there are still more rows to calculate. However, the same thread keeps grabbing the lock. – whistler May 05 '12 at 06:03
  • you're right. comments deleted. sorry. – Hovercraft Full Of Eels May 05 '12 at 06:04
  • @mru--could you offer a suggestion of how to change the code to allow the other threads to calculate too? – whistler May 05 '12 at 06:04
  • 3
    One (long) suggestion: (1) If this is going to be a common operation in your application then use some type of Executor in order to avoid the overhead of creating new threads; (2) Use a Callable instead of a Runnable and have the callable take in a row and column and then return the result for that operation; (3) Submit all of the Callables needed for the operation to the Executor; (4) Have the main thread combine the results of each callable into the final result. – lifelongcoug May 05 '12 at 06:14
  • 'This is the way I have seen it done in textbooks...do you have a suggestion otherwise?' Yes - time for a bonfire. I'm not usually a supporter of book-burning but, this time, I'll make an exception. All 'textbooks' that start the threading chapter my mentioning 'join' in the first paragraph should be burned and their authors forced to write javascript for the remainder of their miserable lives. – Martin James May 05 '12 at 10:15

5 Answers5

5

As some of the comments suggested, the problem is in the locking (i.e. the synchronized(this) part). Synchronizing is done on this which, in your case, a single instance of MyRunnable, so while one thread is doing the work inside the synchronized block, all other threads will wait until the work is finished. So effectively, only one thread is doing real work at a time.

Here's how to solve the problem. Since you need your threads to work on different rows in parallel, then this work must not be synchronized by a lock (because locking means the opposite: only one thread can do the work at a time). What you do need to synchronize is the part where each thread decides which row it will work on.

Here's a sample pseudo code:

public void run(){
  int workRow;
  synchronized(this){
    workRow = findNextUnprosessedRow();
  }
  for(int i=0; i<matrix[workRow].length; i++){
    //do the work
  }
}

Note that the actual work is intentionally not synchronized, for reasons given above.

The way you are using threads is correct, so there is not problem with that, however, I would suggest you have a look at Java's concurrency API: Thread Pools. Here's an example of how to use it in your context:

//Creates a pool of 5 concurrent thread workers
ExecutorService es = Executores.newFixedThreadPool(5);

//List of results for each row computation task
List<Future<Void>> results = new ArrayList<Future<Void>>();
try{
  for(int row=0; row<matrix.length; row++){
    final int workRow = row;

    //The main part. You can submit Callable or Runnable
    // tasks to the ExecutorService, and it will run them
    // for you in the number of threads you have allocated.
    // If you put more than 5 tasks, they will just patiently
    // wait for a task to finish and release a thread, then run.
    Future<Void> task = es.submit(new Callable<Void>(){
      @Override
      public Void call(){
        for(int col=0; col<matrix[workRow].length; col++){
          //do something for each column of workRow
        }
        return null;
      }
    });
    //Store the work task in the list.
    results.add(task);
  }
}finally{
  //Make sure thread-pool is shutdown and all worker
  //threads are released. 
  es.shutdown();
}

for(Future<Void> task : results){
  try{
    //This will wait for threads to finish. 
    // i.e. same as Thread.join()
    task.get();
  }catch(ExecutionException e){
    //One of the tasks threw an exception!
    throw new RuntimeException(e);
  }
}

This approach is a lot cleaner, because the work distribution is done the main thread (the outer for-loop), and therefore there is no need to synchronize it.

You also get few bonuses when working with thread pools:

  • It nicely takes care of any exceptions during the computations in each of the threads. When working with bare threads, like in your approach, it is easy to "lose" an exception.

  • Threads are pooled. That is, they get automatically reused so you don't need to worry about the cost of spawning new threads. This is particularly useful in your case, since you will need to spawn a thread per row in your matrix, which may be fairly large, I suspect.

  • Tasks submitted to ExecutorService are wrapped in a useful Future<Result> object, which is most useful when each computation task actually returns some kind of result. In your case, if you needed to sum-up all values in the matrix, then each computation task could return the sum for the row. Then you'd just need to sum up those up.

Got a bit long, but hope it clears some things up.

rodion
  • 14,729
  • 3
  • 53
  • 55
  • Thank you...I appreciate the extra information on Java's concurrency API. That is unfamiliar territory for me, and I truly want to learn more. You have given me a great starting point! – whistler May 05 '12 at 07:47
  • Glad I could be of help. Best of luck! – rodion May 05 '12 at 07:58
4

Your problem is that you synchronize the whole region with synchronized(this). This means that only one thread at a time is allowed to enter the loop doing the calculation. Of course it could mean that multiple threads can calculate different parts but never multiple threads at once. This also means your "parallel" solution it is not faster than one thread.

If you want to do the calculation in parallel have a look at Parallel Matrix Multiplication in Java 6 and Fork Join Matrix Multiplication in Java which should cover the topic

Community
  • 1
  • 1
Ulrich Dangel
  • 4,515
  • 3
  • 22
  • 30
  • That code is a little more complex than what I'm looking to do. I am still learning and that's way over my head. I've already done a simple multithreaded matrix mult program in which m x n threads separately calculate EACH element in the product. Now, the point is to just use 2 threads that do the work (when one thread is working on one row or one element, the other should do a different row). Someone left a comment that my solution is no better than a single thread--and he is correct, I think. How to fix it, though, I'm still not clear on that. – whistler May 05 '12 at 06:21
  • @whistler Have a look at the comment by lifelongcoug part 2. You basically have to split the work, provide the necessary parameters to your worker threads and join it afterwards. – Ulrich Dangel May 05 '12 at 06:26
2

Thread scheduling depends on the particular VM implementation. In some implementations a thread will continue to run until it blocks in some way or is preempted by a higher priority thread. In your case all the threads have the same priority, so the first thread to enter the synchronized block never blocks, it does not get preempted. Some schedulers implement priority aging, such that a starved thread will eventually increase in priority, but you may not be running long enough for that to have an effect.

Add a Thread.yield() call just after the end of the synchronized block. This tells the scheduler to pick a new thread to run (maybe the same one, but probably a different one).

Jim Garrison
  • 85,615
  • 20
  • 155
  • 190
1

Your run function has the first thread to get the lock do all the work on a row while still owning the lock. For the next row, maybe another thread will get the lock, but it will block all other threads until it is done.

What I would do is have an array of booleans that is the same as the number of rows, and use these to claim the task of processing each individual row. It would be something like the following pseudocode:

//before creating the threads, pre-fill BoolList with trues
function run()
{
  while (true)
  {
    lock(BoolList)
    {
      //find first true value and set it to false
      //if no true found, return
    }
    //do the actual math of multiplying the row we claimed above
  }
}

Also keep in mind that the overhead of creating a new thread is sufficient that multi-threading this program would only be worth it for large matrices.

Nathan Wiebe
  • 794
  • 5
  • 12
1

As mru already stated in his comment, you problem is that all row calculation is performed inside "synchronized (this)" block. Because of this all threads will wait for one row to be processed before starting on next one, and same thread always acquiring the lock is probably the result of optimization, since you pretty much make calculations single-thread. You might consider putting only decision on which row to process inside the synchronized block:

int rowToProcess;
synchronized (this) {
    if (i < number of columns in final matrix){
        rowToProcess = i;
        i++;
        }
    else
        return;
    }
Timekiller
  • 2,946
  • 2
  • 16
  • 16
  • Would you be assigning i sequentially (if so, where would you increment it?). I am going to try this in my run method and let you know how it works. Thanks. – whistler May 05 '12 at 06:25
  • Sorry, I messed up there a bit. You should increment i right after you assign it to "rowToProcess" - fixed my answer. Also make sure that "rowToProcess" variable isn't shared among threads. – Timekiller May 05 '12 at 06:32