2

Can someone help me with this, please? I'm trying to do a matrix multiplication using parallel programming - Java. This is what I've tried so far

public class MatrixParallel22 extends Thread {
    final static int noThreads = 2;

    public static void main(String args[]) throws Exception {
        //get the start time
        long startTime = System.currentTimeMillis();

        MatrixParallel[] threads = new MatrixParallel[noThreads];
        for (int me = 0; me < noThreads; me++) {
            threads[me] = new MatrixParallel(me);
            threads[me].start();
        }

        for (int me = 0; me < noThreads; me++) {
            threads[me].join();
        }

        long endTime = System.currentTimeMillis();

        System.out.println("Calculation completed in " +
                (endTime - startTime) + " milliseconds");
    }

    int me;

    public MatrixParallel(int me) {
        this.me = me;
    }

    public void run() {
        //generate two matrices using random numbers
        int matrix1[][] = matrixGenerator();
        int matrix2[][] = matrixGenerator();

        //get the number of rows from the first matrix
        int m1rows = matrix1.length;
        //get the number of columns from the first matrix
        int m1cols = matrix1[0].length;
        //get the number of columns from the second matrix
        int m2cols = matrix2[0].length;

        //multiply the matrices and put the result in an array
        int[][] result = new int[m1rows][m2cols];
        for (int i = 0; i < m1rows; i++) {
            for (int j = 0; j < m2cols; j++) {
                for (int k = 0; k < m1cols; k++) {
                    result[i][j] += matrix1[i][k] * matrix2[k][j];
                }
            }
        }
    }

    public static int[][] matrixGenerator() {
        //create an array
        int matrix[][] = new int[550][550];
        //create a random generator
        //and fill it with random numbers
        Random r = new Random();
        for (int i = 0; i < matrix.length; i++) {
            for (int j = 0; j < matrix[i].length; j++) {
                matrix[i][j] = r.nextInt(10000);
            }
        }
        return matrix;
    }
}

In this case, I'm trying to set up the number of threads in a variable and then to measure and see how fast the program performs if I increase/decrease the number of threads.

//update -- When I execute the code.. it works fine. But the thing is that if I increase the number of threads, the execution time is slower. For instance, with 2 threads, i get 316 milliseconds and with 8 threads I get 755 milliseconds I don't know which part is wrong. Is it the way I execute the threads?

Community
  • 1
  • 1
Selyst
  • 111
  • 1
  • 12
  • 1
    Just a friendly tip, you may want to read over this page: [The How-To-Ask Guide](https://stackoverflow.com/help/how-to-ask) so you can always be sure that your questions are easily answerable and as clear as possible. Be sure to include any efforts you've made to fix the problem you're having, and what happened when you attempted those fixes. Also don't forget to your show code and any error messages! – Matt C Apr 27 '16 at 19:15
  • We aren't sure what you need help with. Do you want a review of your code? Or, are you having some errors? Include these details in your question please. – Matt C Apr 27 '16 at 19:15
  • Sorry. My bad. I've added more info. I'm just confused about the timings. I thought that increasing the number of threads would speed up the execution. But it actually makes it slower – Selyst Apr 27 '16 at 19:20
  • 1
    Adding more threads than your processor has cores makes it slower because there is significant overhead to create and switch between the threads. If you have as many threads as cores (two threads if you have a dual-core processor), then once the threads are created, they can run to completion on separate cores. But once you add more, they have to share time on the available cores. – jonhopkins Apr 27 '16 at 19:24
  • Oh. Well.. I have 4 cores and 8 logical processors on my laptop. – Selyst Apr 27 '16 at 19:27
  • In that case I don't know why it's slowing down so much. However, I wouldn't expect an increase in speed, since each thread is doing the same amount of work no matter what. By the title I was assuming each thread was handling a distinct section of the multiplication, but it looks like every thread gets two brand new matrices to multiply together. – jonhopkins Apr 27 '16 at 19:40
  • That's actually what I'm trying to achieve, @ jonhopkins. But I guess I'm using the wrong approach – Selyst Apr 27 '16 at 19:41
  • Which are you trying to achieve? Each thread getting its own matrices, or a single pair of matrices that the threads work together to multiply? – jonhopkins Apr 27 '16 at 19:50
  • The second one. A single pair of matrices that the threads work together to multiply – Selyst Apr 27 '16 at 19:51
  • A google search for "parallel matrix multiplication" will get you many in-depth articles on how to do it, but the general idea is to tell each thread to work on a specific quadrant, for example with 100x100 matrices, thread1 gets the top left corner from 1,1 to 50,50; thread2 gets the top right corner from 51,1 to 100,50; thread3 gets 1,51 to 50,100; and thread4 gets 51,51 to 100,100. Then they're all working on a part that none of the others will touch, so they can safely work in parallel without worrying about overwriting each other's work. The trick is just knowing how to divide it up. – jonhopkins Apr 27 '16 at 19:58
  • Yes. I know what you mean. I'm just trying to find a way to implement this with code. I just don't know how to split this into different parts - as you said – Selyst Apr 27 '16 at 20:04
  • You're partway there. You're already telling each thread which number it is by passing in the `me` variable. Assuming you know exactly how many threads you will be using, you can use that variable to calculate which quadrant that particular thread should calculate. As for having the matrices accessible by the main thread and each of the worker threads, that is a different topic and is probably better suited for a separate question if you can't find anything that helps. – jonhopkins Apr 27 '16 at 20:09
  • Alright. Well. I'll try this now. I will let you know if I managed to get any results. Thank you! – Selyst Apr 27 '16 at 20:38

1 Answers1

2

It's entirely expected that your program doesn't run any faster when you use more threads, because you create new matrices for multiplication in each separate worker. You are not splitting your task in multiple pieces that can be processed by different worker threads in parallel.

  • This by itself shouldn't result in the further performance degradation that you are seeing, so most probably you are experiencing some of the potential problems related to using too many threads like significant contention, cache thrashing and others.

As for speeding up a simple matrix multiplication implementation in Java, there are many similar questions with very nice answers (like Peter Lawrey's answer here).

If you actually are using matrices that big (n > 500), you may also try Strassen's algorithm. Of course, any of the specialized tools for matrix multiplication will blow a simple Java implementation out of the water, but I'm assuming that you're doing this partly for fun and partly as an exercise.

Edit:

It looks like you are unsure how to split and (try to) parallelize the task. The most obvious divide and conquer strategy requires splitting each matrix into 4 quadrants and translating 1 multiplication of n-dimension matrices into 8 multiplications of n/2-dimension matrices, roughly like that:

| A11 | A12 |   | B11 | B12 |   | A11*B11 | A11*B12 |   | A12*B21 | A12*B22 |
|-----+-----| x |-----+-----| = |---------+---------| + |---------+---------|
| A21 | A22 |   | B21 | B21 |   | A21*B11 | A21*B21 |   | A22*B21 | A22*B22 |

Strassen's algorithm can reduce this to 7 multiplications, using very precisely chosen manipulation of the Aij and Bij matrices.

You can also compare the speedup you get from using multiple threads with the speedup from a carefully chosen array iteration order (see Wayne and Sedgewick's illustration here).

Community
  • 1
  • 1
Dimitar Dimitrov
  • 16,032
  • 5
  • 53
  • 55
  • Well... I do it more for a coursework. At first, I've created the program without trying to get a parallel speedup. Now I need to do it with threads as well. The size of the matrix doesn't matter. Mine is >500 because I can see better the timings. Is there any way to split the same task into multiple parts? I'm trying to look also on what you gave me. But I'm not sure if I can adapt it – Selyst Apr 27 '16 at 19:40
  • I will try this now. And I'll get back as soon as I get some results. Meanwhile.. Thank you! – Selyst Apr 27 '16 at 20:37