This is my code for multiplying matrices in parallel:
    public void multiplyParallel() {
        int numProcessors = Runtime.getRuntime().availableProcessors();
        int step = (int) MATRIX_SIZE / numProcessors;
        for (int i = 0; i < numProcessors; i++) {
            Runnable r = new MatrixMultiply(this.start, this.end);
            new Thread(r).start();
            this.start += step;
            this.end += step;
        }
        this.start = 0;
        this.end = 0;
    }

    @Override
    public void run() {
        for (int i = this.start; i < this.end; i++)
            for (int j = this.start; j < this.end; j++)
                for (int k = this.start; k < this.end; k++)
                    this.matrix3[i][j] = this.matrix1[i][k] * this.matrix2[k][j];
    }
But when I run this code on 1024x1024 matrices, it finishes in only 2-3 ms, whereas the serial version takes around 1 second. At best, I would expect the parallel version to take about 1/numProcessors of the serial time, not a thousandth of it.
Is there anything I'm doing wrong? The run() method is being called the same number of times as there are processors on my machine.
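
For comparison, here is a minimal sketch of a row-partitioned parallel multiply. It is not the original class: `ParallelMatrixMultiply`, `RowBlock`, and the `double[][]` element type are assumptions made for illustration. It differs from the snippet above in four ways that may explain the timing. Each thread gets a non-empty row range `[start, end)` (if `start` and `end` both begin at 0, as the reset at the end of `multiplyParallel()` suggests, every range in the snippet is empty). Only the row loop is partitioned, while `j` and `k` run over the full width. The products are accumulated into a sum instead of overwriting the cell. Finally, the caller waits on `Thread.join()` so that the measured time includes the work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not taken from the original post.
final class ParallelMatrixMultiply {

    // Worker that computes rows [rowStart, rowEnd) of c = a * b.
    private static final class RowBlock implements Runnable {
        private final double[][] a, b, c;
        private final int rowStart, rowEnd;

        RowBlock(double[][] a, double[][] b, double[][] c, int rowStart, int rowEnd) {
            this.a = a;
            this.b = b;
            this.c = c;
            this.rowStart = rowStart;
            this.rowEnd = rowEnd;
        }

        @Override
        public void run() {
            int inner = b.length;    // shared dimension (columns of a, rows of b)
            int cols = b[0].length;  // columns of the result
            for (int i = rowStart; i < rowEnd; i++) {
                for (int j = 0; j < cols; j++) {      // full column range
                    double sum = 0.0;
                    for (int k = 0; k < inner; k++) { // full inner range
                        sum += a[i][k] * b[k][j];     // accumulate, do not overwrite
                    }
                    c[i][j] = sum;
                }
            }
        }
    }

    // Splits the rows of the result across up to numThreads threads and
    // returns only after every thread has finished.
    static double[][] multiply(double[][] a, double[][] b, int numThreads) {
        int rows = a.length;
        int cols = b[0].length;
        double[][] c = new double[rows][cols];

        int threadsWanted = Math.max(1, numThreads);
        // Ceiling division so the last rows are not dropped when
        // rows is not a multiple of the thread count.
        int step = (rows + threadsWanted - 1) / threadsWanted;

        List<Thread> threads = new ArrayList<>();
        for (int start = 0; start < rows; start += step) {
            int end = Math.min(start + step, rows);
            Thread t = new Thread(new RowBlock(a, b, c, start, end));
            threads.add(t);
            t.start();
        }

        // Without this wait, the caller returns while the workers are still running.
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for workers", e);
            }
        }
        return c;
    }
}
```

A call such as `ParallelMatrixMultiply.multiply(m1, m2, Runtime.getRuntime().availableProcessors())` could then be timed against the serial version. Because the threads write to disjoint rows and are joined before the result is read, no further synchronization should be needed in this sketch.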