I'm trying to improve the performance of my code, but when I added threads, the performance dropped.
First version:
    public int[][] calculate(int[][] matriz1, int[][] matriz2, int matrixSize) {
        int[][] matrix = new int[matrixSize][matrixSize];
        for (int i = 0; i < matrixSize; i++) {
            for (int k = 0; k < matrixSize; k++) {
                for (int j = 0; j < matrixSize; j++) {
                    matrix[i][j] = matrix[i][j] + matriz1[i][k] * matriz2[k][j];
                }
            }
        }
        return matrix;
    }
Second version:
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.CountDownLatch;

    public int[][] calculate(int[][] matriz1, int[][] matriz2, int matrixSize) {
        final int[][] matrix = new int[matrixSize][matrixSize];
        // One count per multiply-add, so matrixSize^3 counts in total.
        CountDownLatch latchA = new CountDownLatch((int) (Math.pow(matrixSize, 3)));
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < matrixSize; i++) {
            // The lambda can only capture an effectively final variable.
            final int finalI = i;
            // One new thread per row of the result.
            Thread thread1 = new Thread(() -> {
                for (int k = 0; k < matrixSize; k++) {
                    for (int j = 0; j < matrixSize; j++) {
                        matrix[finalI][j] = matrix[finalI][j] + matriz1[finalI][k] * matriz2[k][j];
                        latchA.countDown();
                    }
                }
            });
            thread1.start();
            threads.add(thread1);
            // After every 100 started threads, wait for all of them to finish.
            if (threads.size() % 100 == 0) {
                waitForThreads(threads);
            }
        }
        try {
            latchA.await();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return matrix;
    }

    private void waitForThreads(List<Thread> threads) {
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        threads.clear();
    }
I also tried creating a separate class that implements the Runnable interface for the multithreading, but the performance dropped even further.
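For illustration, a Runnable-based variant could look roughly like the sketch below. This is not the code from the repository: the names RowTaskSketch and RowTask, the constructor parameters, and the choice of one thread per row are all assumptions made only to show the general shape of such an attempt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RowTaskSketch {

    // Hypothetical Runnable (not from the repository): computes one row of the product.
    static class RowTask implements Runnable {
        private final int[][] a;
        private final int[][] b;
        private final int[][] result;
        private final int row;

        RowTask(int[][] a, int[][] b, int[][] result, int row) {
            this.a = a;
            this.b = b;
            this.result = result;
            this.row = row;
        }

        @Override
        public void run() {
            int n = result.length;
            // Same i-k-j order as the first version, restricted to a single row.
            for (int k = 0; k < n; k++) {
                for (int j = 0; j < n; j++) {
                    result[row][j] += a[row][k] * b[k][j];
                }
            }
        }
    }

    // Assumption: one thread per row, joined at the end.
    static int[][] calculate(int[][] a, int[][] b, int matrixSize) {
        int[][] result = new int[matrixSize][matrixSize];
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < matrixSize; i++) {
            Thread t = new Thread(new RowTask(a, b, result, i));
            t.start();
            threads.add(t);
        }
        for (Thread t : threads) {
            try {
                t.join(); // join() also makes the row written by t visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for rows", e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};
        System.out.println(Arrays.deepToString(calculate(a, b, 2))); // [[19, 22], [43, 50]]
    }
}
```

Each thread writes only to its own row of the result, so the rows need no locking. The sketch still starts one thread per row, just like the second version.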
The timings for the two versions are:
First version: 0.0094
Second version: 1.5917
I'm studying how to leverage the CPU cache, and so far the first version has had the best performance of all.
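To make the cache point concrete, here is a minimal sketch contrasting the i-k-j loop order used in the first version with the textbook i-j-k order. The class and method names are hypothetical and not from the repository. The sketch only shows the two access patterns and checks that they give the same product; it makes no timing claim.

```java
import java.util.Arrays;

class LoopOrderSketch {

    // i-k-j order, as in the first version: the inner loop moves j, so it
    // walks row k of b and row i of the result from left to right.
    static int[][] multiplyIkj(int[][] a, int[][] b, int n) {
        int[][] result = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < n; k++) {
                for (int j = 0; j < n; j++) {
                    result[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return result;
    }

    // i-j-k order, for contrast: the inner loop moves k, so every iteration
    // reads b[k][j] from a different row array of b.
    static int[][] multiplyIjk(int[][] a, int[][] b, int n) {
        int[][] result = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                int sum = 0;
                for (int k = 0; k < n; k++) {
                    sum += a[i][k] * b[k][j];
                }
                result[i][j] = sum;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};
        // Both orders compute the same product; only the memory access pattern differs.
        System.out.println(Arrays.deepEquals(multiplyIkj(a, b, 2), multiplyIjk(a, b, 2))); // true
    }
}
```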
The repository is https://github.com/Borges360/matrix-multiplication.
In C, adding threads to the loops improves performance a lot.
The explanation in C: https://www.youtube.com/watch?v=o7h_sYMk_oc