optimization for ikj algorithm (matrix multiplication)

Question

Creating a custom matrix class I implemented multiplication with ikj algorithm and now I'm trying to optimize it. The problem is that the version that should work better is about 5 times slower and I can't understand why.

This is the Matrix class with the "basic" algorithm:

class Matrix {

    private double[][] m;           // matrix
    private int rows;
    private int cols;

   // other stuff...


    // does some checks and returns requested matrix value
    // I know this will slow down computation, but it's not the relevant part
    public double get(int row, int col) {
        if (row >= rows || col >= cols)
            throw new IndexOutOfBoundsException(); // to catch block
        else
            return m[startRow + row][startCol + col];
    }


    public Matrix multiply(Matrix other) {
        int n = rows;
        int m = cols;
        int p = other.cols;

        double[][] prod = new double[n][p];

        for (int i = 0; i < n; i++)
            for (int k = 0; k < m; k++)
                for (int j = 0; j < p; j++)
                    prod[i][j] += get(i,k) * other.get(k,j);

        return new Matrix(prod);
    }
}

And this is the modified algorithm:

public Matrix multiplyOpt(Matrix other) {
    int n = rows;
    int m = cols;
    int p = other.cols;

    double[][] prod = new double[n][p];

    for (int i = 0; i < n; i++) {
        for (int k = 0; k < m; k++) {
            double aik = get(i,k);
            for (int j = 0; j < p; j++) {
                prod[i][j] += aik * other.get(k,j);
            }
        }
    }

    return new Matrix(prod);
}

My though was that moving that get call outside the loop it will be called n x m times instead of n x m x p.

These are the results of a random matrix multiplication (exception is never thrown):

multiply time = 0.599s
multiplyOpt time = 3.041s

Why this change makes it slower and not faster?

EDIT

Timings are obtained with:

double[][] m1 = new double[1000][750];
double[][] m2 = new double[750][1250];

for (int i = 0; i < m1.length; i++)
    for (int j = 0; j < m1[0].length; j++)
        m1[i][j] = new Double(Math.random());

for (int i = 0; i < m2.length; i++)
    for (int j = 0; j < m2[0].length; j++)
        m2[i][j] = new Double(Math.random());

Matrix a = new Matrix(m1);
Matrix b = new Matrix(m2);

long start = System.currentTimeMillis();
Matrix c = a.multiply(b);
long stop = System.currentTimeMillis();
double time = (stop - start) / 1000.0;
System.out.println("multiply time = "+time);

start = System.currentTimeMillis();
c = a.multiplyOpt(b);
stop = System.currentTimeMillis();
time = (stop - start) / 1000.0;
System.out.println("multiplyOpt time = "+time);

http://stackoverflow.com/questions/20467117/for-matrix-operation-why-is-ikj-faster-than-ijk — Mattia Dinosaur, May 08 '17 at 00:41
I read that question but it's comparing ijk with ikj, and that's not my case. My optimization is the one that compiler is supposed to do in the accepted answer but it's not working, so I'm trying to understand why — Becks, May 08 '17 at 00:54
The short answer is the compiler is better at optimizing than most programmers. You'd have to take a look at the generated assembly to figure out exactly how the compile optimized the two versions to figure out how in this case. — twain249, May 08 '17 at 01:04

optimization for ikj algorithm (matrix multiplication)

0 Answers0