1

I'm working on a program that requires calculating the inverse of an 8x8 matrix as fast as possible. Here's the code I wrote:

class matrix
{
public:
    int w, h;
    std::vector<std::vector<float>> cell;

    matrix(int width, int height)
    {
        w = width;
        h = height;
        cell.resize(width);
        for (int i = 0; i < cell.size(); i++)
        {
            cell[i].resize(height);
        }
    }
};

matrix transponseMatrix(matrix M)
{
    matrix A(M.h, M.w);

    for (int i = 0; i < M.h; i++)
    {
        for (int j = 0; j < M.w; j++)
        {
            A.cell[i][j] = M.cell[j][i];
        }
    }

    return A;
}

float getMatrixDeterminant(matrix M)
{
    if (M.w != M.h)
    {
        std::cout << "ERROR! Matrix isn't of nXn type.\n";
        return NULL;
    }
    float determinante = 0;
    if (M.w == 1)
    {
        determinante = M.cell[0][0];
    }
    if (M.w == 2)
    {
        determinante = M.cell[0][0] * M.cell[1][1] - M.cell[1][0] * M.cell[0][1];
    }
    else
    {
        for (int i = 0; i < M.w; i++)
        {
            matrix A(M.w - 1, M.h - 1);
            int cy = 0;
            for (int y = 1; y < M.h; y++)
            {
                int cx = 0;
                for (int x = 0; x < M.w; x++)
                {
                    if (x != i)
                    {
                        A.cell[cx][cy] = M.cell[x][y];
                        cx++;
                    }
                }
                cy++;
            }

            determinante += M.cell[i][0] * pow(-1, i + 0) * getMatrixDeterminant(A);
        }
    }

    return determinante;
}

float getComplementOf(matrix M, int X, int Y)
{
    float det;

    if (M.w != M.h)
    {
        std::cout << "ERROR! Matrix isn't of nXn type.\n";
        return NULL;
    }

    if (M.w == 2)
    {
        det = M.cell[1 - X][1 - Y];
    }
    else
    {
        matrix A(M.w - 1, M.h - 1);
        int cy = 0;
        for (int y = 0; y < M.h; y++)
        {
            if (y != Y)
            {
                int cx = 0;
                for (int x = 0; x < M.w; x++)
                {
                    if (x != X)
                    {
                        A.cell[cx][cy] = M.cell[x][y];
                        cx++;
                    }
                }
                cy++;
            }
        }
        det = getMatrixDeterminant(A);
    }

    return (pow(-1, X + Y) * det);
}

matrix invertMatrix(matrix M)
{
    matrix A(M.w, M.h);
    float det = getMatrixDeterminant(M);
    if (det == 0)
    {
        std::cout << "ERROR! Matrix inversion impossible (determinant is equal to 0).\n";
        return A;
    }

    for (int i = 0; i < M.h; i++)
    {
        for (int j = 0; j < M.w; j++)
        {
            A.cell[j][i] = getComplementOf(M, j, i) / det;
        }
    }

    A = transponseMatrix(A);

    return A;
}

While it does work, it does so way too slowly for my purposes, managing to calculate an 8x8 matrix's inverse about 6 times per second.

I've tried searching for more efficient ways to invert a matrix but was unsuccessfull in finding solutions for matrices of these dimensions. However I did find conversations in which people claimed that for matrices below 50x50 or even 1000x1000 time shouldn't be a problem, so I was wondering if I have missed something, either a faster method or some unnecessary calculations in my code.

Does anyone have experience regarding this and/or advice?

Sorry for broken english.

me myself
  • 23
  • 4
  • 3
    Do you know what references are, and how to use them? Are you familiar with how in C++ function parameters get passed by value by default, and this effectively makes a useless copy of all these matrix parameters, and how to properly use references to avoid this? – Sam Varshavchik Aug 30 '22 at 23:35
  • Is the goal to optimize this code, or to get maximum performance? Remember there are [existing libraries](https://github.com/glampert/vectormath) that use [SIMD](https://en.wikipedia.org/wiki/Single_instruction,_multiple_data). – tadman Aug 30 '22 at 23:44
  • 3
    If you are implementing such functions on your own, have a look at the different methods that can be used and how simple they can be implemented: [link](https://en.wikipedia.org/wiki/Invertible_matrix#Methods_of_matrix_inversion) – ZwergofPhoenix Aug 30 '22 at 23:47
  • If `Matrix::cell` is a pointer to an array of pointers to an array or a `std::vector` of `std::vector`s, the lack of spatial locality may already be killing you. – user4581301 Aug 30 '22 at 23:47
  • 2
    The way you're calculating a determinant is very inefficient. – Evg Aug 30 '22 at 23:47
  • 2
    Pretty sure game/graphic development does matrix inverse. I would search on how they usually handle it, as you’re not like to find a much faster way than what they do. – Taekahn Aug 30 '22 at 23:48
  • 2
    `pow(-1, i + 0)` has the potential to be very expensive. `pow` on integers is generally a bad idea, but If you're just toggling the sign, there are quicker ways to do this, including a dumb tracking variable you multiply by -1 on each iteration. Mind you, the compiler could know these tricks as well and optimize the call to `pow` away. – user4581301 Aug 30 '22 at 23:54
  • 1
    Picking a sensible algorithm is critical if you care about performance. Techniques for matrix inversion that calculate determinant often tend to be numerically slower than alternatives. And your calculation of the determinant is not particularly efficient. Try using a general approach like Gauss-Jordan elimination. If you know something specific about your matrix (e.g. it has some structure that can be exploited) there may be other algorithms that are more efficient (albeit, often, with some trade-off of precision). – Peter Aug 31 '22 at 00:02
  • What is `matrix` in your code? How have you defined it? – Bob__ Aug 31 '22 at 00:11
  • Do you need to write your own? Can you use `Eigen`? – Fantastic Mr Fox Aug 31 '22 at 00:22
  • Yes, just about any other method would be better. – n. m. could be an AI Aug 31 '22 at 03:45
  • By the way, do you really need the inverse of the matrix, or are you only going to use this inverse to multiply it with a few vectors? – Marc Glisse Aug 31 '22 at 09:10
  • @Bob__ I updated the question with that part of the code – me myself Aug 31 '22 at 10:27
  • @tadman I'd rather optimize the code, that way I know what's going on and can learn new stuff – me myself Aug 31 '22 at 10:30
  • @user4581301 And which would be a way to avoid that? – me myself Aug 31 '22 at 10:31
  • @Taekahn Ok, I'll try to research that, thank you – me myself Aug 31 '22 at 10:32
  • @user4581301 Thank you for making me notice, I did know that pow tends to be inefficient but I forgot I had used it – me myself Aug 31 '22 at 10:34
  • @FantasticMrFox I'd rather write my own so I know what's going on in the code and that way I can also learn somwthing new, although if I won't be able to fix it myself I'll try looking into `Eigen` so thank you – me myself Aug 31 '22 at 10:37
  • @Peter I will try looking into that approach, I had never heard of it, thanks – me myself Aug 31 '22 at 10:39
  • @MarcGlisse I do need the inverse matrices that I calculate to multiply them with a few vectors, do I not need to calculate the entire matrix to do so? – me myself Aug 31 '22 at 10:42
  • https://gregorygundersen.com/blog/2020/12/09/matrix-inversion/ https://stackoverflow.com/questions/31256252/why-does-numpy-linalg-solve-offer-more-precise-matrix-inversions-than-numpy-li etc – Marc Glisse Aug 31 '22 at 10:52
  • [Here is a simple, low-impact way around storing a matrix as pointers to pointers](https://stackoverflow.com/a/36123944/4581301). – user4581301 Aug 31 '22 at 15:17
  • Until you're using SIMD instructions, this will never perform as well as it could. You're missing out on huge gains. – tadman Aug 31 '22 at 20:41

1 Answers1

3

Your implementation have problems as others commented on the question. The largest bottleneck is the algorithm itself, calculating tons of determinants.(It's O(n!)!)

If you want a simple implementation, just implement Gaussian elimination. See finding the inverse of a matrix and the pseudo code at Wikipedia. It'll perform fast enough for small sizes such as 8x8.

If you want a complex but more efficient implementation, use a library that is optimized for LU decomposition(Gaussian elimination), QR decomposition, etc.(Such as LAPACK or OpenCV.)

relent95
  • 3,703
  • 1
  • 14
  • 17