0

My code as follows, and in the main function, I recall Mat_product function about 223440 times, use 179ns, 23% in the whole runtime.

struct Matrix_SE3 {
    float R[3][3];
    double P[3];  //here i need use double type.
};

struct Matrix_SE3 Mat_product(struct Matrix_SE3 A, struct Matrix_SE3 B) {
    struct Matrix_SE3 result = { { { 0, 0, 0 }, { 0, 0, 0 }, { 0, 0, 0 } }, { 0,
            0, 0 } };
    for (int i = 0; i < 3; i++) {
        result.P[i] += A.P[i];
        for (int j = 0; j < 3; j++) {
            result.P[i] += A.R[i][j] * B.P[j];
            for (int k = 0; k < 3; k++)
                result.R[i][j] += A.R[i][k] * B.R[k][j];
        }
    }
    return result;
}

where $R$ is the rotation matrix, and $P$ represent the position, the function is calculated at two special euclidean group $SE(3)$ matrix multiplication and return $SE(3)$ matrix.

Maybe this is a duplicate of Optimized matrix multiplication in C, the difference is my code use struct to describe matrix, does it affect the efficiency of calculation?

lumw
  • 297
  • 1
  • 9
  • 1
    If you google it you'll find a lot of SIMD implementations of matrix multiplication. – tkausl Jan 22 '19 at 09:28
  • What is SE(3)? The special euclidean group? – lisyarus Jan 22 '19 at 09:28
  • There are algorithms of escalating complexity. Currently, Williams gets it down to O(n^2.373) as described here: https://people.csail.mit.edu/virgi/matrixmult-f.pdf I doubt you are expected to implement something like that though, so depending on your requirements, you should look up a reasonably simple matrix multiplication algorithm with a runtime that's good enough for your task, or use a matrix library. – Blaze Jan 22 '19 at 09:32
  • Possible duplicate of [Optimized matrix multiplication in C](https://stackoverflow.com/questions/1907557/optimized-matrix-multiplication-in-c) – Duck Dodgers Jan 22 '19 at 09:33
  • Are you aware of the algorithm by [Strassen](https://en.wikipedia.org/wiki/Strassen_algorithm)? – Codor Jan 22 '19 at 09:53
  • Your matrices are small, but passed by value many times. Have you tried passing those structs by pointers? – Bob__ Jan 22 '19 at 14:46

1 Answers1

1

Not sure what are the P and R par in your code, but you should never use the ijk ordering for matrix multiplication.

Because of the row-major ordering, when accessing B.R[k][j] in your inner loop, many accesses will lead to a cache miss, reducing performances significantly, even with your small matrices.

The proper way to perform matrix multiplication is to iterate in the ikj order.

for (int i = 0; i < 3; i++) {
    double r;
    result.P[i] += A.P[i];
    for (int k = 0; k < 3; k++) {
        r=A.R[i][k];
        for (int j = 0; j < 3; j++) {
            result.P[i] += A.R[i][j] * B.P[j];  
            result.R[i][j] += r * B.R[k][j];
        }
    }
}

All accesses will properly be performed in row major order order and will benefit from the cache behavior.

And do not forget to use -O3 optimization. Most compilers will use sse/avx instructions to optimize the code.

Alain Merigot
  • 10,667
  • 3
  • 18
  • 31
  • Well, this is true in general, but OP's struct has 9 floats and 3 doubles, I doubt they couldn't fit in a modern day cache. – Bob__ Jan 22 '19 at 14:55
  • well,, 'result.P[i] += A.R[i][j] * B.P[j];' cannot in third loop. but, -O3 is useful. thank you! – lumw Jan 23 '19 at 08:56