I'm trying to find the most efficient method for multiplying a Matrix by its transpose. Any ideas on the most efficient mathematical formula?
-
With modern compilers, your versions are equivalent. But you should use the ikj ordering to improve cache use. It may make a *big* difference. See for instance this question https://stackoverflow.com/questions/9936132/why-does-the-order-of-the-loops-affect-performance-when-iterating-over-a-2d-arra – Alain Merigot May 30 '19 at 07:48
-
1If you multiply a matrix by its inverse you get the identity matrix, so there shouldn't be any reason to multiply at all. – Brendan May 30 '19 at 07:52
-
This question is not for this forum, please post on : https://codereview.stackexchange.com/ – Vagish May 30 '19 at 09:20
-
1@Brendan The OP may have written "inverse", but I guess they meant *transposed* – Bob__ May 30 '19 at 09:32
-
Honestly, if this is production quality at all, use a library that will do this better. – Michael Dorgan May 31 '19 at 00:02
-
I would write n * n only once: unsigned int nn = n * n; ..., And I would definitely use const keyword if n is const. – alirakiyan May 02 '21 at 10:43
2 Answers
You can improve your first solution, by initializing A
and B
matrix in the same for
loop.
n * k
is also calculated two times, you can store it into a variable to save some time.
It is better to use B[i + n * j] += ...
instead of B[i + n * j] = B[i + n * j] + ...
, because in the first one, B[i + n * j]
is read once, when at the second on it is read twice.
void print_unmodified()
{
unsigned int n = 64u;
unsigned int A[N_MAX];
unsigned int B[N_MAX];
/* Initializing the A and B matrix with values 1->64^2 */
for (int i = 0; i < (n * n); i++)
{
A[i] = i + 1;
B[i] = 0u;
}
/* Matrix Multiplication B = A*A' */
for (int i = 0; i < n; i++)
{
for (int j = 0; j < n; j++)
{
int index = i + n*j;
for (int k = 0; k < n; k++)
{
int p = n * k;
B[index] += A[i + p] * A[j + p];
}
}
}
}

- 351
- 3
- 11
-
-
You are correct, both are O(n^3). I misread the incremental part. – Norbert Incze May 30 '19 at 09:20
-
Since B is full of 0's he could initialise it with calloc or memset if he's under linux – Gerardo Zinno May 30 '19 at 17:06
The posted code gives the correct answer, a matrix B which is the product of another matrix A by its transposed, but both matrices are stored in a couple of arrays in column-major order (or at least, that's what what the OP states in the comments).
Considering this snippet:
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
for (int k = 0; k < n; k++) {
B[i + n * j] = B[i + n * j] + A[i + n * k] * A[j + n * k];
// ^^^^^^^^^ ^^^^^^^^^
}
}
}
We can notice that the inner loop traverses those arrays skipping n
elements every time and that's not cache-friendly.
If we arrange the order of the loops so that the inner one iterates over contiguous elements, we could have a performance gain:
// Calculate the matrix B = A times A transposed. A is an n x n matrix.
// A and B are stored in plain arrays ('src' and 'dest') in column-major order
void mult_AAT(size_t n, value_t const *src, value_t *dest)
{
for (size_t i = 0, end = n * n; i < end; i++)
{
dest[i] = 0;
}
for (size_t k = 0; k < n; k++) // <--
{
for (size_t j = 0; j < n; j++) // <-
{
for (size_t i = 0; i < n; i++)
{
dest[j * n + i] += src[k * n + i] * src[k * n + j];
// ^^^^^^^^^ ^^^^^^^^^ ^^^^^^^^^
}
}
}
}
Live HERE.

- 12,361
- 3
- 28
- 42
-
-
@SamTalbot Thanks for having edited your question, but there are some other info that I think are missing to obtain a proper answer. You may have noted, in the link I posted, that I stored the matrices *row-wise* in the arrays (e.g. look at the output of the first 2x2 matrix). Is that what you need or do you have to store them *column-wise*? In the same first example, my resulting B matrix is different from your algorithm's, which one is the "correct" one? The logic behind that function is that the j column of the transposed of A is just the j row of A and iterating that way is cache-friendly. – Bob__ May 31 '19 at 07:54
-
@SamTalbot Well, that wasn't a minor detail ;). Answer changed accordingly. – Bob__ Jun 01 '19 at 12:52