I am working with a lower triangular matrix, the function below calculates a row index of such matrix. How can I optimize it in terms of execution time?
The triangular matrix can hold at most N (N + 1) / 2
nonzero elements (where N
is the matrix dimension - N x N
).
I have a set of numbers 0, 1, 2, ..., N (N + 1) / 2 - 1
and from those, I have to calculate the matrix row index.
My current solution:
inline
unsigned int calc_row(unsigned int i)
{
return static_cast<unsigned int>(floor(sqrt(static_cast<double>(0.25 + 2 * i)) - 0.5));
}
// example:
calc_row(0) == 0;
calc_row(1) == 1;
calc_row(2) == 1;
calc_row(3) == 2;
calc_row(4) == 2;
calc_row(5) == 2;
Question:
1) Do you think my current solution is performance friendly?
2) If not how can I optimize it (in terms of the function execution time)?
If You believe an alternate method to calculate the row index would perform better, I am fine with it. Unfortunately the lookup table
is not an option in my case.
EDIT #1:
I just had an idea: Is there a way to make a template metaprogramming version of the lookup table
? A way to generate the row number at a compile time could prove to be a significant optimization. The biggest unsigned int i
would be around 10 million in my case.
EDIT #2:
I edited the entire question because it caused a major confusion. I am sorry about that.
EDIT #3:
calc_row()
calculates the formula: (sqrt(1 + 8 * i) - 1) / 2
which is the solution for the quadratic equation x(x + 1) / 2 = i
. Where i
is row index.
The main idea for this solution lies in the fact that the linear index for a triangular matrix with diagonal can be calculated as: i (i + 1) / 2 + j
. Where i
is row index and j
is column index.