Optimizing matrix calculations in for loops in Octave

Question

I imported code from Matlab to Octave and the speed of certain functions seems to have dropped. I looked into vectorization and could not come up with a solution with my limited knowledge. What i want to ask, is there a way to speed this up?

    n = 181;
    N = 250;
    for i=1:n    
        for j=1:n
            par=0;
            for k=1:N;
               par=par+log2(1+(10.^(matrix1(j,i,matrix2(j,i))./10)./(matrix3(j,i).*double1+double2)));
            end
            resultingMatrix(j,i)=2.^((1/N).*par)-1;
        end
    end

Where dimensions are:

    matrix1 = 181x181x2,  
    matrix2 = 181x181 -->  containing values either 1 or 2 only,  
    matrix3 = 181x181,  
    double1, double2 = just doubles

You're indexing `matrix1` with values contained in `matrix2`, that might be slighly tricky, otherwise this should be relatively straigt forward, as all functions involved support matrices as input. Make a toy example (2x2 matrices or so) and start from there. Second tip: use better variable names. Having both small `n` and capital `N` is way too obscure, same for `matrix1` and `matrix2`. `i` and `j` are [also built-in functions](https://stackoverflow.com/q/14790740/5211833) so you'll want to avoid those too. — Adriaan, Nov 22 '21 at 15:32
@Adriaan Indeed, i will keep the i and j especially in mind for the future. Fortunately, these variables' names are used for this post only, to keep things simple and non-confusing. — xloee_gpj, Nov 22 '21 at 15:43
Are you sure you didn't simplify the code too much? The code in the loop over `k` doesn't use `k`, so that whole loop is equivalent to `par = N * log2(...)`. But then you divide `par` by `N`. I don't see the point of that. Please post **meaningful** code. Also, post **complete** code, give us some examples for contents of `matrix1` and `matrix2`. See [mre]. — Cris Luengo, Nov 22 '21 at 18:05
Thats the way the code was written by the one who wrote it, i just renamed the variables, so nothing changes in its functionality. Unfortunately i don't think this post can become less specific as its a rather particular problem. Also the log2(...) always gives different results based on the i,j chosen by the matrices, so i dont think we can no N *log2 either. You can consider the values of anything other than matrix2 = random for this post, as it doesnt change its purpose. — xloee_gpj, Nov 22 '21 at 19:31

Cris Luengo · Accepted Answer · 2021-11-22T22:36:01.627

Here's my testing code, I've completed your code by making some random matrices:

n = 181;
N = 250;
matrix1 = rand(n,n,2);
matrix2 = randi(2,n,n);
matrix3 = rand(n,n);
double1 = 1;
double2 = 1;
tic
for i=1:n
   for j=1:n
      par=0;
      for k=1:N
         par=par+log2(1+(10.^(matrix1(j,i,matrix2(j,i))./10)./(matrix3(j,i).*double1+double2)));
      end
      resultingMatrix(j,i)=2.^((1/N).*par)-1;
   end
end
toc

Note that the code inside the loop over k doesn't use k. This makes the loop superfluous. We can easily remove it. The loop does the same computation 250 times, adds up the results, and divides by 250, yielding the value of one of the repeated computations.

Another important thing to do is preallocate resultingMatrix, to avoid it growing with every loop iteration.

This is the resulting code:

tic
resultingMatrix2 = zeros(n,n);
for i=1:n
   for j=1:n
      par=log2(1+(10.^(matrix1(j,i,matrix2(j,i))./10)./(matrix3(j,i).*double1+double2)));
      resultingMatrix2(j,i)=2.^par-1;
   end
end
toc
max(abs((resultingMatrix(:)-resultingMatrix2(:))./resultingMatrix(:)))

The last line computes the maximum relative difference. It is 9.9424e-15 in my version of Octave. It will differ depending on the version, the system, and more. This error is the floating-point rounding error. Note that the original code, adding the same value 250 times, and then dividing it by 250, will produce a larger rounding error than the modified code. For example,

x = pi;
t = 0;
for i = 1:N
   t = t + x;
end;
t = t / N;
t-x

gives -8.4377e-15, a similar rounding error to what we saw above.

The original code took 81.5 s, the modified code takes only 0.4 s. This is not a gain of vectorization, it is just a gain of preallocation and not needlessly repeating the same computation over and over again.

Next, we can remove the other two loops by vectorizing the operations. The difficult bit here is matrix1(j,i,matrix2(j,i)). We can produce each of the n*n linear indices with (1:n*n).' + (matrix2(:)-1)*(n*n). This is not trivial, I suggest you think about how this computation works. You need to know that linear indices count, starting at 1 for the top-left array element, first down, then right, then along the 3rd dimension. So 1:n*n is simply the linear indices for each of the elements of a 2D array, in order. To each of these we add n*n if we need to access the 2nd element along the 3rd dimension.

We now have the code

tic
index = reshape((1:n*n).' + (matrix2(:)-1)*(n*n), n, n);
par = log2(1+(10.^(matrix1(index)./10)./(matrix3.*double1+double2)));
resultingMatrix3 = 2.^par-1;
toc
max(abs((resultingMatrix(:)-resultingMatrix3(:))./resultingMatrix(:)))

This code produces the exact same result as my previous version, and runs in only 0.013 s, 30 times faster than the non-vectorized code, and 6000 times faster than the original code.

You are indeed right. I did not think of such a simple thing for the N and could not connect the dots you had given me. Thank you for the time you have given into it and also for the detailed explaination behind every action. It does indeed work as intented! — xloee_gpj, Nov 23 '21 at 02:29

Optimizing matrix calculations in for loops in Octave

1 Answers1