How do I vectorize this?

Question

new = zeros(TR);
for i1 = 1 : TR
    for i2 = 1 : TR
        for j1 = 1 : TR
            for j2 = 1 : TR
                  new(i1, i2) = new(i1, i2) + p1(i1, j1) * p2(i2, j2) * ...
                                self.radon(ind - 1, i1, i2, j1, j2) * ...
                                current(j1, j2);
            end
        end
   end
end

Here size of "new", p1, p2, current is TRxTR, and radon is N_timesteps x TR x TR x TR x TR.

Essentially what I am doing here is for a fixed time step "ind" computing expectation on the previous time step "ind -1" using transition probability p1 * p2 * radon.

Thank you!

Update:

I was able to half-vectorize it as follows:

for i1 = 1 : TR
    for i2 = 1 : TR
        new(i1, i2) = sum(sum((p1(i1, :)' * p2(i2, :)) .* ...           
                      squeeze(self.radon(ind - 1, i1, i2, :, :)) .* current));
    end
end

So now it runs in 16 seconds instead of 2min. Can anybody suggest any more improvements? Thanks!

`self.radon` <- are you using matlab oop? Is this part of a class method? — Daniel, Nov 27 '15 at 23:48
yes, self.radon is just a class attribute. I will remove self to reduce confusion. — user3878681, Nov 27 '15 at 23:49
Leave it in the code. I thing that is the point where you are losing most of the performance! — Daniel, Nov 27 '15 at 23:50
how so? do you mean matlab JIT compiler cannot optimize this? — user3878681, Nov 27 '15 at 23:52

score 0 · Answer 1 · edited May 23 '17 at 10:28

0

I will interpret "how to vectorize this" as "how to make it faster", answering at least this. Unless you use matlab 2015b or newer, accessing class attributes is slow. Performance measurements are available here.

You may use a temporary variable to increase the performance:

radonslice=self.radon(ind - 1, :, :, :, :)

Then use radonslice(1, i1, i2, j1, j2) in your code.

edited May 23 '17 at 10:28

Community

1
1

answered Nov 27 '15 at 23:59

Daniel

36,610
3
36
69

Hm, your link talks about class methods, not attributes. I tried with radodslice, and it is actually slower: the whole simulation runs in 6min instead of 2min. Both times 90+% of the time is spent in these nested loops. – user3878681 Nov 28 '15 at 00:18
Sorry. Actually, there is a slight speedup with this change. I measured it wrong :) – user3878681 Nov 28 '15 at 00:25
The question is about methods, but the benchmarks contain properies as well, search for `classdef property:` – Daniel Nov 28 '15 at 00:36
any way to speedup "squeeze"? Haha, now matlab profiler says that most of the time is spent there... – user3878681 Nov 28 '15 at 00:38
You could try to `permute` the matrix once outside the loop. `r2=permute(radon,[4,5,1,2,3])` to swap dimensions, then `r2(:,:,ind - 1, i1, i2)` – Daniel Nov 28 '15 at 00:42

How do I vectorize this?

1 Answers1