new = zeros(TR);
for i1 = 1 : TR
for i2 = 1 : TR
for j1 = 1 : TR
for j2 = 1 : TR
new(i1, i2) = new(i1, i2) + p1(i1, j1) * p2(i2, j2) * ...
self.radon(ind - 1, i1, i2, j1, j2) * ...
current(j1, j2);
end
end
end
end
Here size of "new", p1, p2, current is TRxTR, and radon is N_timesteps x TR x TR x TR x TR.
Essentially what I am doing here is for a fixed time step "ind" computing expectation on the previous time step "ind -1" using transition probability p1 * p2 * radon.
Thank you!
Update:
I was able to half-vectorize it as follows:
for i1 = 1 : TR
for i2 = 1 : TR
new(i1, i2) = sum(sum((p1(i1, :)' * p2(i2, :)) .* ...
squeeze(self.radon(ind - 1, i1, i2, :, :)) .* current));
end
end
So now it runs in 16 seconds instead of 2min. Can anybody suggest any more improvements? Thanks!