4

I'm quite new to Matlab and I need help in speeding up some part of my code. I am writing a Matlab application that performs 3D matrix convolution but unlike in standard convolution, the kernel is not constant, it needs to be calculated for each pixel of an image.

So far, I have ended up with a working code, but incredibly slow:

function result = calculateFilteredImages(images, T)

% images - matrix [480,360,10] of 10 grayscale images of height=480 and width=360
% reprezented as a value in a range [0..1] 
% i.e. images(10,20,5) = 0.1231;

% T - some matrix [480,360,10, 3,3] of double values, calculated earlier 

    kerN = 5;               %kernel size
    mid=floor(kerN/2);      %half the kernel size
    offset=mid+1;           %kernel offset
    [h,w,n] = size(images);
    %add padding so as not to get IndexOutOfBoundsEx during summation: 
    %[i.e. changes [1 2 3...10] to [0 0 1 2 ... 10 0 0]]
    images = padarray(images,[mid, mid, mid]);

    result(h,w,n)=0;           %preallocate, faster than zeros(h,w,n)
    kernel(kerN,kerN,kerN)=0;  %preallocate

    % the three parameters below are not important in this problem 
    % (are used to calculate sigma in x,y,z direction inside the loop) 
    sigMin=0.5;
    sigMax=3;
    d = 3;

    for a=1:n;
        tic;
        for b=1:w;
            for c=1:h;
                M(:,:)=T(c,b,a,:,:); % M is now a 3x3 matrix
                [R D] = eig(M); %get eigenvectors and eigenvalues - R and D are now 3x3 matrices     

                % eigenvalues
                l1 = D(1,1);
                l2 = D(2,2);
                l3 = D(3,3);

                sig1=sig( l1 , sigMin, sigMax, d);
                sig2=sig( l2 , sigMin, sigMax, d);
                sig3=sig( l3 , sigMin, sigMax, d);

                % calculate kernel
                for i=-mid:mid
                    for j=-mid:mid
                        for k=-mid:mid
                            x_new = [i,j,k] * R; %calculate new [i,j,k]
                            kernel(offset+i, offset+j, offset+k) = exp(- (((x_new(1))^2 )/(sig1^2) + ((x_new(2))^2)/(sig2^2) + ((x_new(3))^2)/(sig3^2)) /2);
                        end
                    end
                end
                % normalize
                kernel=kernel/sum(kernel(:));

                %perform summation
                xm_sum=0;
                for i=-mid:mid
                    for j=-mid:mid
                        for k=-mid:mid
                            xm_sum = xm_sum + kernel(offset+i, offset+j, offset+k) * images(c+mid+i, b+mid+j, a+mid+k);
                        end
                    end
                end
                result(c,b,a)=xm_sum;

            end
        end
        toc;
    end
end

I tried replacing the "calculating kernel" part with

sigma=[sig1 sig2 sig3]
[x,y,z] = ndgrid(-mid:mid,-mid:mid,-mid:mid);
k2 = arrayfun(@(x, y, z) exp(-(norm([x,y,z]*R./sigma)^2)/2), x,y,z);

but it turned out to be even slower than the loop. I went through several articles and tutorials on vectorization but I'm quite stuck with this one. Can it be vectorized or somehow speeded up using something else? I'm new to Matlab, maybe there are some build-in functions that could help in this case?

Update The profiling result: enter image description here

Sample data which was used during profiling:
T.mat
grayImages.mat

Community
  • 1
  • 1
  • 1
    If vectorized function such as `arrayfun` are not working, you might want to consider writing a MEX-file. – Autonomous Feb 16 '14 at 01:10
  • 3
    Besides some rare exceptions, `arrayfun` is no vectorisation. It's a iterative procedure. Arrayfun is often slower than the corresponding loop: http://stackoverflow.com/questions/12522888/arrayfun-can-be-significantly-slower-than-an-explicit-loop-in-matlab-why – Daniel Feb 16 '14 at 01:20
  • @user3299285: Please give some example input data. What is `T` and what is `images`? Images is a stack of grayscale images? `ker` is undefined. – Daniel Feb 16 '14 at 01:24
  • @Daniel You're right, I've read that post. I added some new comments to the code explaining what are images and T, T generally represents a 3x3 matrix for each pixel in images. Sorry about the `ker`, it should have been `kernel`, I fixed it as well... – user3299285 Feb 16 '14 at 10:13
  • 1
    This is quite a lot of code, can you share the profiler results so we at least can confirm which lines are slow? After doing that, provide some sample input for that piece of the code so the situation becomes reproducible. – Dennis Jaheruddin Feb 18 '14 at 09:34
  • @DennisJaheruddin I added the profiler result, as I expected, the most inneficient parts are the operations inside the loops. Thank you for your help. – user3299285 Mar 26 '14 at 21:37

1 Answers1

0

As Dennis noted, this is a lot of code, cutting it down to the minimum that's slow given by the profiler will help. I'm not sure if my code is equivalent to yours, can you try it and profile it? The 'trick' to Matlab vectorization is using .* and .^, which operate element-by-element instead of having to use loops. http://www.mathworks.com/help/matlab/ref/power.html

Take your rewritten part:

sigma=[sig1 sig2 sig3]
[x,y,z] = ndgrid(-mid:mid,-mid:mid,-mid:mid);
k2 = arrayfun(@(x, y, z) exp(-(norm([x,y,z]*R./sigma)^2)/2), x,y,z);

And just pick one sigma for now. Looping over 3 different sigmas isn't a performance problem if you can vectorize the underlying k2 formula.

EDIT: Changed the matrix_to_norm code to be x(:), and no commas. See Generate all possible combinations of the elements of some vectors (Cartesian product)

Then try:

% R & mid my test variables
R = [1 2 3; 4 5 6; 7 8 9];
mid = 5;
[x,y,z] = ndgrid(-mid:mid,-mid:mid,-mid:mid);
% meshgrid is also a possibility, check that you are getting the order you want
% Going to break the equation apart for now for clarity
% Matrix operation, should already be fast.
matrix_to_norm = [x(:) y(:) z(:)]*R/sig1
% Ditto
matrix_normed = norm(matrix_to_norm)
% Note the .^ - I believe you want element-by-element exponentiation, this will 
% vectorize it.
k2 = exp(-0.5*(matrix_normed.^2))
Community
  • 1
  • 1
schodge
  • 891
  • 2
  • 16
  • 29
  • This is something I am trying to achieve :) the problem is in in the part `matrix_to_norm = [x,y,z]*R`, which results in an error "Inputs must be 2-D, or at least one input must be scalar". Even when I change it to `matrix_to_norm = [x,y,z].*R`, I get 'Matrix dimensions must agree'... – user3299285 Mar 26 '14 at 21:33
  • I edited my code, see the new code, and the link to the other problem. `[x(:) y(:) z(:)]` should be the matrix of all possible (x,y,z) pairs, which is (2*mid+1) by 3, which multiplies properly to a 3x3 matrix. Is k2 supposed to be a scalar? norm() returns a scalar according to the documentation, so k2 isn't a vector/matrix. – schodge Mar 26 '14 at 23:31