3

I am currently in the process of optimizing my code to make image processing more efficient. my first problem was due to the vision.VideoFileReader and step where it took a long time to open each frame. I speed up my code by compressing my grayscale image into 3 frames in 1 RGB frame. This way I could load 1 RGB frame using vid.step() and have 3 frames imported ready for processing.

Now my code is running slow on the Laplacian of Gaussian (LoG) filtering. I read that using the function imfilter can be used to perform a LoG but it appears to be the next rate limiting step.

Upon further reading, it appears that imfilter is not the best option for speed. Apperently MATLAB introduced a LoG function but it was introduced in R2016b and I'm unfortunately using R2016a.

Is there a way to speed up imfilter or is there a better function to use to perform a LoG filtering?

Should I call python to speed up the process?

enter image description here

Code:

Hei = gh.Video.reader.info.VideoSize(2);
Wid = gh.Video.reader.info.VideoSize(1);

Log_filter = fspecial('log', filterdot, thresh); % fspecial creat predefined filter.Return a filter.
    % 25X25 Gaussian filter with SD =25 is created.

tic
ii = 1;

bkgd = zeros(Hei,Wid,3);
bkgd(:,:,1) = gh.Bkgd;
bkgd(:,:,2) = gh.Bkgd;
bkgd(:,:,3) = gh.Bkgd;

bkgdmod = reshape(bkgd,720,[]);

while ~isDone(gh.Video.reader)
    frame = gh.readFrame();
    img_temp = double(frame);

    img_temp2 = reshape(img_temp,720,[]);
    subbk = img_temp2 - bkgdmod;

    img_LOG = imfilter(subbk, Log_filter, 'symmetric', 'conv');

    img_LOG =  imbinarize(img_LOG,.002);
    [~, centroids, ~] = gh.Video.blobAnalyser.step(img_LOG);

    toc
end
hippietrail
  • 15,848
  • 18
  • 99
  • 158
Hojo.Timberwolf
  • 985
  • 1
  • 11
  • 32

1 Answers1

5

The Laplace of Gaussian is not directly separable into two 1D kernels. Therefore, imfilter will do a full convolution, which is quite expensive. But we can manually separate it out into simpler filters.


The Laplace of Gaussian is defined as the sum of two second-order-derivatives of the Gaussian:

LoG = d²/dx² G + d²/dy² G

The Gaussian itself, and its derivatives, are separable. Therefore, the above can be computed using 4 1D convolutions, which is much cheaper than a single 2D convolution unless the kernel is very small (e.g. if the kernel is 7x7, we need 49 multiplications and additions per pixel for the 2D kernel, or 4*7=28 multiplications and additions per pixel for the 4 1D kernels; this difference grows as the kernel gets larger). The computations would be:

sigma = 3;
cutoff = ceil(3*sigma);
G = fspecial('gaussian',[1,2*cutoff+1],sigma);
d2G = G .* ((-cutoff:cutoff).^2 - sigma^2)/ (sigma^4);
dxx = conv2(d2G,G,img,'same');
dyy = conv2(G,d2G,img,'same');
LoG = dxx + dyy;

If you are really strapped for time, and don't care about precision, you could set cutoff to 2*sigma (for a smaller kernel), but this is not ideal.


An alternative, less precise, is to separate the operation differently:

LoG * f = ( d²/dx² G + d²/dy² G ) * f
        = ( d²/dx²  * G + d²/dy²  * G ) * f
        = ( d²/dx^2  + d²/dy²  ) * G * f

(with * there representing convolution, and the Dirac delta, convolution's equivalent to multiplying by 1). The d²/dx² + d²/dy² operator doesn't really exist in the discrete world, but you can take the finite difference approximation, which leads to the famous 3x3 Laplace kernel:

[ 1  1  1             [ 0  1  0
  1 -8  1       or:     1 -4  1
  1  1  1 ]             0  1  0 ]

Now we get a rougher approximation, but it's faster to compute (2 1D convolutions, and a convolution with a trivial 3x3 kernel):

sigma = 3;
cutoff = ceil(3*sigma);
G = fspecial('gaussian',[1,2*cutoff+1],sigma);
tmp = conv2(G,G,img,'same');
h = fspecial('laplacian',0);
LoG = conv2(tmp,h,'same'); % or use imfilter
Cris Luengo
  • 55,762
  • 10
  • 62
  • 120
  • Thanks for explaining Laplace of Gaussian and the different methods for calculating them. After trying the methods, I found that `imfilter` still seems to be the quickest method for my code. That being said, I think I'm not actively calculating LoG with my code which is why it is quicker. I shall try and use `gpuArray` to speed up the processing speed. Though last time I did this, I ended up slowing down my processing speed. :D I shall see. Thanks again. – Hojo.Timberwolf Dec 07 '18 at 08:26
  • 2
    @Hojo.Timberwolf GPUs need quite a big time to send the memory and gather it from the GPU. If you can stack a few of the images, send them to GPU to process them together and gather the result you will get an speed improvement. – Ander Biguri Dec 07 '18 at 11:10
  • 1
    @Hojo: I don’t know what values you use for `filterdot, thresh`, but the comment says you make a 25x25 filter with sigma=25. This would indeed not be a Gaussian! In fact, it is [closer to a uniform filter in characteristics](https://www.crisluengo.net/archives/695). And a uniform filter (`fspecial('average'`) is **a lot** faster to compute. So if you’re happy with your filtering, try replacing it with a uniform filter. – Cris Luengo Dec 07 '18 at 14:03