
I am trying to implement some sort of correlation tracking between a template image and the frames of a video stream in Python with OpenCV.

I am trying to use the weighted mean absolute deviation (weighted MAD) as a similarity measure between the template and the video frames (the object should be at the location of minimum MAD).

The equation I need to compute is:

$$\mathrm{WMAD}(x,y)=\frac{1}{m\,n}\sum_{i=1}^{m}\sum_{j=1}^{n} w(i,j)\,\bigl|F(x+i,\,y+j)-T(i,j)\bigr|$$

where F is the image, T is the m×n template, and w is the weight window (same size as the template).

I am aware that OpenCV provides a function that does template matching (i.e. cv2.matchTemplate). The closest one to MAD is TM_SQDIFF_NORMED, which is mean square deviation (MSD); I believe that OpenCV implements this equation:

$$\mathrm{MSD}(x,y)=\frac{1}{m\,n}\sum_{i=1}^{m}\sum_{j=1}^{n}\bigl(F(x+i,\,y+j)-T(i,j)\bigr)^{2}$$

which would give the similarity measure I want if there were a way to insert a weight function into it, like this:

$$\mathrm{WMSD}(x,y)=\frac{1}{m\,n}\sum_{i=1}^{m}\sum_{j=1}^{n} w(i,j)\,\bigl(F(x+i,\,y+j)-T(i,j)\bigr)^{2}$$

My question is: how can I implement either weighted MAD or weighted MSD in OpenCV without implementing the loops myself (so as not to lose speed), utilizing the cv2.matchTemplate function (or a similar approach)?
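
For reference, a direct looped implementation of the weighted MAD above would look roughly like the sketch below (grayscale NumPy arrays assumed); this per-pixel looping is what I am trying to avoid, since it is far too slow for video:

import numpy as np

def wmad_loop(F, T, w):
    # direct evaluation of the weighted MAD formula (slow reference version)
    m, n = T.shape
    T = np.float64(T)
    H, W_img = F.shape
    out = np.empty((H - m + 1, W_img - n + 1), dtype=np.float64)
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = np.float64(F[y:y + m, x:x + n])
            out[y, x] = np.sum(w * np.abs(patch - T)) / (m * n)
    return out  # the object should be at np.unravel_index(out.argmin(), out.shape)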

Mohammed B.
  • Perhaps by using the weights as a mask image. See https://docs.opencv.org/4.1.1/df/dfb/group__imgproc__object.html#ga586ebfb0a7fb604b35a23d85391329be for the mask option – fmw42 May 20 '21 at 19:26
  • @fmw42 The documentation says that the mask is multiplied by the template, which is not the equation illustrated. https://gregorkovalcik.github.io/opencv_contrib/tutorial_template_matching.html – Mohammed B. May 20 '21 at 20:03
  • It was just a thought. Otherwise, I do not see any way to add weights to the current matchTemplate() without the code being reprogrammed, so you would have to code your own routine. – fmw42 May 20 '21 at 21:14
  • @fmw42 I tried implementing it, but it is too slow for normal video files. I have no problem in implementing it if I can have reasonable speed – Mohammed B. May 20 '21 at 23:01
  • A loop would be too slow, I've implemented TM_CCOEFF using numpy.convolve (or scipy). So would you be able to turn the equations into a convolution? – Ta946 May 21 '21 at 06:26
  • [Check out the answer given here, see if it helps](https://stackoverflow.com/questions/41330517/compute-mean-squared-absolute-deviation-and-custom-similarity-measure-python?rq=1) – DrBwts May 21 '21 at 08:16
  • I checked it before posting my question; the problem is still the same: how can I apply weighting to it? – Mohammed B. May 21 '21 at 11:58

2 Answers


You can do it with a small matrix trick. I will try to explain with an example.
If you have a 3x3 template t with values t_ij and a 3x3 weight kernel w with values w_ij, you can create 9 images from your original image by shifting it once in each direction, and stack them along a new axis.
Now you can flatten the template t and subtract it from the stacked 9 images; the result is equivalent to sliding the template across the image.
After taking the absolute value, you can do the same (flattening and multiplying) with w.
Finally, sum the tensor along the new axis and you end up with the solution.

Example implementation:

import numpy as np

def stack_image(image, n):
    # stack the n*n "valid" shifts of the image along a new last axis, so that
    # position (i, j, :) holds exactly the pixels an n x n window at (i, j) would see
    channels = []
    row, col = image.shape
    for i in range(n):
        for j in range(n):
            channels.append(image[i:row - (n - i) + 1, j:col - (n - j) + 1])
    return np.stack(channels, axis=-1)

def weighted_mad(f, t, w):
    # stack all template-sized shifts of f, then do the subtraction,
    # absolute value and weighting along the new (last) axis
    image_stack = stack_image(image=np.float64(f), n=t.shape[0])
    image_stack = np.abs(image_stack - t.flatten()) * w.flatten()
    image_stack = image_stack.sum(axis=-1)

    norm = t.size  # normalize by the number of template pixels (m*n), as in the formula
    return 1 / norm * image_stack
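
For instance, a minimal usage sketch (the random arrays below are just placeholders; the tracked object should be at the minimum of the returned map):

import numpy as np

f = np.random.rand(240, 320)      # placeholder frame
t = np.random.rand(3, 3)          # placeholder 3x3 template
w = np.ones((3, 3)) / 9.0         # uniform 3x3 weight window

wmad_map = weighted_mad(f, t, w)
y, x = np.unravel_index(np.argmin(wmad_map), wmad_map.shape)
print('best match:', (y, x))      # (y, x) is the top-left corner of the best-matching window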

Notes:

  • my implementation does not process the borders ("valid"); one can implement it in other ways.
  • my implementation assumes a square kernel (kxk), but one can implement it with a rectangular one.
  • the solution will be efficient only if the kernel size is not too large.
TzviLederer
  • `@TzviLederer` Nice concept. Very clever. But perhaps replace your `def stack_image()` with np.roll() to shift your image 9 times. See https://numpy.org/doc/stable/reference/generated/numpy.roll.html – fmw42 May 21 '21 at 18:39
  • May you add more Mathematical explanation, as I am not getting your idea well. – Mohammed B. May 21 '21 at 20:07
  • @fmw42 Good idea, but if you do it you will still need to crop the edges (np.roll() will bring the end of the rows/columns to the beginning). – TzviLederer May 22 '21 at 17:53
  • `@TzviLederer` Right. I forgot about that. But are you not effectively doing the same? The boundary pixels of the image will not have the needed neighbors to which to shift. One could pad the input image by 2 pixels all around, perhaps by an unfolding operation. Then either method would have the needed data, though it might not be relevant. However, that is a minor issue. – fmw42 May 22 '21 at 18:00
  • @MohamedIbrahim The concept is that, instead of summing over the rows and columns from 1 to m and from 1 to n at every kernel position, you stack shifted copies of the original image along a third dimension so that at each pixel (i, j) the values along the third dimension are exactly the values the sliding kernel would "catch". Now, instead of moving a kernel spatially, you do the subtraction, absolute value, and weighting along the third dimension. I hope it is clear. – TzviLederer May 22 '21 at 18:05
  • @fmw42 you are right. If you want to pad the image in a cyclic way - this is the solution. – TzviLederer May 22 '21 at 18:08

I will answer my own question, inspired by this answer: Compute mean squared, absolute deviation and custom similarity measure - Python/NumPy

I decided to go with the weighted MSD (mean square deviation), as I can expand the squared bracket and distribute the weight over the resulting three terms. Here are the steps:

1- Expanding the squared bracket:

$$\mathrm{WMSD}(x,y)=\frac{1}{m\,n}\sum_{i=1}^{m}\sum_{j=1}^{n} w(i,j)\,\Bigl(F(x+i,\,y+j)^{2}-2\,F(x+i,\,y+j)\,T(i,j)+T(i,j)^{2}\Bigr)$$

2- Distributing the window kernel over the expanded bracket:

$$\mathrm{WMSD}(x,y)=\frac{1}{m\,n}\sum_{i=1}^{m}\sum_{j=1}^{n}\Bigl(w(i,j)\,F(x+i,\,y+j)^{2}-2\,w(i,j)\,T(i,j)\,F(x+i,\,y+j)+w(i,j)\,T(i,j)^{2}\Bigr)$$

3- Distributing the two sum operators over each term, we end up with three terms:

3.a- Convolution between the squared image (F^2) and the window (W)

3.b- -2 * convolution between the image (F) and the element-wise product of window and template (W * T)

3.c- Summation of the element-wise product of the squared template (T^2) and the window (W), which is a single scalar

and multiplying by 1/(m*n) at the end.

Here is how to do it in Python with OpenCV:

import cv2
import numpy as np

def wmsd(img, tmp, W):
    # input: img= image (grayscale frame)
    # input: tmp= template
    # input: W= weighting window (same size as template)
    # return: msd_map= weighted mean square deviation map

    th, tw = tmp.shape

    # work in float64: cv2.filter2D does not support 64-bit integer images,
    # and an 8-bit output depth would saturate the large intermediate values
    img = np.float64(img)
    tmp = np.float64(tmp)
    W = np.float64(W)

    img_sq = np.square(img)
    tmp_sq = np.square(tmp)

    # term 1: convolution of F^2 with W (filter2D correlates, hence the flipped kernel)
    p1 = cv2.filter2D(img_sq, -1, cv2.flip(W, -1))

    # term 2: -2 * convolution of F with the element-wise product W*T
    WT = W * tmp
    p2 = -2 * cv2.filter2D(img, -1, cv2.flip(WT, -1))

    # term 3: scalar sum of the element-wise product T^2 * W
    p3 = np.sum(tmp_sq * W)

    msd_map = (p1 + p2 + p3) / (th * tw)
    return msd_map

Seeing it this way makes it easy to utilize OpenCV's power to perform this operation quickly, at a good frame rate.
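
For example, here is a minimal usage sketch (the file names and the Hann-window weighting are just placeholders for illustration). Since cv2.filter2D anchors the kernel at its centre, the minimum of the returned map marks roughly where the template centre lands in the frame:

import cv2
import numpy as np

frame = cv2.imread('frame.png', cv2.IMREAD_GRAYSCALE)        # placeholder file names
template = cv2.imread('template.png', cv2.IMREAD_GRAYSCALE)

# example weight window: a 2D Hann window emphasising the template centre
W = np.outer(np.hanning(template.shape[0]), np.hanning(template.shape[1]))

msd_map = wmsd(frame, template, W)
y, x = np.unravel_index(np.argmin(msd_map), msd_map.shape)
print('best match (approximately the template centre):', (x, y))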

Mohammed B.