
The OpenCV documentation for matchTemplate says:

In case of a color image, template summation in the numerator and each sum in the denominator is done over all of the channels and separate mean values are used for each channel. That is, the function can take a color template and a color image. The result will still be a single-channel image, which is easier to analyze.

I can't figure out what it means. For a color template and a color image, is the single-channel image (the result) the average of the results of all the channels?

templmatch.cpp source code: github

kchomski
sunyy5

1 Answer


Looking at the source code of the convolve_32F function used in matchTemplate, it seems that template matching on color images first reinterprets both the color image and the color template as gray images with three times as many columns, then convolves the image with the template as if both were grayscale.

To illustrate how the conversion to a gray image is done, consider the following 2x2 image with 4 color pixels (written with BGR values):

(1, 2, 3) (4, 5, 6)
(7, 8, 9) (10,11,12)

It becomes the following 2x6 gray image:

(1)  (2)  (3)  (4)  (5)  (6)
(7)  (8)  (9)  (10) (11) (12)
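
The same reinterpretation can be reproduced with a reshape. This is a minimal NumPy sketch, not OpenCV's own code; the array names `color` and `gray` are only illustrative, and it assumes the usual interleaved (B, G, R, B, G, R, ...) row layout.

```python
import numpy as np

# The 2x2 BGR image from the example, stored as rows x cols x channels.
color = np.array([[[1, 2, 3], [4, 5, 6]],
                  [[7, 8, 9], [10, 11, 12]]], dtype=np.float32)

# With interleaved channels, each row is already B,G,R,B,G,R,... in memory,
# so a reshape reinterprets it as a single-channel image with three times
# as many columns. No pixel values are changed or combined.
gray = color.reshape(color.shape[0], color.shape[1] * color.shape[2])

print(gray.shape)  # (2, 6)
print(gray)
```

No grayscale conversion formula is involved here: the three channel values of each pixel simply become three neighboring "gray" pixels.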

The convolution is then performed exactly as for gray images, and the final result is extracted by keeping one value out of every three along each row of the result image (equivalent to extracting the first channel of a color image).
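
The flattening trick described above can be checked with a small NumPy sketch. It is not OpenCV's implementation: `cross_correlate_valid` is a hypothetical helper written for this illustration, it computes a plain unnormalized cross-correlation, and it only covers positions where the template fits entirely inside the image. Under those assumptions, correlating the flattened arrays and keeping every third column gives the same numbers as correlating each channel separately and adding the three results, which suggests the single-channel output is a sum over channels rather than an average.

```python
import numpy as np

def cross_correlate_valid(image, templ, col_step=1):
    """Single-channel cross-correlation over fully overlapping positions.

    Hypothetical helper for illustration; col_step selects every
    col_step-th horizontal offset.
    """
    th, tw = templ.shape
    rows = image.shape[0] - th + 1
    cols = range(0, image.shape[1] - tw + 1, col_step)
    out = np.empty((rows, len(cols)), dtype=np.float64)
    for i in range(rows):
        for j, x in enumerate(cols):
            out[i, j] = np.sum(image[i:i + th, x:x + tw] * templ)
    return out

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(5, 6, 3)).astype(np.float64)
templ = rng.integers(0, 256, size=(2, 3, 3)).astype(np.float64)

# Reference: correlate each channel separately and add the three results.
per_channel_sum = sum(
    cross_correlate_valid(image[:, :, c], templ[:, :, c]) for c in range(3)
)

# Flattened version: treat both as gray images with 3x the columns, then
# evaluate only offsets that start on a pixel boundary (every third column).
flat_image = image.reshape(image.shape[0], -1)
flat_templ = templ.reshape(templ.shape[0], -1)
flat_result = cross_correlate_valid(flat_image, flat_templ, col_step=3)

print(np.allclose(per_channel_sum, flat_result))  # True
```

Offsets that are not multiples of three would pair, say, a blue template value with a green image value, which is why they are skipped.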

Sunreef
  • Thank you! As there are three values, how can I know which value it takes in the result image? Why is it equivalent to extracting the first channel of a color image and not one of the other channels? – sunyy5 Jul 18 '18 at 02:35
  • For a convolution on a color image with a color template, there are not three values as you say. Take a look at the first image on [this page](http://machinethink.net/blog/googles-mobile-net-architecture-on-iphone/) to get an idea of what it looks like. With OpenCV's trick of transforming the color image into a gray one, it becomes necessary to take only one value out of three because the other two are not aligned with real pixels from the original color image. – Sunreef Jul 18 '18 at 06:37
  • Thank you! I still can't figure it out. Can you show me the detailed process of getting the result for an image using CV_TM_SQDIFF as the template matching method? Suppose that the image is the one you have given above. – sunyy5 Jul 18 '18 at 09:55
  • What does "three" refer to in "take one value out of three"? – sunyy5 Jul 19 '18 at 02:16
  • @sunyy5 The number 3 ?... I'm not sure I get your question. – Sunreef Jul 19 '18 at 07:31
  • Maybe I didn’t express it clearly. For the 2x6 gray image, we get a 2x6 result image from the template matching method. As the original color image is 2x2, how do we get the 2x2 result image from the 2x6 gray image? – sunyy5 Jul 19 '18 at 12:47
  • You take the values [0,0], [0,3], [1,0] and [1,3]. That is what I meant by taking one out of three: you take one value, skip two, take one, and so on. @sunyy5 – Sunreef Jul 19 '18 at 12:49
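
To make the "take one, skip two" selection from the comments concrete, here is a worked squared-difference example on the 2x2 image from the answer. It is a NumPy sketch that mimics the squared-difference measure, not a call into OpenCV, and the 1x1 color template `(4, 5, 6)` is a hypothetical choice made for illustration. The sketch only evaluates offsets where the template fits completely, so the intermediate result is 2x4 rather than 2x6, but the columns that are kept are still 0 and 3.

```python
import numpy as np

image = np.array([[[1, 2, 3], [4, 5, 6]],
                  [[7, 8, 9], [10, 11, 12]]], dtype=np.float64)
# Hypothetical 1x1 color template, equal to the pixel at row 0, column 1.
templ = np.array([[[4, 5, 6]]], dtype=np.float64)

flat_image = image.reshape(2, 6)   # 2x6 "gray" image
flat_templ = templ.reshape(1, 3)   # 1x3 "gray" template

# Sum of squared differences at every column offset of the flattened image.
offsets = flat_image.shape[1] - flat_templ.shape[1] + 1  # 4 offsets: 0..3
sqdiff_all = np.array([
    [np.sum((flat_image[r, x:x + 3] - flat_templ[0]) ** 2) for x in range(offsets)]
    for r in range(flat_image.shape[0])
])
print(sqdiff_all)
# [[ 27.  12.   3.   0.]
#  [ 27.  48.  75. 108.]]

# Offsets 1 and 2 straddle two pixels, so only every third column is kept.
sqdiff = sqdiff_all[:, ::3]
print(sqdiff)
# [[ 27.   0.]
#  [ 27. 108.]]
```

The zero at position [0,1] of the final result marks the pixel that the template was copied from, and each kept value is the sum of the squared differences over the three channels of one pixel.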