I'm trying to follow the approach described in "Finding a subimage inside a Numpy image" so that I can search for an image inside a screenshot.

The code looks like this:

import cv2
import numpy as np
import gtk.gdk
from PIL import Image

def make_screenshot():
    w = gtk.gdk.get_default_root_window()
    sz = w.get_size()
    pb = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB, False, 8, sz[0], sz[1])
    pb = pb.get_from_drawable(w, w.get_colormap(), 0, 0, 0, 0, sz[0], sz[1])
    width, height = pb.get_width(), pb.get_height()
    return Image.fromstring("RGB", (width, height), pb.get_pixels())

if __name__ == "__main__":
    img = make_screenshot()
    cv_im = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
    template = cv_im[30:40, 30:40, :]
    result = cv2.matchTemplate(cv_im, template, cv2.TM_CCORR_NORMED)
    print np.unravel_index(result.argmax(), result.shape)

Depending on the method I select (in place of cv2.TM_CCORR_NORMED), I get completely different coordinates, but none of them is (30, 30), which is where the template was cut from.

What is wrong with this approach?

Enchantner

1 Answer

Short answer: you need to use the following line to locate the corner of the best match:

minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(result)

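The variable maxLoc will hold a tuple containing the (x, y) coordinates of the upper-left corner of the best match. Note that this is (column, row) order, the reverse of numpy's (row, column) indexing. For cv2.TM_SQDIFF and cv2.TM_SQDIFF_NORMED, lower scores are better, so the best match is at minLoc instead.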
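To make the axis order concrete, here is a small numpy-only sketch. The score map is a hypothetical stand-in for the output of matchTemplate, not real output, and the variable names are made up for illustration.

```python
import numpy as np

# Hypothetical 4x6 score map standing in for the output of matchTemplate.
scores = np.zeros((4, 6), dtype=np.float32)
scores[1, 4] = 1.0  # best score sits at row 1, column 4

# numpy reports the position as (row, column), i.e. (y, x).
row, col = np.unravel_index(scores.argmax(), scores.shape)

# An OpenCV-style location is (x, y), so the two axes are swapped.
max_loc = (int(col), int(row))
print((int(row), int(col)), max_loc)
```

So when comparing a numpy index against a location from OpenCV, the two values need to be swapped first.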

Long answer:

cv2.matchTemplate() returns a single-channel image in which the value at each position indicates how well the template matched the input image when its upper-left corner was placed at that position. Try visualizing result by inserting the following lines of code after your call to matchTemplate, and you will see why it is hard to read the match location off the raw array.

cv2.imshow("Debugging Window", result)
cv2.waitKey(0)
cv2.destroyAllWindows()

minMaxLoc() turns the result returned by matchTemplate into the information you want. The other return values tell you where the template matched worst and what values result held at the best and worst matches, if you need them.
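To illustrate what such a score map contains, here is a rough numpy-only sketch that scores every placement of a template by its sum of squared differences. This is not OpenCV's implementation. The helper sqdiff_map and the random test image are assumptions made up for this example, and the score used here is one where lower is better.

```python
import numpy as np

def sqdiff_map(image, template):
    """Sum of squared differences for every placement of template in image."""
    ih, iw = image.shape[:2]
    th, tw = template.shape[:2]
    # One score per valid upper-left corner position.
    out = np.empty((ih - th + 1, iw - tw + 1), dtype=np.float64)
    tmpl = template.astype(np.float64)
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            window = image[y:y + th, x:x + tw].astype(np.float64)
            out[y, x] = np.sum((window - tmpl) ** 2)
    return out

# Random image (assumed test data) with a template cut out of it.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(60, 80, 3), dtype=np.uint8)
template = image[30:40, 20:30]  # rows 30..39, columns 20..29

scores = sqdiff_map(image, template)

# Lower is better for this score, so take the minimum.
row, col = np.unravel_index(scores.argmin(), scores.shape)
best_xy = (int(col), int(row))  # (x, y) order, as OpenCV reports locations
print(scores.shape, best_xy)
```

The score map is smaller than the image by the template size minus one in each dimension, and the best position comes back as x = 20, y = 30, matching the columns and rows the template was cut from.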

This code worked for me on an example image that I read from a file. If your code continues to misbehave, you probably aren't reading in your images the way you intend. The imshow snippet above is useful for debugging with OpenCV: replace the argument result with the name of any image object (numpy array) to confirm visually that you are getting the image you expect.