How to detect an object in an image rather than screen with pyautogui?

Question

I am using the pyautogui.locateOnScreen() function to locate elements in Chrome, get their x, y coordinates, and click them. At some point, though, I need to take a screenshot of part of the screen and search that screenshot for the object I want, then get its coordinates. Is it possible to do this with pyautogui? My example code:

coord_one = pyautogui.locateOnScreen("first_image.png", confidence=0.95)
scshoot = pyautogui.screenshot(region=coord_one)
coord_two = None  # TODO: search for the second image in scshoot and, if it is detected, get its coordinates

If it is not possible with pyautogui, can you advise the easiest and smartest way? Thanks in advance.

This is not a direct duplicate, since the way you are asking assumes that pyautogui has a built-in method for this, but it was already answered [here](https://stackoverflow.com/questions/876142/finding-cropped-similar-images). — ferreiradev, May 21 '22 at 02:15
Different phrasing, but your question may also have been answered [here](https://stackoverflow.com/questions/56222990/template-matching-from-screen-capture). — ferreiradev, May 21 '22 at 02:17
Neither of your examples has anything to do with the pyautogui module. I am teaching the pyautogui module to my students, so I was checking whether there is a solution with pyautogui. I knew I could do this with OpenCV. I am grateful for your long reply, but I don't think you understood the question well, so you downvoted. (I did my research: there are tons of pages about OpenCV, but none about searching for an image inside another image with pyautogui.) @MFerreira — Hasan Onur ATAÇ, May 21 '22 at 10:14

ferreiradev · Answer 1 · 2022-05-21T02:19:24.643

I don't believe there is a direct built-in way to do what you need, but OpenCV (the opencv-python package) does the job.

The following code sample assumes you have a screen capture you just took, "res/original.png", and you want to find "res/crop.png" in that capture, which you know is a subsection of it.

Minimal example

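"""Get bounding box of cropped image from original image."""

import cv2 as cv


img_rgb = cv.imread(r'res/original.png')
# the cropped image, expected to be smaller
target_img = cv.imread(r'res/crop.png')
# cv.imread returns None instead of raising when a file cannot be read
if img_rgb is None or target_img is None:
    raise FileNotFoundError('could not read res/original.png or res/crop.png')

# shape is (height, width, channels)
h, w = target_img.shape[:2]
res = cv.matchTemplate(img_rgb, target_img, cv.TM_CCOEFF_NORMED)

# with the method used, the best match is the maximum of res,
# and its location is the top-left pixel of the match
min_val, max_val, min_loc, max_loc = cv.minMaxLoc(res)
top_left = max_loc

# if we add to it the width and height of the target, then we get the bbox.
bottom_right = (top_left[0] + w, top_left[1] + h)

# draw the bbox (blue, since cv.imread loads BGR) and keep the window open until a key is pressed
cv.rectangle(img_rgb, top_left, bottom_right, (255, 0, 0), 2)
cv.imshow('match', img_rgb)
cv.waitKey(0)
cv.destroyAllWindows()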

MatchTemplate

From the docs, matchTemplate "simply slides the template image over the input image (as in 2D convolution) and compares the template and patch of input image under the template image." Under the hood, it offers several comparison methods, such as squared difference, that operate on the images represented as arrays.
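
The sliding comparison described above can be illustrated with a naive NumPy loop. This is only a sketch of the idea behind the squared-difference method, not how OpenCV implements it (OpenCV's version is heavily optimized); the function name `sqdiff_map` is hypothetical and the input is a synthetic grayscale array.

```python
import numpy as np


def sqdiff_map(image, template):
    """Naive sum-of-squared-differences map for 2D grayscale arrays.

    out[y, x] is the score of the template placed with its top-left
    corner at (x, y); lower means a better match.
    """
    image = image.astype(np.float64)
    template = template.astype(np.float64)
    ih, iw = image.shape
    th, tw = template.shape
    out = np.empty((ih - th + 1, iw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = image[y:y + th, x:x + tw]
            out[y, x] = np.sum((patch - template) ** 2)
    return out


# Synthetic grayscale image; the template is cropped at x=12, y=5.
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(30, 40), dtype=np.uint8)
template = image[5:13, 12:22]

scores = sqdiff_map(image, template)
# For squared difference the best match is the minimum, not the maximum.
best_y, best_x = np.unravel_index(np.argmin(scores), scores.shape)
print(best_x, best_y)
```

Note the contrast with the minimal example: with a squared-difference method the best match is the minimum of the result, whereas with TM_CCOEFF_NORMED it is the maximum.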

See more

For a more in-depth explanation, check the OpenCV docs, as the code is based entirely on their example.
