I have 2 BMP images. ImageA is a screenshot (example). ImageB is a subset of that, for example an icon.

I want to find the X,Y coordinates of ImageB within ImageA (if it exists).

Any idea how I would do that?

greyb3ast

1 Answer

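This is a form of optical recognition; the specific task of locating one image inside another is often called template matching. It may seem complicated (it is), but the implementation can be very simple, so don't shy away from it!

Note that I'm swapping the names used in the question: here, Image A is the image we're looking for (the icon), and Image B is the larger image (the screenshot) with Image A in it.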

Method 1

If Image A's scale in Image B hasn't been altered and the colors are all preserved, you can place Image B (and Image A) on an HTML5 canvas and iterate over the pixel data. Load the first row of pixels from Image A, then iterate over every pixel in Image B. When a pixel matches, store that pixel's column in a variable and check whether the next pixel matches too. If the whole first row matches, hop to the next row and compare that one. Repeat until you either get a full match or hit a pixel (or enough pixels) that doesn't match. In that case, reset all variables and start over, looking for a match to row 1 again.
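Method 1 can be sketched in JavaScript roughly as follows. This is a minimal brute-force sketch, not a tuned implementation. It assumes both images are already available as ImageData-like objects (`{ width, height, data }` with a flat RGBA array), which is the shape `getImageData()` returns from a canvas 2D context. The names `findExact` and `samePixel` are made up for illustration.

```javascript
// Both arguments are ImageData-like: { width, height, data }, where data is
// a flat RGBA array (4 bytes per pixel), as returned by getImageData().

// True when the pixel at (ax, ay) in a equals the pixel at (bx, by) in b.
function samePixel(a, ax, ay, b, bx, by) {
  const i = (ay * a.width + ax) * 4;
  const j = (by * b.width + bx) * 4;
  return a.data[i] === b.data[j] &&
         a.data[i + 1] === b.data[j + 1] &&
         a.data[i + 2] === b.data[j + 2] &&
         a.data[i + 3] === b.data[j + 3];
}

// Returns { x, y } of the top-left corner of the first exact occurrence of
// `needle` (Image A) inside `haystack` (Image B), or null if there is none.
function findExact(haystack, needle) {
  const maxX = haystack.width - needle.width;
  const maxY = haystack.height - needle.height;
  for (let y = 0; y <= maxY; y++) {
    for (let x = 0; x <= maxX; x++) {
      let match = true;
      // Compare row by row; give up on this position at the first mismatch.
      for (let ny = 0; ny < needle.height && match; ny++) {
        for (let nx = 0; nx < needle.width; nx++) {
          if (!samePixel(haystack, x + nx, y + ny, needle, nx, ny)) {
            match = false;
            break;
          }
        }
      }
      if (match) return { x, y };
    }
  }
  return null;
}
```

In a browser you would typically draw each image with `ctx.drawImage(img, 0, 0)` and then pass `ctx.getImageData(0, 0, canvas.width, canvas.height)` for each one. The returned `{ x, y }` is the position of Image A's top-left corner inside Image B.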

Method 2

If Image A isn't perfectly identical in Image B, things become a lot more complicated. If only the scale changes, we can make a few tweaks to Method 1 to get something that works. Rather than just comparing pixels and accepting a match when 80% or so agree, we additionally need to track the image's stretch/compression.
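The "80% or so" idea mentioned above could look something like the sketch below. It is only an illustration under a few assumptions: images are ImageData-like objects (`{ width, height, data }` with flat RGBA data), the function names `matchFraction` and `isFuzzyMatch` are hypothetical, and the per-channel `tolerance` parameter is my own addition for colors that are slightly off.

```javascript
// Fraction (0..1) of needle pixels that agree with the haystack when the
// needle's top-left corner is placed at (x, y). A pixel agrees when every
// RGBA channel differs by at most `tolerance`.
function matchFraction(haystack, needle, x, y, tolerance = 0) {
  const total = needle.width * needle.height;
  // A needle that is empty or does not fit at (x, y) cannot match.
  if (total === 0 || x < 0 || y < 0 ||
      x + needle.width > haystack.width ||
      y + needle.height > haystack.height) {
    return 0;
  }
  let matching = 0;
  for (let ny = 0; ny < needle.height; ny++) {
    for (let nx = 0; nx < needle.width; nx++) {
      const i = ((y + ny) * haystack.width + (x + nx)) * 4;
      const j = (ny * needle.width + nx) * 4;
      let close = true;
      for (let c = 0; c < 4; c++) {
        if (Math.abs(haystack.data[i + c] - needle.data[j + c]) > tolerance) {
          close = false;
          break;
        }
      }
      if (close) matching++;
    }
  }
  return matching / total;
}

// Accept a candidate position when at least `threshold` of the pixels agree.
function isFuzzyMatch(haystack, needle, x, y, threshold = 0.8, tolerance = 0) {
  return matchFraction(haystack, needle, x, y, tolerance) >= threshold;
}
```

The 0.8 default simply mirrors the 80% figure above; a sensible threshold depends on how noisy your images are.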

In each row, step over the pixels at a fixed increment. For example, we'll check every tenth pixel. Once we find a match for pixel 1 of Image A in Image B, we move 10 pixels further along in Image B and see if that pixel exists anywhere in Image A's row. If we find it, its distance from 0 in Image A divided by 10 (our increment) is how many times larger the original image is.

If we found that pixel 20 slots from 0 in Image A, and the two pixels were only 10 apart in Image B (remember, 10 is our increment), then our original image is 2 times larger. In other words, the copy inside Image B is half the size of the original.

1) compression = distance_in_image_a / distance_in_image_b
2) compression = 20 / 10
3) compression = 2
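As a rough illustration of that calculation, here is a small sketch. It assumes each row has already been reduced to an array with one number per pixel (the hypothetical `packRow` helper does this for ImageData-like objects), and that `startB` is the index in Image B's row where pixel 0 of Image A's row was found. `estimateCompression` is a made-up name, not a library function.

```javascript
// Packs row `y` of an ImageData-like object into one number per pixel
// (RGBA packed into an unsigned 32-bit value) so pixels compare with ===.
function packRow(image, y) {
  const row = new Array(image.width);
  for (let x = 0; x < image.width; x++) {
    const i = (y * image.width + x) * 4;
    const d = image.data;
    row[x] = ((d[i] << 24) | (d[i + 1] << 16) | (d[i + 2] << 8) | d[i + 3]) >>> 0;
  }
  return row;
}

// rowA: a row of the original Image A. rowB: a row of Image B.
// startB: index in rowB where pixel 0 of rowA was matched.
// step: the increment used in Image B (10 in the worked example).
// Returns how many times larger Image A is than its copy in Image B,
// or null when no estimate can be made.
function estimateCompression(rowA, rowB, startB, step = 10) {
  // The first pixel has to match, otherwise startB is not a candidate.
  if (rowA.length === 0 || rowB[startB] !== rowA[0]) return null;
  const probeIndex = startB + step;
  if (probeIndex >= rowB.length) return null;
  // Take the pixel `step` slots along in Image B...
  const probe = rowB[probeIndex];
  // ...and look for it in Image A's row, past pixel 0.
  const distanceInA = rowA.indexOf(probe, 1);
  if (distanceInA === -1) return null;
  return distanceInA / step;
}
```

With the numbers from the example (probe found 20 slots along in Image A, increment of 10), this returns 2. Note that a single probe is easily fooled by repeated colors, so in practice you would probably want to check several probes and only trust the estimate when they agree.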

This is a much more complex but more robust way to detect a match. Enough matching rows means you've got a matching image, but what about vertical stretching?

Similar logic, applied to columns. If you find a row that matches, start at row 0 of the match and go down by 10, then find that pixel's match in Image A.

Edit

The methods I provided are generic ways to look for any image inside any other image. As you can imagine, this is performance intensive. I don't know what image you're trying to detect, but if it contains common shapes, you can sometimes use alternative algorithms. If you have a circle, for example, you can just check that there are pixels that match outside a radius and pixels that are the same within it.

The methods I presented also don't compensate for warping. Method 2 should be fine if the image is stretched but keeps a rectangular shape. If the image has, for example, been warped into a circle shape, things get far more complicated. For that case, the only hint I can give is to check pixels within a radius of the original position for matches.

mcfish
  • If you'd like some JS libraries that do this, I can fetch a few for you too, but searching for "optical recognition" should get you more than enough information! Good luck. – mcfish Oct 06 '17 at 21:36
  • I was looking for libraries on npm, but the only one I found is OpenCV, and there aren't many tutorials on this topic :( – greyb3ast Oct 07 '17 at 13:11
  • I'm working on a project with optical recognition and will likely release the functionality as a library once I put together something solid. At present, however, this is a fairly non-trivial task and most people have to reinvent the wheel every time. – mcfish Oct 09 '17 at 02:21