4

Here is the image I need to detect: http://s13.postimg.org/wt8qxoco3/image.png

Here is the base64 representation: http://pastebin.com/raw.php?i=TZQUieWe

I'm asking for help because this is a complex problem that I'm not equipped to solve. It would probably take me a week to do it by myself.

Some pseudo-code that I thought about:

1) Take screenshot of the app and store it as image object.

2) Convert base64 representation of my image to image object.

3) Use some sort of algorithm/function to compare both image objects.

By on screen, I mean in an app. I have the app's window name and the PID.

To be 100% clear, I need to essentially detect if image1 is inside image2. image1 is the image I gave in the OP. image2 is a screenshot of a window.

user1251385
  • On screen, where? In a browser, on your desktop, in an app? – sberry Mar 29 '13 at 18:41
  • @sberry: Actually, ignore what I said earlier. In an app. I have the window name of the app and the PID. – user1251385 Mar 29 '13 at 18:45
  • SURF detector is perfect for your needs. Here is an [example](http://stackoverflow.com/a/10987035/723891) – Igonato Mar 29 '13 at 18:57
  • You need to detect an exact pixel-by-pixel copy of the image, right? If not, this is a whole lot harder. – abarnert Mar 29 '13 at 19:03
  • @abarnert: No unfortunately. I need to essentially detect if image1 is inside image2. image1 is the image I gave in the OP. image2 is a screenshot of a window. – user1251385 Mar 29 '13 at 19:09
  • @Igonato: It doesn't seem like there is an actual algorithm for comparing the two. It only seems to color the number of points that match between the two. – user1251385 Mar 29 '13 at 19:10
  • Yes, but by "is inside", you mean "there's an exact pixel-for-pixel copy of image1 somewhere in image2", right? – abarnert Mar 29 '13 at 19:11

3 Answers

3

If you break this down into pieces, they're all pretty simple.

First, you need a screenshot of the app's window as a 2D array of pixels. There are a variety of different ways to do this in a platform-specific way, but you didn't mention what platform you're on, so… let's just grab the whole screen, using PIL:

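from PIL import ImageGrab

screenshot = ImageGrab.grab()  # capture the whole screen
haystack = screenshot.load()   # pixel-access object, indexed as haystack[x, y]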

Now, you need to convert your base64 into an image. Taking a quick look at it, it's clearly just an encoded PNG file. So:

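import cStringIO
from PIL import Image

# data holds the base64 text read from wherever (Python 2 code)
decoded = data.decode('base64')
f = cStringIO.StringIO(decoded)
image = Image.open(f)
needle = image.load()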
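If you want to confirm that the decoded data really is a PNG before handing it to PIL, here is a minimal sketch. It assumes Python 3 (the snippet above is Python 2), and `decode_png` is a hypothetical helper name, not part of PIL:

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file


def decode_png(data):
    """Decode base64 text and check that the result starts with the PNG signature."""
    raw = base64.b64decode(data)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG file")
    return raw
```

In Python 3 the returned bytes would then go through `io.BytesIO` rather than `cStringIO` before being passed to `Image.open`.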

Now you've got a 2D array of pixels, and you want to see if it exists in another 2D array. There are faster ways to do this—using numpy is probably best—but there's also a dumb brute-force way, which is a lot simpler to understand: just iterate the rows of haystack; for each one, iterate the columns, and see if you find a run of bytes that matches the first row of needle. If so, keep going through the rest of the rows until you either finish all of needle, in which case you return True, or find a mismatch, in which case you continue and just start again on the next row.
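The brute-force search could look roughly like this. It is only a sketch: it assumes both images have already been copied into plain lists of rows (each row a list of pixel values), that `needle` is not empty, and `find_subimage` is a made-up name. After a mismatch it moves on to the next column position rather than skipping the whole row.

```python
def find_subimage(haystack, needle):
    """Return (x, y) of the first exact match of needle in haystack, or None.

    Both arguments are lists of rows; each row is a list of pixel values.
    """
    needle_h, needle_w = len(needle), len(needle[0])
    hay_h, hay_w = len(haystack), len(haystack[0])
    for y in range(hay_h - needle_h + 1):
        for x in range(hay_w - needle_w + 1):
            # Compare row by row; all() stops at the first mismatching row.
            if all(haystack[y + dy][x:x + needle_w] == needle[dy]
                   for dy in range(needle_h)):
                return (x, y)
    return None
```

This only finds exact pixel-for-pixel copies, and it is slow on a full-screen capture, which is why numpy is worth a look if speed matters.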

abarnert
  • Wow. Thanks! Awesome needle/haystack analogy. I should've thought of that myself. I asked the question pretty poorly. – user1251385 Mar 29 '13 at 19:20
  • Hmm... I'm getting this error: NameError: name 'data' is not defined I'm using an import, aren't I? – user1251385 Mar 29 '13 at 20:04
  • Oh, `data` is the base64 data you read from wherever. If it's a local file, just do `with open('data.b64') as f: data = f.read()`. If it's on the web, use `urllib2` or `requests`. And so on. – abarnert Mar 29 '13 at 20:06
  • BTW, I didn't come up with the "needle/haystack" thing. I'm not sure who did, but the documentation for a pretty wide variety of languages (Smalltalk, PHP, Haskell, …) uses them as the parameter names for find/search functions. – abarnert Mar 29 '13 at 20:08
  • Meanwhile, did you figure out everything you _do_ need to import? It should just be `import cStringIO`, `from PIL import Image`, and `from PIL import ImageGrab`… but PIL's package/module structure is a bit funky, so if that doesn't work, let me know. – abarnert Mar 29 '13 at 20:11
  • So I'm using this code: http://pastebin.com/0q7C9XZM but I'm getting this error TypeError: 'PixelAccess' object is not iterable for line 3. By the way, the total number of pixels is 980 for the image that I'm using. – user1251385 Mar 29 '13 at 20:12
  • Right, you can't directly iterate a PIL pixel access array, and you also can't access the rows as real objects; you have to do `for rowidx in range(screenshot.size[1]): for colidx in range(screenshot.size[0]): pix = haystack[colidx, rowidx]`, since PIL indexes pixels as `[x, y]`. (Or something like that.) It may be easier to just copy them both into some better data structure. If you can use `numpy`/`scipy`, a 2D `ndarray` is probably the best data structure, and someone's probably already written the code for you, and it'll be an order of magnitude faster too. – abarnert Mar 29 '13 at 20:19
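One way to do the "copy into a better data structure" step without numpy is sketched below. `to_rows` is a hypothetical helper; it assumes only that the object returned by `Image.load()` supports `pixels[x, y]` indexing and that `size` is `(width, height)`, as in `Image.size`.

```python
def to_rows(pixels, size):
    """Copy an (x, y)-indexed pixel accessor into a list of rows.

    pixels: any object supporting pixels[x, y], e.g. the result of Image.load()
    size:   (width, height), e.g. Image.size
    """
    width, height = size
    # Outer loop walks rows (y), inner loop walks columns (x).
    return [[pixels[x, y] for x in range(width)] for y in range(height)]
```

Something like `to_rows(haystack, screenshot.size)` would then give a list of rows that can be iterated and sliced normally.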
1

this is probably the best place to start:

http://effbot.org/imagingbook/image.htm

if you don't have access to the image's metadata, file name, type, etc, what you're trying to do is very difficult, but your pseudocode sounds on-point. essentially, you'll have to create an algorithmic model based on a photo's shapes, lines, size, colors, etc. then you'd have to match that model against models already made and indexed in some database. hope that helps.

Jonathan Root
terra823
  • Actually, if you decode his base64, it's just an entire PNG file, so he has access to everything he needs to convert it into an array of pixels with PIL. So, this works. – abarnert Mar 29 '13 at 19:05
  • Why would it be easier if I had access to the image's meta data? To be 100% clear, I need to essentially detect if image1 is inside image2. image1 is the image I gave in the OP. image2 is a screenshot of a window. image2 won't have any metadata because it's a screenshot only. – user1251385 Mar 29 '13 at 19:12
  • @user1251385: I think he's saying you need the metadata to figure out how to map the bits to pixels. (In other words, you need the PNG header, which you have.) – abarnert Mar 29 '13 at 19:59
  • abarnert, exactly. for some reason i was under the impression this was for random images. it sounds like you might not need image detection to do what you're talking about doing. for instance, you could set up a function that registers if a screenshot has been taken. again, i'm not sure what you're looking to do. – terra823 Mar 29 '13 at 20:22
0

It looks like Pillow (https://python-pillow.org/) is a more up-to-date version of PIL.

Sako73
  • It's better to use comments for answers like these. If you want to add them as an answer, then please include the relevant details that are needed to completely answer the questions. Thanks and have a great day. – Bhargav Rao Nov 09 '16 at 19:38