I'm currently working on a side project, but I'm stuck on one big part.

The goal is for the user to take a screenshot in a different, popular app, where the screenshot contains 6 images/icons. When the user opens my app, they can upload that screenshot, and I detect the 6 images and place them into a collection view.

The issue is identifying which 6 images appear in the screenshot. I thought about using an OCR engine like Tesseract, but I'm not sure that would work because there's zero text in the screenshot, only the 6 images. Something that might help is that the other app has only 50 kinds of images. Would creating some sort of database of images help? If so, how would I compare them?

I apologise if this doesn't make sense; I just don't know how to word it. Any help would be great.

Gum
  • So you want to compare an image against a set of images and thus give it a label? – Slim Shady May 05 '17 at 19:06
  • Yes, I want to detect the 6 images from the screenshot and compare them against a set of images. Just not sure how to compare them against each other. – Gum May 05 '17 at 19:08
  • I think the goal is to use something similar to face recognition to try and locate the 6 images in the screen shot. That sounds like a sophisticated computer vision type problem. – Scott Thompson May 05 '17 at 19:08
  • You might want to look into perceptual hashing if it's something very simple, like matching similar images; otherwise, computer vision is the right way to go. – Slim Shady May 05 '17 at 19:13
  • My first thought on solving this problem was to use computer vision. A library like OpenCV might be a good choice. Take a look at http://stackoverflow.com/questions/4196453/simple-and-fast-method-to-compare-images-for-similarity – Shackleford May 05 '17 at 19:36
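
The perceptual hashing idea from the comments can be sketched in plain Python. This is only a rough illustration of one simple variant (an "average hash"), not a definitive implementation: it assumes each icon has already been cropped and converted to rows of 0-255 grayscale values, and the names `average_hash`, `hamming_distance`, and `closest_label` are hypothetical helpers invented for this example. The "database" of 50 known images would then just be a dictionary from label to precomputed hash.

```python
from typing import Dict, List, Sequence, Tuple

Gray = Sequence[Sequence[int]]  # rows of 0-255 grayscale values (assumed input format)


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Shrink the image to size x size by block averaging, then set one bit
    per cell depending on whether the cell is brighter than the overall mean."""
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image must be at least size x size pixels")
    cells: List[float] = []
    for gy in range(size):
        y0, y1 = gy * height // size, (gy + 1) * height // size
        for gx in range(size):
            x0, x1 = gx * width // size, (gx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def closest_label(icon_hash: int, reference_hashes: Dict[str, int]) -> Tuple[str, int]:
    """Return the label of the nearest reference hash and its distance."""
    label = min(reference_hashes, key=lambda k: hamming_distance(icon_hash, reference_hashes[k]))
    return label, hamming_distance(icon_hash, reference_hashes[label])
```

In practice you would probably also reject a match whose distance is above some threshold, so that an unknown icon is not forced onto the nearest label; the right threshold is something to tune on real screenshots.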

2 Answers

Assuming you want to be able to do this across multiple types of devices, a computer vision library like OpenCV might be the way to go.

If your users always run the app on the same device (always on an iPhone 5, say) then the icons might always land in exactly the same spot, and you could simply slice the screenshot up, extract the component images, and do a byte-wise compare on the sub-images. However, you've got iPhone 4, iPhone 5, iPhone 6, and 6+ screen sizes, plus iPad, iPad Retina, and iPad Pro (small and large) to deal with, and possibly portrait and landscape orientations. Presumably the 6 images will land at different spots on the screens of all those different devices, and you'll have different image resolutions to deal with as well. With OpenCV you should be able to find the bounding rects for the images by "looking at" the screenshots rather than building a complex set of rules.
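
For the single-device case, the slice-and-compare idea might look something like the sketch below. It makes a strong simplifying assumption that is not from the question: the 6 icons are taken to fill an even 2 x 3 grid with no margins, whereas a real screenshot would need measured rects per device. Images are represented as plain lists of pixel rows, and `slice_grid` and `identify` are hypothetical names used only for illustration.

```python
from typing import Dict, List, Optional

Image = List[List[int]]  # rows of pixel values (assumed representation)


def slice_grid(screenshot: Image, rows: int, cols: int) -> List[Image]:
    """Cut the screenshot into rows x cols equally sized tiles, in reading order.
    Assumes the icons tile the whole image evenly, which is a simplification."""
    height, width = len(screenshot), len(screenshot[0])
    tile_h, tile_w = height // rows, width // cols
    tiles: List[Image] = []
    for r in range(rows):
        for c in range(cols):
            band = screenshot[r * tile_h:(r + 1) * tile_h]
            tiles.append([row[c * tile_w:(c + 1) * tile_w] for row in band])
    return tiles


def identify(tile: Image, references: Dict[str, Image]) -> Optional[str]:
    """Exact pixel-for-pixel comparison against each known image.
    Returns the matching label, or None if nothing matches exactly."""
    for label, reference in references.items():
        if tile == reference:
            return label
    return None
```

Exact comparison only works when the cropped tile and the reference are pixel-identical, which is why this approach breaks down as soon as resolution or position varies between devices.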

Duncan C

Take a look at the OpenCV example code for matching SIFT features (the Python version here, but you can find examples in other languages as well). It demonstrates a simpler version of what you want to do.

Totoro