
I have a large collection of card images, and one photo of a particular card. What tools can I use to find which image in the collection is most similar to mine?

Here's a collection sample:

Here's what I'm trying to find:

Kuroki Kaze
  • Mmmm... in what way is your one image different - is it brighter/darker, rotated/distorted/shifted, is it a different size, is it a different format (JPEG/PNG), or has a single smallish element moved within the image but the rest is pixel for pixel identical, or.... ? – Mark Setchell Aug 08 '14 at 09:03
  • Let's say it's printed out and photographed by fixed camera from above on a white backdrop. It's usually brighter, may be a little bit distorted/rotated. – Kuroki Kaze Aug 08 '14 at 11:10
  • It's hard to advise on the information you have provided. Can you post maybe 2-3 images from the big collection and the odd one that you are trying to match to your collection? – Mark Setchell Aug 08 '14 at 12:17
  • @MarkSetchell I've updated question with image samples – Kuroki Kaze Aug 08 '14 at 13:05
  • Do you want to find similar images or just recognize a specific card? If it's the latter, then character recognition could be used to read the card names instead of comparing the images. You could then create a database of your collection and compare with that. – Ghaul Aug 13 '14 at 11:41
  • I want to recognize the card. I have a feeling I'm more likely to find a match by comparing whole frames. But I can try. – Kuroki Kaze Aug 13 '14 at 13:23
  • It is important to note that an approach that works with just 3 samples will not necessarily work with more cards. There is bias in the samples. For instance, I could develop an algorithm that finds the card with the gray background. The other two cards have green backgrounds. It would probably work. Image comparison algorithms will probably solve this problem perfectly in this scenario, where your desired sample is so different from the others. Try to put more similar cards in the samples. I suggest you put more cards with gray backgrounds and the same symbols. – Gabriel Archanjo Aug 18 '14 at 11:43
  • I have more images and many cards to photograph :) – Kuroki Kaze Aug 18 '14 at 15:53
  • For anyone interested, I think you can use images available [here](http://wafry.com/MAGIC/10th.htm) as reference images. But more test images would certainly help to evaluate a method. With reference images from the above link and the one test image available, I used the euclidean distance to find the best match, as I've outlined in the EDIT section of my answer, and it gave me good results for this particular test image. – dhanushka Aug 19 '14 at 01:27
  • Crazy idea. What about training a neural network to recognise cards for you? I have no idea how, but the cool factor alone outweighs any meaningless concerns like "feasibility" or "timeliness". – Todd Bowles Aug 20 '14 at 08:42
  • "I have no idea how to" pretty well describes my stance on neural networks in this task. – Kuroki Kaze Aug 20 '14 at 08:43

5 Answers


New method!

It seems that the following ImageMagick command, or maybe a variation of it depending on a greater selection of your images, will extract the wording at the top of your cards:

convert aggressiveurge.jpg -crop 80%x10%+10%+10% crop.png

which takes a strip 80% of the width and 10% of the height of your image (starting 10% in from the top-left corner) and stores it in crop.png as follows:


And if you run that through Tesseract OCR as follows:

tesseract crop.png agg

you get a file called agg.txt containing:

E‘ Aggressive Urge \L® E

which you can run through grep to clean up, looking only for upper and lower case letters adjacent to each other:

grep -Eo "\<[A-Za-z]+\>" agg.txt

to get

Aggressive Urge

:-)
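
Taking this a step further, here is a minimal sketch of how the crop-and-OCR step could be looped over the whole collection to look a photographed card up by its title. It assumes ImageMagick and tesseract are installed, that the collection images are JPEGs in the current directory, and that the photo (photo.png, a name chosen purely for illustration) has been cropped and deskewed to roughly the same framing as the scans; because OCR output is noisy, it scores matches by counting shared title words rather than expecting an exact match:

#!/bin/bash
# Sketch: index the collection by OCR'd title text, then score a photographed
# card against that index by counting shared title words.

# 1) Build the index: crop the title strip from each collection card, OCR it,
#    keep only the words, and store "filename:words" lines in titles.txt
> titles.txt
for f in *.jpg; do
   convert "$f" -crop 80%x10%+10%+10% title.png
   tesseract title.png title > /dev/null 2>&1
   words=$(grep -Eo "\<[A-Za-z]+\>" title.txt | tr '\n' ' ')
   echo "$f:$words" >> titles.txt
done

# 2) OCR the photographed card (photo.png) the same way
convert photo.png -crop 80%x10%+10%+10% query.png
tesseract query.png query > /dev/null 2>&1
queryWords=$(grep -Eo "\<[A-Za-z]+\>" query.txt | tr '\n' ' ')

# 3) Score each collection entry by how many of its title words the photo shares
best=""; bestScore=0
while IFS=: read -r f words; do
   score=0
   for w in $words; do
      echo "$queryWords" | grep -qiw "$w" && score=$((score+1))
   done
   [ $score -gt $bestScore ] && { bestScore=$score; best=$f; }
done < titles.txt
echo "Best match: $best ($bestScore title words in common)"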

Mark Setchell

Thank you for posting some photos.

I have coded an algorithm called Perceptual Hashing, which I found described by Dr Neal Krawetz. On comparing your images with the card, I get the following percentage measures of similarity:

Card vs. Abundance 79%
Card vs. Aggressive 83%
Card vs. Demystify 85%

So it is not an ideal discriminator for your image type, but it works somewhat. You may wish to play around with it to tailor it for your use case.

I would calculate a hash for each of the images in your collection, one at a time and store the hash for each image just once. Then, when you get a new card, calculate its hash and compare it to the stored ones.

#!/bin/bash
################################################################################
# Similarity
# Mark Setchell
#
# Calculate percentage similarity of two images using Perceptual Hashing
# See article by Dr Neal Krawetz entitled "Looks Like It" - www.hackerfactor.com
#
# Method:
# 1) Resize image to black and white 8x8 pixel square regardless
# 2) Calculate mean brightness of those 64 pixels
# 3) For each pixel, store "1" if pixel>mean else store "0" if less than mean
# 4) Convert the resulting 64-bit string of 1's and 0's into a 16 hex digit "Perceptual Hash"
#
# If finding difference between Perceptual Hashes, simply total up number of bits
# that differ between the two strings - this is the Hamming distance.
#
# Requires ImageMagick - www.imagemagick.org
#
# Usage:
#
# Similarity image|imageHash [image|imageHash]
# If you pass one image filename, it will tell you the Perceptual hash as a 16
# character hex string that you may want to store in an alternate stream or as
# an attribute or tag in filesystems that support such things. Do this in order
# to just calculate the hash once for each image.
#
# If you pass in two images, or two hashes, or an image and a hash, it will try
# to compare them and give a percentage similarity between them.
################################################################################
function PerceptualHash(){

   TEMP="tmp$$.png"

   # Force image to 8x8 pixels and greyscale
   convert "$1" -colorspace gray -quality 80 -resize 8x8! PNG8:"$TEMP"

   # Calculate mean brightness and correct to range 0..255
   MEAN=$(convert "$TEMP" -format "%[fx:int(mean*255)]" info:)

   # Now extract all 64 pixels and build string containing "1" where pixel > mean else "0"
   hash=""
   for i in {0..7}; do
      for j in {0..7}; do
         pixel=$(convert "${TEMP}"[1x1+${i}+${j}] -colorspace gray text: | grep -Eo "\([0-9]+," | tr -d '(,' )
         bit="0"
         [ "$pixel" -gt "$MEAN" ] && bit="1"
         hash="$hash$bit"
      done
   done
   hex=$(echo "obase=16;ibase=2;$hash" | bc)
   printf "%016s\n" $hex
   #rm "$TEMP" > /dev/null 2>&1
}

function HammingDistance(){
   # Convert input hex strings to upper case like bc requires
   STR1=$(tr '[a-z]' '[A-Z]' <<< $1)
   STR2=$(tr '[a-z]' '[A-Z]' <<< $2)

   # Convert hex to binary and zero left pad to 64 binary digits
   STR1=$(printf "%064s" $(echo "obase=2;ibase=16;$STR1" | bc))
   STR2=$(printf "%064s" $(echo "obase=2;ibase=16;$STR2" | bc))

   # Calculate Hamming distance between two strings, each differing bit adds 1
   hamming=0
   for i in {0..63};do
      a=${STR1:i:1}
      b=${STR2:i:1}
      [ $a != $b ] && ((hamming++))
   done

   # Hamming distance is in range 0..64 and small means more similar
   # We want percentage similarity, so we do a little maths
   similarity=$((100-(hamming*100/64)))
   echo $similarity
}

function Usage(){
   echo "Usage: Similarity image|imageHash [image|imageHash]" >&2
   exit 1
}

################################################################################
# Main
################################################################################
if [ $# -eq 1 ]; then
   # Expecting a single image file for which to generate hash
   if [ ! -f "$1" ]; then
      echo "ERROR: File $1 does not exist" >&2
      exit 1
   fi
   PerceptualHash "$1" 
   exit 0
fi

if [ $# -eq 2 ]; then
   # Expecting 2 things, i.e. 2 image files, 2 hashes or one of each
   if [ -f "$1" ]; then
      hash1=$(PerceptualHash "$1")
   else
      hash1=$1
   fi
   if [ -f "$2" ]; then
      hash2=$(PerceptualHash "$2")
   else
      hash2=$2
   fi
   HammingDistance $hash1 $hash2
   exit 0
fi

Usage
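
One possible way to use this over a whole collection, as a sketch: hash every collection image once and keep the hashes, then compare a new card photo against the stored hashes. This assumes the script above is saved as Similarity and made executable, that the collection images are JPEGs in the current directory, and that card.png is the new photo; the filenames are illustrative.

#!/bin/bash
# Hash every image in the collection once and store "hash filename" lines
> hashes.txt
for f in *.jpg; do
   echo "$(./Similarity "$f") $f" >> hashes.txt
done

# Hash the new card photo, compare it against every stored hash and
# print the three most similar collection images
cardhash=$(./Similarity card.png)
while read -r storedhash name; do
   echo "$(./Similarity "$cardhash" "$storedhash") $name"
done < hashes.txt | sort -rn | head -3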
Mark Setchell
  • http://www.phash.org/ is a command line tool for calculating perceptual hashes of various kinds. Available in both Linux and Windows flavours. – Todd Bowles Aug 20 '14 at 23:15
  • I get `./file.sh: line 47: [: -gt: unary operator expected` – Sid May 01 '18 at 13:29
  • If instead of throwing away the color you were to calculate the hash separately for each of R,G,B compared to their average across the image, surely it would do much better -- a reddish photo would hash very differently from a greenish photo. – AmigoNico Oct 18 '20 at 08:27

I also tried a normalised cross-correlation of each of your images with the card, like this:

#!/bin/bash
size="300x400!"
convert card.png -colorspace RGB -normalize -resize $size card.jpg
for i in *.jpg
do 
   cc=$(convert $i -colorspace RGB -normalize -resize $size JPG:- | \
   compare - card.jpg -metric NCC null: 2>&1)
   echo "$cc:$i"
done | sort -n

and I got this output (sorted by match quality):

0.453999:abundance.jpg
0.550696:aggressive.jpg
0.629794:demystify.jpg

which shows that the card correlates best with demystify.jpg.

Note that I resized all images to the same size and normalized their contrast so that they could be readily compared and effects resulting from differences in contrast are minimised. Making them smaller also reduces the time needed for the correlation.

Mark Setchell

I tried this by arranging the image data as a vector and taking the inner product between the collection image vectors and the searched image vector. The vectors that are most similar will give the highest inner product. I resize all the images to the same size to get equal-length vectors so I can take the inner product. This resizing additionally reduces the computational cost of the inner product and gives a coarse approximation of the actual image.

You can quickly check this with Matlab or Octave. Below is the Matlab/Octave script. I've added comments there. I tried varying the variable mult from 1 to 8 (you can try any integer value), and for all those cases, image Demystify gave the highest inner product with the card image. For mult = 8, I get the following ip vector in Matlab:

ip =

   683007892
   558305537
   604013365

As you can see, it gives the highest inner-product of 683007892 for image Demystify.

% load images
imCardPhoto = imread('0.png');
imDemystify = imread('1.jpg');
imAggressiveUrge = imread('2.jpg');
imAbundance = imread('3.jpg');

% you can experiment with the size by varying mult
mult = 8;
size = [17 12]*mult;

% resize with nearest neighbor interpolation
smallCardPhoto = imresize(imCardPhoto, size);
smallDemystify = imresize(imDemystify, size);
smallAggressiveUrge = imresize(imAggressiveUrge, size);
smallAbundance = imresize(imAbundance, size);

% image collection: each image is vectorized. if we have n images, this
% will be a (size_rows*size_columns*channels) x n matrix
collection = [double(smallDemystify(:)) ...
    double(smallAggressiveUrge(:)) ...
    double(smallAbundance(:))];

% vectorize searched image. this will be a (size_rows*size_columns*channels) x 1
% vector
x = double(smallCardPhoto(:));

% take the inner product of x and each image vector in collection. this
% will result in a n x 1 vector. the higher the inner product is, more similar the
% image and searched image(that is x)
ip = collection' * x;

EDIT

I tried another approach, basically taking the Euclidean distance (L2 norm) between the reference images and the card image, and it gave me very good results for your test card image with a large collection of reference images (383 images) that I found at this link.

Here, instead of taking the whole card, I extracted the upper part that contains the picture and used it for comparison.

In the following steps, all training images and the test image are resized to a predefined size before doing any processing.

  • extract the image regions from training images
  • perform morphological closing on these images to get a coarse approximation (this step may not be necessary)
  • vectorize these images and store in a training set (I call it training set even though there's no training in this approach)
  • load the test card image, extract the image region of interest (ROI), apply closing, then vectorize
  • calculate the euclidean distance between each reference image vector and the test image vector
  • choose the minimum distance item (or the first k items)

I did this in C++ using OpenCV. I'm also including some test results using different scales.

#include <opencv2/opencv.hpp>
#include <iostream>
#include <algorithm>
#include <string>
#include <windows.h>

using namespace cv;
using namespace std;

#define INPUT_FOLDER_PATH       string("Your test image folder path")
#define TRAIN_IMG_FOLDER_PATH   string("Your training image folder path")

void search()
{
    WIN32_FIND_DATA ffd;
    HANDLE hFind = INVALID_HANDLE_VALUE;

    vector<Mat> images;
    vector<string> labelNames;
    int label = 0;
    double scale = .2;  // you can experiment with scale
    Size imgSize(200*scale, 285*scale); // training sample images are all 200 x 285 (width x height)
    Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));

    // get all training samples in the directory
    hFind = FindFirstFile((TRAIN_IMG_FOLDER_PATH + string("*")).c_str(), &ffd);
    if (INVALID_HANDLE_VALUE == hFind) 
    {
        cout << "INVALID_HANDLE_VALUE: " << GetLastError() << endl;
        return;
    } 
    do
    {
        if (!(ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
        {
            Mat im = imread(TRAIN_IMG_FOLDER_PATH+string(ffd.cFileName));
            Mat re;
            resize(im, re, imgSize, 0, 0);  // resize the image

            // extract only the upper part that contains the image
            Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
            // get a coarse approximation
            morphologyEx(roi, roi, MORPH_CLOSE, kernel);

            images.push_back(roi.reshape(1)); // vectorize the roi
            labelNames.push_back(string(ffd.cFileName));
        }

    }
    while (FindNextFile(hFind, &ffd) != 0);
    FindClose(hFind);

    // load the test image, apply the same preprocessing done for training images
    Mat test = imread(INPUT_FOLDER_PATH+string("0.png"));
    Mat re;
    resize(test, re, imgSize, 0, 0);
    Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
    morphologyEx(roi, roi, MORPH_CLOSE, kernel);
    Mat testre = roi.reshape(1);

    struct imgnorm2_t
    {
        string name;
        double norm2;
    };
    vector<imgnorm2_t> imgnorm;
    for (size_t i = 0; i < images.size(); i++)
    {
        imgnorm2_t data = {labelNames[i], 
            norm(images[i], testre) /* take the l2-norm (euclidean distance) */};
        imgnorm.push_back(data); // store data
    }

    // sort stored data based on euclidean-distance in the ascending order
    sort(imgnorm.begin(), imgnorm.end(), 
        [] (imgnorm2_t& first, imgnorm2_t& second) { return (first.norm2 < second.norm2); });
    for (size_t i = 0; i < imgnorm.size(); i++)
    {
        cout << imgnorm[i].name << " : " << imgnorm[i].norm2 << endl;
    }
}

Results:

scale = 1.0;

demystify.jpg : 10989.6, sylvan_basilisk.jpg : 11990.7, scathe_zombies.jpg : 12307.6

scale = .8;

demystify.jpg : 8572.84, sylvan_basilisk.jpg : 9440.18, steel_golem.jpg : 9445.36

scale = .6;

demystify.jpg : 6226.6, steel_golem.jpg : 6887.96, sylvan_basilisk.jpg : 7013.05

scale = .4;

demystify.jpg : 4185.68, steel_golem.jpg : 4544.64, sylvan_basilisk.jpg : 4699.67

scale = .2;

demystify.jpg : 1903.05, steel_golem.jpg : 2154.64, sylvan_basilisk.jpg : 2277.42

dhanushka

If I understand you correctly, you need to compare them as pictures. There is one very simple but effective solution here - it's called Sikuli.

What tools can I use to find which image in the collection is most similar to mine?

This tool works very well with image processing and is not only capable of finding whether your card (image) is similar to what you have already defined as a pattern, but can also search partial image content (so-called rectangles).

By default you can extend its functionality via Python. Any ImageObject can be set to accept a similarity_pattern in percentages, and by doing so you'll be able to precisely find what you are looking for.

Another big advantage of this tool is that you can learn the basics in one day.

Hope this helps.

ekostadinov