9

I have a database of images. When I take a new picture, I want to compare it against the images in this database and receive a similarity score (using OpenCV). This way I want to detect whether the database already contains an image that is very similar to the new picture.

Is it possible to create a fingerprint/hash of my database images and match new ones against it?

I'm looking for an algorithm, a code snippet, or a technical demo, not a commercial solution.

Best,

Stefan

Stefan
  • Normally this kind of thing is done by extracting feature vectors from the images and doing some kind of template matching on the feature vectors. – Paul R Aug 26 '11 at 13:40
  • There are a few similar/related/duplicate questions: [OpenCV / SURF How to generate a image hash / fingerprint / signature out of the descriptors?](http://stackoverflow.com/questions/7205489/opencv-fingerprint-image-and-compare-against-database), [Near-Duplicate Image Detection](http://stackoverflow.com/questions/1034900/near-duplicate-image-detection/), [Image fingerprint to compare similarity of many images](http://stackoverflow.com/questions/596262/image-fingerprint-to-compare-similarity-of-many-images), ... – Albert Jan 15 '13 at 08:47

3 Answers

10

As Paul R commented, this "fingerprint/hash" is usually a set of feature vectors or feature descriptors. However, most feature vectors used in computer vision are too computationally expensive to search against a database. This task therefore needs a special kind of feature descriptor, because descriptors such as SURF and SIFT take too much time to search, even with various optimizations.

The only thing OpenCV offers for your task (object categorization) is an implementation of Bag of Visual Words (BOW).

It can compute a special kind of image feature and train a vocabulary of visual words. You can then use this vocabulary to find similar images in your database and compute a similarity score.

Here is the OpenCV documentation for bag of words. OpenCV also has a sample named bagofwords_classification.cpp. It is quite large but might be helpful.
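To make the vocabulary idea more concrete, here is a minimal sketch of the matching step only. It does not use OpenCV: it assumes local descriptors have already been extracted and that a visual-word vocabulary (for example, cluster centres from k-means) has already been trained. The names `Descriptor`, `Histogram`, `bowHistogram` and `cosineSimilarity` are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical types for illustration; not part of OpenCV.
using Descriptor = std::vector<float>;   // one local feature descriptor
using Histogram  = std::vector<double>;  // one bin per visual word

// Squared Euclidean distance between two descriptors of equal length.
double squaredDistance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sum += d * d;
    }
    return sum;
}

// Assign every descriptor to its nearest visual word and count the hits.
// The histogram is L1-normalized so that images with different numbers of
// descriptors remain comparable.
Histogram bowHistogram(const std::vector<Descriptor>& descriptors,
                       const std::vector<Descriptor>& vocabulary) {
    Histogram hist(vocabulary.size(), 0.0);
    if (descriptors.empty() || vocabulary.empty()) return hist;
    for (const Descriptor& d : descriptors) {
        std::size_t best = 0;
        double bestDist = std::numeric_limits<double>::max();
        for (std::size_t w = 0; w < vocabulary.size(); ++w) {
            const double dist = squaredDistance(d, vocabulary[w]);
            if (dist < bestDist) {
                bestDist = dist;
                best = w;
            }
        }
        hist[best] += 1.0;
    }
    for (double& bin : hist) bin /= static_cast<double>(descriptors.size());
    return hist;
}

// Cosine similarity of two histograms; returns 0 if either one is all zeros.
double cosineSimilarity(const Histogram& a, const Histogram& b) {
    assert(a.size() == b.size());
    double dot = 0.0, normA = 0.0, normB = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    if (normA == 0.0 || normB == 0.0) return 0.0;
    return dot / (std::sqrt(normA) * std::sqrt(normB));
}
```

With this layout, each database image is stored as one short histogram, so a query only compares histograms instead of matching raw descriptors. Cosine similarity is just one possible score; other histogram distances would work as well.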

Andrey Kamaev
  • Andrey, how do you do this: "It can compute special kind of image features and train visual words vocabulary. Next you can use this vocabulary to find similar images in your database and compute similarity score." Thanks – lilouch Aug 18 '14 at 11:37
2

Content-based image retrieval systems are still a field of active research: http://citeseerx.ist.psu.edu/search?q=content-based+image+retrieval

First you have to be clear about what constitutes "similar" in your context:

  1. Similar color distribution: Use something like color descriptors for subdivisions of the image; this should give fairly satisfying results.
  2. Similar objects: Since the computer does not know what an object is, you will not get very far unless you have extensive domain knowledge about the object (or a few object classes). A good overview of the current state of research can be seen here (results) and soon here.
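For the first case (color distribution), a rough sketch could look like the following. It assumes an interleaved 8-bit RGB buffer, a square grid of subdivisions, 4 bins per channel, and histogram intersection as the score. All of these are illustrative choices, and `RgbImage`, `gridColorDescriptor` and `colorSimilarity` are hypothetical names rather than an existing API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image container for illustration.
struct RgbImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;  // interleaved R,G,B, row-major
};

constexpr int kBinsPerChannel = 4;  // assumed quantization
constexpr int kBinsPerCell = kBinsPerChannel * kBinsPerChannel * kBinsPerChannel;

// Split the image into grid x grid cells and build one normalized color
// histogram per cell. The descriptor is the concatenation of all cells.
std::vector<double> gridColorDescriptor(const RgbImage& img, int grid) {
    assert(grid > 0 && img.width >= grid && img.height >= grid);
    assert(img.pixels.size() ==
           static_cast<std::size_t>(img.width) * img.height * 3);
    const std::size_t cells = static_cast<std::size_t>(grid) * grid;
    std::vector<double> desc(cells * kBinsPerCell, 0.0);
    std::vector<double> cellCount(cells, 0.0);
    for (int y = 0; y < img.height; ++y) {
        for (int x = 0; x < img.width; ++x) {
            const std::size_t cell = static_cast<std::size_t>(
                (y * grid / img.height) * grid + (x * grid / img.width));
            const std::size_t p =
                (static_cast<std::size_t>(y) * img.width + x) * 3;
            const int r = img.pixels[p] * kBinsPerChannel / 256;
            const int g = img.pixels[p + 1] * kBinsPerChannel / 256;
            const int b = img.pixels[p + 2] * kBinsPerChannel / 256;
            const int bin = (r * kBinsPerChannel + g) * kBinsPerChannel + b;
            desc[cell * kBinsPerCell + bin] += 1.0;
            cellCount[cell] += 1.0;
        }
    }
    // Normalize each cell so its bins sum to 1.
    for (std::size_t i = 0; i < desc.size(); ++i) {
        desc[i] /= cellCount[i / kBinsPerCell];
    }
    return desc;
}

// Histogram intersection averaged over all cells:
// 1.0 means identical color layout, 0.0 means no overlap at all.
double colorSimilarity(const std::vector<double>& a,
                       const std::vector<double>& b) {
    assert(!a.empty() && a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += std::min(a[i], b[i]);
    return sum / static_cast<double>(a.size() / kBinsPerCell);
}
```

Because each cell is compared only with the cell at the same position, this score is sensitive to where colors appear, not only to which colors are present. A coarser grid makes it more tolerant of shifts.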

There is no "serve all needs"-algorithm for the problem you described. The more you can share about the specifics of your problem, the better answers you might get. Posting some representative images (if possible) and describing the desired outcome is also very helpful.

This would be a good question for computer-vision.stackexchange.com, if it already existed.

bjoernz
0

You can use a perceptual hash (pHash) algorithm and store each image's hash value in the database, then compare two hashes with this code:

double const mismatch = algo->compare(image1Hash, image2Hash);

Here the 'mismatch' value is the distance between the two hashes; for most of these algorithms, a smaller value means the two images are more similar.

Available hash functions:

  1. AverageHash
  2. PHash
  3. MarrHildrethHash
  4. RadialVarianceHash
  5. BlockMeanHash (mode 0)
  6. BlockMeanHash (mode 1)
  7. ColorMomentHash

These functions are good enough to evaluate image similarity in most respects.
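As an illustration of how such a hash and its comparison can work, here is a small standalone sketch in the spirit of average hashing. It is not the OpenCV implementation: it assumes the image has already been converted to grayscale and shrunk to 8x8 pixels, and the names `averageHash8x8`, `hammingDistance` and `hashSimilarity` are hypothetical.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Average hash of an 8x8 grayscale thumbnail (64 values, row-major).
// Bit i is set when pixel i is brighter than the mean of all 64 pixels.
std::uint64_t averageHash8x8(const std::vector<std::uint8_t>& gray8x8) {
    assert(gray8x8.size() == 64);
    unsigned sum = 0;
    for (std::uint8_t v : gray8x8) sum += v;
    const double mean = sum / 64.0;
    std::uint64_t hash = 0;
    for (std::size_t i = 0; i < 64; ++i) {
        if (gray8x8[i] > mean) hash |= (std::uint64_t{1} << i);
    }
    return hash;
}

// Number of differing bits between two hashes (0 = identical hashes).
int hammingDistance(std::uint64_t a, std::uint64_t b) {
    return static_cast<int>(std::bitset<64>(a ^ b).count());
}

// Map the distance to a score in [0, 1], where 1 means identical hashes.
double hashSimilarity(std::uint64_t a, std::uint64_t b) {
    return 1.0 - hammingDistance(a, b) / 64.0;
}
```

Only the 64-bit hash needs to be stored per database image. A new picture is hashed once and compared against the stored values, and the acceptance threshold (for example, a distance of only a few bits) has to be tuned on your own data.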

Bojan B
  • after getting the results from the compare function, how are you supposed to know the degree by which they are similar or different? – werber bang May 16 '20 at 22:48