
I have computed SIFT descriptors for images A and B with Python 2.7 using OpenCV. Image A has 16 descriptors of 128 values each (16×128 = 2048 values) and image B has 10 (10×128 = 1280 values). Now I am stuck because I don't know how to generate a similarity score between the two images. I would appreciate any help.

By score or similarity I mean a measurement between a pair of matched descriptors (e.g. Euclidean distance). But comparing the SIFT descriptors of one image directly to those of another is not feasible, because each image yields multiple SIFT descriptors, and their number varies depending on how you extract them; as mentioned above, image A has 16×128 (= 2048) descriptor values and image B has 1280.

In MATLAB's VLFeat, the score is computed as follows:

[fa, da] = vl_sift(Image_a) ;
[fb, db] = vl_sift(Image_b) ;
[matches, scores] = vl_ubcmatch(da, db) ;

Finally, I want to compute genuine and imposter scores and then calculate the EER (equal error rate).
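For reference, a minimal sketch of how the EER could be computed once genuine and imposter score lists are available. The function name and the convention that a higher score means more similar are my assumptions, not something from the original post:

```python
def compute_eer(genuine, imposter):
    """Equal error rate from two score lists (higher score = more similar).

    Sweeps every observed score as a candidate threshold and returns the
    operating point where the false accept rate (FAR) and false reject
    rate (FRR) are closest. Hypothetical helper, not part of OpenCV/VLFeat.
    """
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(imposter)):
        far = sum(s >= t for s in imposter) / float(len(imposter))  # imposters accepted
        frr = sum(s < t for s in genuine) / float(len(genuine))     # genuines rejected
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer
```

With well-separated score distributions this returns 0.0; overlapping distributions push it towards 0.5.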

Note that I don't want to use any of the following approaches:

  1. VLfeat in Matlab
  2. BoW (Bag of word) algorithm ( Euclidean distance in sift )
  3. Answer in Interpreting score in SIFT

Thank you.

This is how I extracted the SIFT keypoints and descriptors:

import cv2

def extractFeatures_SIFT(imagelist):
    """Return a list of (keypoints, descriptors) pairs, one per image."""
    sift = cv2.xfeatures2d.SIFT_create()  # create the detector once and reuse it
    featurelist = []
    for img_path in imagelist:
        img = cv2.imread(img_path)
        # detect keypoints and compute their 128-value descriptors in one call
        kps, descriptors = sift.detectAndCompute(img, None)
        featurelist.append((kps, descriptors))  # keep each image's pair together
    return featurelist
  • what is your general idea? How do you define "similarity" here? – Micka Nov 29 '17 at 05:49
  • @Micka, by similarity I mean the measurement between a pair of matched descriptors (e.g. Euclidean distance). But comparing the SIFT descriptors themselves in an image to those of another is not feasible because you will end up with multiple SIFT descriptors in an image, and their number varies depending on how you extract them; as I mention, for example, image A has 16X128 (=2048) descriptors and the other has 1280. – GreenQuestioner Nov 29 '17 at 13:24
  • @Micka , In matlab VL-feat, SCORE has been implemented as follow: ** [fa, da] = vl_sift(Image_a) ; [fb, db] = vl_sift(Image_b) ; [matches, scores] = vl_ubcmatch(da, db) ; ** Finally, I want to calculate imposter and genuine scores and then I want to calculate EER. – GreenQuestioner Nov 29 '17 at 13:24

2 Answers


The number of SIFT descriptors varies from image to image because the number of detected key points varies. In your case, image A has 16 key points and image B has 10. A SIFT descriptor of 128 values is computed for each key point, so you get a total of 2048 and 1280 descriptor values for images A and B respectively.

Note that 2048 and 1280 are not the numbers of descriptors but the numbers of values in the descriptors: image A has 16 descriptors and image B has 10. This difference in key points and descriptors is common, as different images have different numbers of interesting points that can be detected as key points.

This difference does not prevent you from measuring similarity. When you pass the descriptors through a matching function like BFMatcher or FlannBasedMatcher, you get back pairs of matched descriptors, one from each image. With cross-checking enabled, each descriptor can appear in at most one mutual match, so the number of matches is at most the size of the smaller descriptor set (in your case, at most 10 matches).

Next, from these matches you have to remove ambiguous and approximate matches using cross-checking or the ratio test given by David G. Lowe, keeping only the good matches. Even then, you might have false-positive matches; these can be further removed with a homography check or any other custom method, depending on the images and your application.

After all these steps you will have the final matches. You can use the number of final matches as a similarity measure by setting a threshold: if the count is above the threshold, the images are considered similar; otherwise they are considered different.
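The final decision can then be as simple as the following sketch; both helper names and the default threshold of 8 are illustrative assumptions that must be tuned per dataset:

```python
def similarity_score(num_final_matches, num_kps_a, num_kps_b):
    # Normalise by the smaller keypoint count so the score lies in [0, 1]
    # and is comparable across image pairs with different keypoint counts.
    return num_final_matches / float(min(num_kps_a, num_kps_b))

def images_match(num_final_matches, threshold=8):
    # "Similar" once enough good matches survive filtering; the default
    # threshold is purely illustrative.
    return num_final_matches >= threshold
```

The normalised score is also a convenient quantity to feed into the genuine/imposter lists for the EER computation.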

In your case, even at the start, you get only 10 matches to work with. So when you go through all the above processes and filter out the matches, you will be left with very few final matches with which you cannot set a reasonable threshold to get desired results. So you might have to increase the number of detected key points at the start itself.

This can be done by passing a lower value for 'contrastThreshold' (default 0.04) and a higher value for 'edgeThreshold' (default 10) to the SIFT_create() function [in cv2.xfeatures2d for OpenCV 3]. Note that 'edgeThreshold' works in the opposite direction to 'contrastThreshold': larger values reject fewer edge-like keypoints. You can also cap the number of keypoints via the 'nfeatures' parameter.

Alternatively, to get more key points, you can try other detectors such as SURF or ORB, and then compute SIFT descriptors at those detected key points.

Hope my answer helps you.

Sakthi Geek

If you are trying to develop a dissimilarity function between images, you should probably be looking at global rather than local descriptors (SIFT is a local one). E.g., GIST or CENTRIST.

Bag of Words (which for some reason you are trying to eschew - why, actually?) can be seen as taking the same approach further (it constructs a global descriptor by learning the distribution of local ones), but it's also much more expensive and requires a training phase.

Felix Goldberg