
I am trying to use OpenCV's feature detection tools in order to decide whether a small sample image exists in a larger scene image or not.
I used the code from here as a reference (without the homography part).

UIImage *sceneImage, *objectImage1;
cv::Mat sceneImageMat, objectImageMat1;
cv::vector<cv::KeyPoint> sceneKeypoints, objectKeypoints1;
cv::Mat sceneDescriptors, objectDescriptors1;
cv::SurfFeatureDetector *surfDetector;
cv::SurfDescriptorExtractor surfExtractor;
cv::FlannBasedMatcher flannMatcher;
cv::vector<cv::DMatch> matches;
cv::vector<cv::Point2f> obj, scn;
int minHessian;
double minDistMultiplier;

minHessian = 400;
minDistMultiplier = 3;
surfDetector = new cv::SurfFeatureDetector(minHessian);

sceneImage = [UIImage imageNamed:@"twitter_scene.png"];
objectImage1 = [UIImage imageNamed:@"twitter.png"];

sceneImageMat = cv::Mat(sceneImage.size.height, sceneImage.size.width, CV_8UC1);
objectImageMat1 = cv::Mat(objectImage1.size.height, objectImage1.size.width, CV_8UC1);

cv::cvtColor([sceneImage CVMat], sceneImageMat, CV_RGB2GRAY);
cv::cvtColor([objectImage1 CVMat], objectImageMat1, CV_RGB2GRAY);

if (!sceneImageMat.data || !objectImageMat1.data) {
    NSLog(@"NO DATA");
}

surfDetector->detect(sceneImageMat, sceneKeypoints);
surfDetector->detect(objectImageMat1, objectKeypoints1);

surfExtractor.compute(sceneImageMat, sceneKeypoints, sceneDescriptors);
surfExtractor.compute(objectImageMat1, objectKeypoints1, objectDescriptors1);

flannMatcher.match(objectDescriptors1, sceneDescriptors, matches);

double max_dist = 0; double min_dist = 100;

for( int i = 0; i < objectDescriptors1.rows; i++ )
{ 
    double dist = matches[i].distance;
    if( dist < min_dist ) min_dist = dist;
    if( dist > max_dist ) max_dist = dist;
}

cv::vector<cv::DMatch> goodMatches;
for( int i = 0; i < objectDescriptors1.rows; i++ )
{ 
    if( matches[i].distance < minDistMultiplier*min_dist )
    { 
        goodMatches.push_back( matches[i]);
    }
}
NSLog(@"Good matches found: %lu", goodMatches.size());

cv::Mat imageMatches;
cv::drawMatches(objectImageMat1, objectKeypoints1, sceneImageMat, sceneKeypoints, goodMatches, imageMatches, cv::Scalar::all(-1), cv::Scalar::all(-1),
                cv::vector<char>(), cv::DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS);

for( int i = 0; i < goodMatches.size(); i++ )
{
    //-- Get the keypoints from the good matches
    obj.push_back( objectKeypoints1[ goodMatches[i].queryIdx ].pt );
    scn.push_back( objectKeypoints1[ goodMatches[i].trainIdx ].pt );
}

cv::vector<uchar> outputMask;
cv::Mat homography = cv::findHomography(obj, scn, CV_RANSAC, 3, outputMask);
int inlierCounter = 0;
for (int i = 0; i < outputMask.size(); i++) {
    if (outputMask[i] == 1) {
        inlierCounter++;
    }
}
NSLog(@"Inliers percentage: %d", (int)(((float)inlierCounter / (float)outputMask.size()) * 100));

cv::vector<cv::Point2f> objCorners(4);
objCorners[0] = cv::Point(0,0);
objCorners[1] = cv::Point( objectImageMat1.cols, 0 );
objCorners[2] = cv::Point( objectImageMat1.cols, objectImageMat1.rows );
objCorners[3] = cv::Point( 0, objectImageMat1.rows );

cv::vector<cv::Point2f> scnCorners(4);

cv::perspectiveTransform(objCorners, scnCorners, homography);

cv::line( imageMatches, scnCorners[0] + cv::Point2f( objectImageMat1.cols, 0), scnCorners[1] + cv::Point2f( objectImageMat1.cols, 0), cv::Scalar(0, 255, 0), 4);
cv::line( imageMatches, scnCorners[1] + cv::Point2f( objectImageMat1.cols, 0), scnCorners[2] + cv::Point2f( objectImageMat1.cols, 0), cv::Scalar( 0, 255, 0), 4);
cv::line( imageMatches, scnCorners[2] + cv::Point2f( objectImageMat1.cols, 0), scnCorners[3] + cv::Point2f( objectImageMat1.cols, 0), cv::Scalar( 0, 255, 0), 4);
cv::line( imageMatches, scnCorners[3] + cv::Point2f( objectImageMat1.cols, 0), scnCorners[0] + cv::Point2f( objectImageMat1.cols, 0), cv::Scalar( 0, 255, 0), 4);

[self.mainImageView setImage:[UIImage imageWithCVMat:imageMatches]];

This works, but I keep getting a significant number of matches, even when the small image is not part of the larger one.
Here's an example of a good output:
Good Output
And here's an example of a bad output:
Bad Output
Both outputs are the result of the same code; the only difference is the small sample image.
With results like this, it is impossible for me to know when a sample image is NOT in the larger image.
While doing my research, I found this stackoverflow question. I followed the answer given there and tried the steps suggested in the "OpenCV 2 Computer Vision Application Programming Cookbook", but I wasn't able to make it work with images of different sizes (it seems to be a limitation of the cv::findFundamentalMat function).

What am I missing? Is there a way to use SurfFeatureDetector and FlannBasedMatcher to know when one sample image is a part of a larger image, and another sample image isn't? Is there a different method which is better for that purpose?

UPDATE:
I updated the code above to include the complete function I use, including the attempt to actually draw the homography. Also, here are 3 images: one scene, and two small objects I'm trying to find in the scene. I'm getting better inlier percentages for the paw icon than for the twitter icon, even though the twitter icon is the one that is actually IN the scene. On top of that, the homography is not drawn for some reason:
Twitter Icon
Paw Icon
Scene

Darkshore Grouper

2 Answers


Your matcher will always match every point from the smaller descriptor list to one of the larger list. You then have to decide for yourself which of these matches make sense and which do not. You can do this by discarding every match that exceeds a maximum allowed descriptor distance, or you can try to find a transformation matrix (e.g. with findHomography) and check whether enough matches are consistent with it.
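
For illustration, a minimal sketch of both ideas against the variable names from the question (OpenCV 2.x API; the 0.25 distance threshold and the variable names I introduce are arbitrary choices, not values from the question):

// 1) Keep only matches whose descriptor distance is below a maximum you allow.
double maxAllowedDist = 0.25; // tune this for your descriptors
std::vector<cv::DMatch> filteredMatches;
for (size_t i = 0; i < matches.size(); i++) {
    if (matches[i].distance < maxAllowedDist) {
        filteredMatches.push_back(matches[i]);
    }
}

// 2) Fit a homography with RANSAC and count how many matches are consistent with it.
std::vector<cv::Point2f> objPts, scnPts;
for (size_t i = 0; i < filteredMatches.size(); i++) {
    objPts.push_back(objectKeypoints1[filteredMatches[i].queryIdx].pt); // query = object
    scnPts.push_back(sceneKeypoints[filteredMatches[i].trainIdx].pt);   // train = scene
}

std::vector<uchar> inlierMask;
int inliers = 0;
if (objPts.size() >= 4) { // findHomography needs at least 4 point pairs
    cv::findHomography(objPts, scnPts, CV_RANSAC, 3, inlierMask);
    inliers = cv::countNonZero(inlierMask);
}
// A low inlier count (or inlier ratio) suggests the object is probably not in the scene.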

Tobias Hermann
  • Thanks for the quick comment. I think I understand how to use findHomography to find the transformation, but how do I test a match to see if it corresponds to the transformation? As for the distances, I tried messing around with that, but in the two examples I gave, I still can't seem to eliminate the matches from the bad example. – Darkshore Grouper Dec 10 '12 at 12:34
  • The last parameter of cv::findHomography is "OutputArray mask". This will give you information about which points are inliers and which are outliers if you use CV_RANSAC or CV_LMEDS. – Tobias Hermann Dec 10 '12 at 12:51
  • Thanks! Using the output from findHomography works pretty well. You've been a great help. – Darkshore Grouper Dec 11 '12 at 08:52
  • Sorry, I thought it was working fine, but I conducted a few more tests, and I'm still getting inconclusive results. I'm calculating the homography, and then checking the percentage of inliers in the result mask. The percentages are sometimes very low for a good output, and very high for images that are not present in the large scene at all. Furthermore, I also tried drawing the homography, like done in the link I added, and it's not always being drawn. Maybe I'm doing something else wrong? – Darkshore Grouper Dec 11 '12 at 16:07
  • Can you please provide a complete (but minimal) source code and images to reproduce your problem? – Tobias Hermann Dec 11 '12 at 16:13
  • Sure. Will the complete function be enough, or should I create an xcode project and share it? – Darkshore Grouper Dec 11 '12 at 17:34
  • The function will be enough. I will test it on linux with C++ anyway. But if I get the same result there as you get, I think we can work like this. – Tobias Hermann Dec 11 '12 at 19:40
  • Ok, I completed the source code with the inlier percentage calculation, and the drawing of the homography. I also added links for test images. Thanks! – Darkshore Grouper Dec 12 '12 at 10:17
  • OK, first I have translated your source code to normal C++: http://codepad.org/mNGAWh8Q While doing this I found that your line 'scn.push_back( objectKeypoints1[ goodMatches[i].trainIdx ].pt );' perhaps should have been 'scn.push_back( sceneKeypoints[ goodMatches[i].trainIdx ].pt );', right? After changing this I got the following results: twitterjh.png Good matches found: 18 Inliers percentage: 61 pawb.png Good matches found: 21 Inliers percentage: 28 Together with the result images (you can find them here http://daiw.de/share/index.php?dir=StackOverflow%2F20121212%2F ) it looks OK to me. – Tobias Hermann Dec 12 '12 at 11:37
  • *sigh*....I can't believe that it all came down to a simple misunderstanding of what query and train are...I thought that even if I take the keypoints from the object, the train pts are still the points from the scene. You, sir, are a life saver. Thank you very much! – Darkshore Grouper Dec 12 '12 at 13:11
  • Glad I can help. goodMatches[i].trainIdx is just the index used to access the items in the appropriate list of points. Your app could also just have crashed with what you were doing first. ;) – Tobias Hermann Dec 12 '12 at 13:43
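
In other words, the fix discussed in the comments above boils down to taking the query points from the object keypoints and the train points from the scene keypoints, matching the argument order of flannMatcher.match (a minimal sketch against the variable names from the question):

std::vector<cv::Point2f> obj, scn;
for (size_t i = 0; i < goodMatches.size(); i++) {
    // queryIdx indexes objectKeypoints1 (the first argument passed to match),
    // trainIdx indexes sceneKeypoints (the second argument passed to match).
    obj.push_back(objectKeypoints1[goodMatches[i].queryIdx].pt);
    scn.push_back(sceneKeypoints[goodMatches[i].trainIdx].pt);
}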

It's an old post, but this comes from a similar assignment I had to do for class: a way to remove the bad output is to check that most of the matching lines are (relatively) parallel to each other, and to remove matches that point in the wrong direction.
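
For illustration, a minimal sketch of that idea (the function name filterByDirection, the use of the median angle, and the threshold parameter are my own choices; as the comment below notes, this only helps when the object appears with roughly the same rotation and scale):

#include <algorithm>
#include <cmath>
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>

// Keep only matches whose line in the side-by-side visualisation (object image on the
// left, scene image on the right) points in roughly the same direction as the majority.
std::vector<cv::DMatch> filterByDirection(const std::vector<cv::DMatch>& matches,
                                          const std::vector<cv::KeyPoint>& objectKps,
                                          const std::vector<cv::KeyPoint>& sceneKps,
                                          float objectWidth,
                                          float maxAngleDiff) // radians
{
    std::vector<float> angles(matches.size());
    for (size_t i = 0; i < matches.size(); i++) {
        cv::Point2f from = objectKps[matches[i].queryIdx].pt;
        cv::Point2f to   = sceneKps[matches[i].trainIdx].pt + cv::Point2f(objectWidth, 0);
        angles[i] = std::atan2(to.y - from.y, to.x - from.x);
    }

    // Use the median angle as the dominant direction (ignores wrap-around at +/-pi for brevity).
    std::vector<float> sorted(angles);
    std::sort(sorted.begin(), sorted.end());
    float median = sorted.empty() ? 0.f : sorted[sorted.size() / 2];

    std::vector<cv::DMatch> kept;
    for (size_t i = 0; i < matches.size(); i++) {
        if (std::fabs(angles[i] - median) < maxAngleDiff) {
            kept.push_back(matches[i]);
        }
    }
    return kept;
}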

Pita
  • This only works if both images are rotated and scaled equally. But in that case, one probably would not even need keypoints, and simple template matching would do the trick. – Tobias Hermann May 23 '16 at 09:27
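
For completeness, a minimal sketch of the template-matching alternative mentioned in the comment above (OpenCV 2.x API; the file names and the 0.8 threshold are placeholders of mine):

cv::Mat scene = cv::imread("scene.png", CV_LOAD_IMAGE_GRAYSCALE);
cv::Mat templ = cv::imread("object.png", CV_LOAD_IMAGE_GRAYSCALE);

cv::Mat result;
cv::matchTemplate(scene, templ, result, CV_TM_CCOEFF_NORMED);

double minVal, maxVal;
cv::Point minLoc, maxLoc;
cv::minMaxLoc(result, &minVal, &maxVal, &minLoc, &maxLoc);

// maxVal close to 1.0 means a strong match at maxLoc; a low value suggests the template
// is probably not present. The threshold is application-specific.
bool found = maxVal > 0.8;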