
I have some conceptual issues in understanding the SURF and SIFT algorithms (All about SURF). As far as my understanding goes, SURF approximates the Laplacian of Gaussian while SIFT operates on the Difference of Gaussians. It then constructs a 64-dimensional descriptor vector around each keypoint to extract the features. I have applied this CODE.
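To make the DoG part concrete, here is a minimal NumPy sketch of a Difference-of-Gaussians response (the detector side of SIFT); note that SURF does not compute the LoG directly but approximates the determinant of the Hessian with box filters. The sigma values and images below are invented for illustration only:

```python
import numpy as np

def gaussian_kernel(sigma):
    # 1-D Gaussian kernel, truncated at 3*sigma and normalized to sum to 1
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    # Separable Gaussian blur: filter the rows, then the columns
    k = gaussian_kernel(sigma)
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, tmp)

def dog(img, sigma, k=1.6):
    # Difference of Gaussians: blur at two nearby scales and subtract.
    # SIFT stacks these responses into a pyramid; a flat region gives ~0,
    # a blob of matching scale gives a strong (here negative) response.
    return blur(img, k * sigma) - blur(img, sigma)
```

On a constant image the interior response is zero, while an isolated bright spot produces a strong extremum at its center, which is exactly what the detector looks for across scales.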

(Q1) So, what forms the features?

(Q2) We initialize the algorithm using SurfFeatureDetector detector(500). So, does this mean that the size of the feature space is 500?

(Q3) The output of SURF, Good_Matches, gives matches between Keypoint1 and Keypoint2, and by tuning the number of matches we can conclude whether the object has been detected or not. What is meant by KeyPoints? Do these store the features?

(Q4) I need to build an object recognition application. In the code, it appears that the algorithm can recognize the book, so it can be applied to object recognition. I was under the impression that SURF could be used to differentiate objects based on color and shape. But SURF and SIFT detect corner- and blob-like structures in intensity, so there is no point in using color images as training samples since they will be converted to grayscale. There is no option of using color or HSV in these algorithms, unless I compute the keypoints for each channel separately, which is a different area of research (Evaluating Color Descriptors for Object and Scene Recognition).

So, how can I detect and recognize objects based on their color and shape? I think I can use SURF to differentiate objects based on their shape. Say, for instance, I have two books and a bottle, and I need to recognize only a single book among all the objects. But as soon as there are other similarly shaped objects in the scene, SURF gives lots of false positives. I would appreciate suggestions on what methods to apply for my application.

Srishti M
  • This post will most certainly help you: http://stackoverflow.com/questions/10168686/algorithm-improvement-for-coca-cola-can-shape-recognition/10219338#10219338 – Darshan Nov 08 '13 at 09:39

1 Answer

  1. The local maxima (a DoG response that is greater (or smaller) than the responses of the neighbouring pixels around the point, in the levels above and below it in the pyramid -- a 3x3x3 neighbourhood) form the coordinates of the feature (circle) center. The radius of the circle is given by the pyramid level.

  2. It is the Hessian threshold. It means that you keep only the maxima (see 1) whose values are bigger than the threshold. A bigger threshold leads to fewer features, but the stability of those features is better, and vice versa.

  3. Keypoint == feature. In OpenCV, KeyPoint is the structure that stores a feature.

  4. No, SURF is good for comparing textured objects, but not for shape and color. For shape, I recommend using MSER (but not the OpenCV one) or the Canny edge detector rather than local features. This presentation might be useful.
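Points 1 and 2 above can be sketched together in a few lines of NumPy. This is a toy illustration, not OpenCV's actual implementation: the response stack stands in for one octave of the DoG/Hessian pyramid, and the threshold plays the role of the 500 passed to SurfFeatureDetector:

```python
import numpy as np

def local_maxima(stack, threshold=0.0):
    """Find 3x3x3 local maxima in a (scale, y, x) response stack.

    A pixel is a keypoint candidate when its response exceeds the
    threshold and all 26 neighbours: 8 in its own level plus 9 each
    in the levels above and below (the 3x3x3 neighbourhood).
    """
    points = []
    s, h, w = stack.shape
    for i in range(1, s - 1):
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                v = stack[i, y, x]
                if v <= threshold:
                    continue
                cube = stack[i-1:i+2, y-1:y+2, x-1:x+2]
                # v must be the strict maximum of the 27-voxel cube
                if v >= cube.max() and (cube == v).sum() == 1:
                    points.append((i, y, x, v))
    return points
```

Raising the threshold prunes weak responses, which is exactly why a bigger value gives fewer but more stable features.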
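Regarding point 3: in OpenCV the structure is cv::KeyPoint (cv2.KeyPoint in Python), whose main fields are pt, size, angle, response, and octave. As a sketch only (a plain-Python stand-in, not the real class), the information a keypoint carries looks like this; the numbers are invented:

```python
from dataclasses import dataclass

@dataclass
class KeyPoint:
    # Mirrors the main fields of OpenCV's cv::KeyPoint (a sketch, not the real class)
    x: float          # keypoint center, pixel coordinates
    y: float
    size: float       # diameter of the meaningful neighbourhood (the circle)
    angle: float      # dominant orientation in degrees, -1 if not computed
    response: float   # detector response (the Hessian value for SURF)
    octave: int       # pyramid level the keypoint was detected on

kp = KeyPoint(x=120.5, y=64.0, size=14.0, angle=87.3, response=812.4, octave=1)
```

The 64-dimensional SURF descriptor is not stored in the keypoint itself; it lives in a separate descriptor matrix, one row per keypoint.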
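On point 4, one classic way to compare shapes rather than textures is via image moments. As an illustration (plain NumPy, one invariant only; OpenCV's cv2.HuMoments computes all seven), the first Hu moment is roughly constant under translation and scaling of a binary mask but differs between shapes:

```python
import numpy as np

def hu1(mask):
    # First Hu moment (eta20 + eta02): a single translation/scale/rotation-
    # invariant shape number computed from a binary object mask.
    ys, xs = np.nonzero(mask)
    m00 = len(xs)                       # object area in pixels
    cx, cy = xs.mean(), ys.mean()       # centroid
    mu20 = ((xs - cx) ** 2).sum()       # second-order central moments
    mu02 = ((ys - cy) ** 2).sum()
    # Normalized central moments: eta_pq = mu_pq / m00**((p+q)/2 + 1)
    return (mu20 + mu02) / m00**2
```

A small square and a scaled-up copy give nearly the same value, while an elongated bar gives a much larger one, so the number separates shapes regardless of size.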

old-ufo
  • Which z? First, everything is done by the feature detector. Second, a SURF feature is an oriented circle. It has the parameters x, y, scale (radius), and angle (orientation). – old-ufo Nov 08 '13 at 18:49
  • Sorry for bothering you again, but I have some doubts popping up. (1) The (x,y) that you mentioned – are these the coordinates of the features or the coordinates of the pixels? (2) The feature set is a 64-dimensional vector according to the documentation, but you mentioned it consists of 4 variables? I still have doubts as to what constitutes the features. It would be extremely helpful if you could elaborate on your answer. – Srishti M Nov 22 '13 at 23:29
  • 1
    1. Coordinates of the pixel. 2. You mix two things - feature (which is oriented circle == geometrical shape, not in image, but kind of selection) and feature descriptor (which is 64-d vector, describing image patch, within mentioned circle.) – old-ufo Nov 23 '13 at 09:15