I am trying to scale "matching_to_many_images.cpp" to a large set of images (20K+ images): https://github.com/kipr/opencv/blob/master/samples/cpp/matching_to_many_images.cpp
I use a FLANN-based matcher to match the images (with SURF keypoint and descriptor extractors). I am trying to follow the method described in this paper (section "Searching on a Compute Cluster"): http://www.cs.ubc.ca/research/flann/uploads/FLANN/flann_pami2014.pdf
I have a training image set C containing n images in total: C = {B(1), ..., B(n)}.
I divide the set C into N "buckets", where each bucket contains (n/N) images. For each bucket I perform "detectKeyPoints", "computeDescriptors" and "trainMatcher" separately. This means I have a separate "DescriptorMatcher" for each image bucket, so N DescriptorMatchers in total (a rough sketch of this step follows below).
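Here is roughly what that training step looks like. This is only a sketch using the OpenCV 2.4-style API from the linked sample; the `Bucket` struct, the round-robin assignment of images to buckets, and the variable names are illustrative assumptions, not my exact code:

```cpp
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/nonfree/nonfree.hpp>   // SURF lives in the nonfree module
#include <vector>

using namespace cv;

struct Bucket {
    std::vector<int> globalImageIdx;     // local (per-bucket) image index -> global image index
    Ptr<DescriptorMatcher> matcher;      // one FLANN matcher per bucket
};

std::vector<Bucket> trainBuckets(const std::vector<Mat>& trainImages, int N)
{
    initModule_nonfree();                                     // register SURF
    Ptr<FeatureDetector> detector = FeatureDetector::create("SURF");
    Ptr<DescriptorExtractor> extractor = DescriptorExtractor::create("SURF");

    std::vector<Bucket> buckets(N);
    for (size_t i = 0; i < trainImages.size(); ++i)
    {
        Bucket& b = buckets[i % N];                           // assumption: round-robin bucket assignment
        if (b.matcher.empty())
            b.matcher = DescriptorMatcher::create("FlannBased");

        std::vector<KeyPoint> keypoints;
        Mat descriptors;
        detector->detect(trainImages[i], keypoints);
        extractor->compute(trainImages[i], keypoints, descriptors);

        b.matcher->add(std::vector<Mat>(1, descriptors));     // position in the bucket = local image index
        b.globalImageIdx.push_back((int)i);                   // remember the global image index
    }
    for (int k = 0; k < N; ++k)
        buckets[k].matcher->train();                          // build one FLANN index per bucket
    return buckets;
}
```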
Then, for the query image, I perform "detectKeyPoints" and "computeDescriptors", and then perform "match" against each of the N DescriptorMatchers.
Finally, I receive a DMatch list from each DescriptorMatcher, map the local (per-bucket) image indices to the global image index, and calculate the number of matching descriptors per image. The larger this number, the closer the image is to the query image. (A sketch of this query/reduce step follows below.)
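This is roughly how I understand the query and "reduce" step. It continues the sketch above (same headers, same `Bucket` struct and assumptions, plus `<map>`): match the query descriptors against every bucket, map each DMatch::imgIdx (which is local to its bucket) back to a global image index, and accumulate a vote count per image:

```cpp
#include <map>

std::map<int, int> matchQuery(const Mat& queryImage, std::vector<Bucket>& buckets)
{
    Ptr<FeatureDetector> detector = FeatureDetector::create("SURF");
    Ptr<DescriptorExtractor> extractor = DescriptorExtractor::create("SURF");

    std::vector<KeyPoint> keypoints;
    Mat queryDescriptors;
    detector->detect(queryImage, keypoints);
    extractor->compute(queryImage, keypoints, queryDescriptors);

    std::map<int, int> votesPerGlobalImage;                   // global image index -> match count
    for (size_t k = 0; k < buckets.size(); ++k)
    {
        std::vector<DMatch> matches;
        buckets[k].matcher->match(queryDescriptors, matches); // match against this bucket's train collection
        for (size_t m = 0; m < matches.size(); ++m)
        {
            int globalIdx = buckets[k].globalImageIdx[matches[m].imgIdx];
            votesPerGlobalImage[globalIdx]++;                  // "reduce": accumulate votes across buckets
        }
    }
    return votesPerGlobalImage;                                // largest count = best candidate image
}
```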
I ran this with N=1, which gives the correct result, but when I increase N (> 1) I no longer get the correct matching result.
My questions are:
1) Am I doing the correct steps according to the paper? I am trying to understand how the "reduce" step is done as described in the paper.
2) There are two factors I can extract from the DMatch objects: the "distance" and the total number of matches per image. How can I use these two factors to find the closest matching image?
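For clarity, this is how I read those two quantities out of a bucket's DMatch list. This is only an illustration; the `ImageScore` struct and function name are made up for this example, and `globalImageIdx` is the same local-to-global mapping as in the sketches above:

```cpp
#include <opencv2/features2d/features2d.hpp>
#include <map>
#include <vector>

struct ImageScore {
    int   matchCount;      // "number of total matches per image"
    float distanceSum;     // sum of DMatch::distance (lower distance = better descriptor match)
    ImageScore() : matchCount(0), distanceSum(0.f) {}
};

// Accumulate scores for one bucket's DMatch list; globalImageIdx maps the
// bucket-local DMatch::imgIdx back to the global image index.
void accumulateScores(const std::vector<cv::DMatch>& matches,
                      const std::vector<int>& globalImageIdx,
                      std::map<int, ImageScore>& scores)
{
    for (size_t m = 0; m < matches.size(); ++m)
    {
        ImageScore& s = scores[globalImageIdx[matches[m].imgIdx]];
        s.matchCount  += 1;
        s.distanceSum += matches[m].distance;
    }
}
```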