Bag of Visual Words (obtained from features) for CBIR. Steps?

Question

I'm very confused about the steps to follow to use BOVW for CBIR. I found a lot of literature about classification, machine learning and SVM, but that is not quite what I'm looking for.
My problem is finding similar images in a database, given a query image.

My steps so far:

1. Extract features (for example ORB, BRISK, SIFT...).
2. Store all images' features to disk.
3. Read the features and run k-means to obtain the centroids (my vocabulary, right?).
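To make the steps so far concrete, here is a minimal sketch of the vocabulary step. It is written in plain Python so it runs without dependencies; the `kmeans` helper, the random 8-dimensional float descriptors and the image names are all hypothetical stand-ins. In practice something like `sklearn.cluster.KMeans` would replace the helper, and real descriptors would be read back from disk.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, descriptor):
    """Index of the centroid closest to the descriptor."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(centroids[i], descriptor))


def kmeans(descriptors, k, iterations=20, seed=0):
    """Plain Lloyd's k-means (hypothetical helper); returns k centroids."""
    rng = random.Random(seed)
    # Initialise the centroids with k descriptors picked at random.
    centroids = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(centroids, d)].append(d)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


# Assumption: stand-in for the descriptors read back from disk,
# three images with 50 random 8-dimensional descriptors each.
rng = random.Random(42)
image_descriptors = {
    name: [[rng.random() for _ in range(8)] for _ in range(50)]
    for name in ("img_a", "img_b", "img_c")
}

# The vocabulary is learned once, from the descriptors of all images pooled.
all_descriptors = [d for descs in image_descriptors.values() for d in descs]
vocabulary = kmeans(all_descriptors, k=5)
print(len(vocabulary), len(vocabulary[0]))
```

The sketch treats descriptors as float vectors with Euclidean distance, which is an assumption that fits SIFT-like descriptors better than binary ones such as ORB or BRISK.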

And now I'm stuck. I found many different ways to proceed.

This is my hypothesis:

4. For each k-means centroid, compute the nearest neighbour (FLANN?).
5. Build a histogram from the set of nearest neighbours.
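One common reading of these two hypothesised steps is: assign every descriptor of an image to its nearest centroid (the vector quantization), count the assignments into one histogram per image, and compare the query histogram against the stored ones. The sketch below illustrates that reading only; the tiny 2-D vocabulary, the descriptors, the helper names and the choice of cosine similarity are all assumptions made for illustration. A brute-force search stands in for FLANN.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bovw_histogram(descriptors, vocabulary):
    """Assign each descriptor to its nearest visual word and count them."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        word = min(range(len(vocabulary)),
                   key=lambda i: squared_distance(vocabulary[i], d))
        hist[word] += 1.0
    total = sum(hist)
    # Normalise so images with different numbers of features are comparable.
    return [h / total for h in hist] if total else hist


def cosine_similarity(a, b):
    """Cosine of the angle between two histograms (0.0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Assumption: a toy vocabulary of three visual words in 2-D.
vocabulary = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

# Assumption: toy descriptors for two database images.
database = {
    "img_a": [[0.1, 0.0], [0.9, 0.1], [1.0, 0.0], [0.8, 0.0]],
    "img_b": [[0.0, 0.9], [0.1, 1.0], [0.0, 0.1], [0.1, 0.8]],
}

# Index: one fixed-length histogram per image, all built on the same vocabulary.
index = {name: bovw_histogram(descs, vocabulary)
         for name, descs in database.items()}

# Query: quantize the query image the same way, then rank by similarity.
query = [[0.9, 0.0], [1.0, 0.1], [0.0, 0.1]]
query_hist = bovw_histogram(query, vocabulary)
ranking = sorted(index,
                 key=lambda name: cosine_similarity(query_hist, index[name]),
                 reverse=True)
print(ranking)
```

In this reading there is a single shared vocabulary rather than one dictionary per image, and the quantization is what turns a variable number of descriptors into a fixed-length vector that can be compared directly.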

Do I also have to extract a dictionary for every single image and then index the images?
Why is vector quantization (steps 4 and 5) necessary?

Can you suggest a possible way to proceed, or any article or tutorial on the topic?

NOTE: For the implementation of BOVW I cannot use OpenCV, because it does not work with binary descriptors, so I need to try with the sklearn library.

Possible duplicate of [euclidean distance in sift](https://stackoverflow.com/questions/4357352/euclidean-distance-in-sift) — desertnaut, Feb 22 '18 at 15:31

score 0 · Accepted Answer · answered Feb 02 '18 at 14:48

Ok, this is pretty much what I was looking for:

https://stackoverflow.com/a/8549874/8894489

Hope this is helpful for someone.

Furin
