Say I have a set of 10,000 images that I'd like to classify based on similarity. A number of people have recommended that comparing histograms is a cheap way to measure similarity. This thread, for example, recommends using 6 histograms for each comparison.

If I compare each image's histogram with those of all other images in the set, that's O(n^2): 10,000 * 9,999 / 2, or roughly 50 million pairwise comparisons in all, which is very slow. How can I speed this up?

Community

1 Answer

Hash each histogram in some way, make a sorted list of the hashes, and find adjacent values that are similar (within some limit). Then compare only those histograms in full.

However, making the histograms is likely to be the slow step.
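
As a rough sketch of the idea, and not a tuned implementation: the "hash" below is just the centroid (mean bin index) of the normalized histogram, and the full comparison is an L1 distance. Both are assumptions chosen for illustration, as are the names `scalar_key`, `candidate_pairs`, `similar_pairs` and the two limits. Any other locality-preserving key or histogram distance could be swapped in.

```python
from typing import List, Sequence, Tuple


def normalize(hist: Sequence[float]) -> List[float]:
    """Scale a histogram so its bins sum to 1 (left unchanged if empty)."""
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)


def scalar_key(hist: Sequence[float]) -> float:
    """Illustrative 'hash': the mean bin index of the normalized histogram."""
    return sum(i * p for i, p in enumerate(normalize(hist)))


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Full comparison: sum of absolute bin differences after normalizing."""
    return sum(abs(x - y) for x, y in zip(normalize(a), normalize(b)))


def candidate_pairs(hists: Sequence[Sequence[float]],
                    key_limit: float) -> List[Tuple[int, int]]:
    """Sort by key and pair up only entries whose keys differ by <= key_limit."""
    keyed = sorted((scalar_key(h), i) for i, h in enumerate(hists))
    pairs = []
    for pos in range(len(keyed)):
        key, i = keyed[pos]
        for nxt in range(pos + 1, len(keyed)):
            other_key, j = keyed[nxt]
            if other_key - key > key_limit:
                break  # list is sorted, so every later key is even further away
            pairs.append((min(i, j), max(i, j)))
    return pairs


def similar_pairs(hists: Sequence[Sequence[float]],
                  key_limit: float,
                  dist_limit: float) -> List[Tuple[int, int]]:
    """Run the expensive comparison on the candidate pairs only."""
    return [(i, j) for i, j in candidate_pairs(hists, key_limit)
            if l1_distance(hists[i], hists[j]) <= dist_limit]


if __name__ == "__main__":
    # Four toy 4-bin histograms: two dark-heavy, two bright-heavy.
    toy = [[10, 0, 0, 0], [9, 1, 0, 0], [0, 0, 0, 10], [0, 0, 1, 9]]
    print(similar_pairs(toy, key_limit=0.5, dist_limit=0.5))
```

The sort costs O(n log n), and the number of full comparisons depends on how many keys fall within `key_limit` of each other. A single scalar key is lossy, so very different histograms can still land next to each other in the sorted list, which is why the full comparison is kept as a second pass.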

Martin Beckett