Say I have a set of 10,000 images that I'd like to group by similarity. A number of people have recommended comparing histograms as a cheap way to measure similarity. This thread, for example, recommends using 6 histograms for each comparison.
If I compare each image's histogram with those of all the other images in the set, that's n(n-1)/2 = 10,000 * 9,999 / 2 (about 50 million) comparisons in all, i.e. O(n^2), which is very slow. How can I speed this up?
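
To make the brute-force baseline concrete, here is a minimal sketch of what I mean. It assumes each image is just a non-empty flat list of 8-bit grayscale values, uses a single 8-bin histogram per image, and uses histogram intersection as the similarity measure. The function names (`histogram`, `intersection`, `pairwise_similarities`, `num_pairs`) are my own placeholders, not from any library, and the bin count and metric are illustrative choices rather than what the thread recommends.

```python
from itertools import combinations


def histogram(pixels, bins=8, max_value=256):
    """Normalized intensity histogram of a non-empty flat list of
    pixel values in [0, max_value). (Assumed image representation.)"""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = len(pixels)
    return [c / total for c in counts]


def intersection(h1, h2):
    """Histogram intersection of two normalized histograms:
    1.0 when they are identical, 0.0 when they share no bins."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def num_pairs(n):
    """Number of unordered pairs among n images: n(n-1)/2."""
    return n * (n - 1) // 2


def pairwise_similarities(images):
    """Brute force: compute each histogram once, then compare every pair."""
    hists = [histogram(img) for img in images]
    return {
        (i, j): intersection(hists[i], hists[j])
        for i, j in combinations(range(len(hists)), 2)
    }


if __name__ == "__main__":
    # Three tiny synthetic "images" (hypothetical data).
    images = [
        [0, 0, 255, 255],
        [0, 0, 255, 255],
        [128, 128, 128, 128],
    ]
    for pair, score in pairwise_similarities(images).items():
        print(pair, score)
    print("pairs:", num_pairs(len(images)))
```

Computing each histogram only once keeps that part linear in the number of images; it is the pairwise loop, which grows as n(n-1)/2, that I'd like to avoid or shrink.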