I have a set of 30,000 documents, each represented by a vector of 100 floats. I can measure the similarity of two documents by computing the cosine similarity between their vectors. The problem is that it takes too much time to find the most similar documents. Is there an algorithm that can help me speed this up?
EDIT
Right now, my code just computes the cosine similarity between the first vector and all the others. It takes about 3 seconds, and I would like to speed it up ;) The algorithm doesn't have to be exact, but it should give results similar to a full search.
The elements of each vector sum to 1.
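import time

start = time.time()
first = allVectors[0]
# Compare the first vector against every other vector
for vec in allVectors[1:]:
    cosine_measure(vec[1:], first[1:])  # [1:] skips element 0 of each vector
print(str(time.time() - start))  # elapsed time in seconds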
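For reference, here is a minimal sketch of the full search described above, assuming `cosine_measure` is the standard cosine similarity (its definition is not shown in the question). The `cosine_to_all` helper and the synthetic `data` array are hypothetical names introduced only for illustration. The helper does the same one-against-all comparison as a single NumPy matrix-vector product instead of a Python loop. It is still an exact full search, not an approximate algorithm.

```python
import numpy as np


def cosine_measure(a, b):
    """Cosine similarity of two 1-D vectors (assumed definition)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def cosine_to_all(query, matrix):
    """Cosine similarity of `query` against every row of `matrix` at once."""
    matrix = np.asarray(matrix, dtype=float)
    query = np.asarray(query, dtype=float)
    # Product of the row norms and the query norm, one value per row
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(query)
    return matrix.dot(query) / norms


# Synthetic stand-in for allVectors: 1000 vectors of 100 floats
rng = np.random.RandomState(0)
data = rng.rand(1000, 100)
data /= data.sum(axis=1, keepdims=True)  # each row sums to 1, as in the question

sims = cosine_to_all(data[0], data[1:])  # first vector against all others
best = int(np.argmax(sims)) + 1          # +1 because row 0 was excluded
```

With this, `sims[i]` should match `cosine_measure(data[i + 1], data[0])` up to floating-point rounding, and `best` is the index in `data` of the most similar vector.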