In general, you cannot. In general (though not in this case) the computation time will not change, because the work is physically there: no matter what accounting tricks you use to rewrite it (a single for loop that acts like two for loops, a recursive implementation, etc.), the work must still be done regardless of the order in which you do it.
> Is there a way to get rid of the for loops? Even if it slows down the run time a bit. – Cat_Smithyyyy03
Your case is special, though. While one normally has no reason to think a speedup is possible, you are effectively asking about tensor products, or in this case a matrix product. When you think about a general matrix multiplication AB = C, each entry of C is a dot product: C collects a·b for every vector a in the rows of A and every vector b in the columns of B.
So, for your case: first normalize all your vectors (so the normalization is not required later), stack them as the rows of A, let B = transpose(A), and then do a single matrix multiply:
[[a0 a1 a2]     [[a0 b0 ... z0      [[aa ab ac ... az
 [b0 b1 b2]       a1 b1 ... z1       ba bb bc ... bz
 [   ...  ]   x   a2 b2 ... z2]]  =   ...
 [z0 z1 z2]]                          za zb zc ... zz]]

(here "ab" denotes the dot product of vectors a and b, and so on)
Interestingly, you can now plug this into a fast matrix-multiply algorithm, which can be asymptotically faster than the naive O(dim * #vecs^2). Ideally the implementation is also optimized for the A times transpose(A) case, which produces a symmetric output and can save roughly a factor of 2 of the work (this is what BLAS's syrk routines do; a hypothetical interface might expose it as a flag like matrixmult(a, b, outputWillBeSymmetric)).
This is, unintuitively, faster than "O(N^2)": the rewrite exposes a substructure in the problem which can be leveraged to go faster than O(dim * #vecs^2). The exploitable substructure is precisely the fact that you are computing the outer product of THE SAME set of vectors with itself, and a fast matrix-multiply algorithm can leverage that.
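A minimal sketch of this in numpy (assuming `vectors` is an (N, d) float array with one vector per row, and that your dissimilarity is based on the normalized dot product, i.e. cosine; adapt to your actual metric):

import numpy as np

vectors = np.random.rand(1000, 64)   # stand-in data for illustration
A = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)  # normalize each row once
S = A @ A.T    # S[i, j] = dot product of normalized vectors i and j (cosine similarity)
D = 1.0 - S    # one common way to turn a similarity into a dissimilarity

Even without any asymptotic win, the single `A @ A.T` call dispatches to optimized BLAS code, which is typically orders of magnitude faster than a Python double loop over all pairs.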
edit: my original answer was wrong
You have a set of size N and you wish to compute f(a,b) for all a and b in the set.
Unless you know some values are trivial, there is no way to be asymptotically faster than O(N^2), because you have to assume the worst case: every pair f(a,b) may be distinct, so there is no way to do less than roughly N^2 work.
However, since your function f is symmetric, you can do half the work and then mirror it:

N = len(vectors)
s = np.empty((N, N))          # allocate the output (assumes numpy imported as np, as above)
for i in range(N):
    for v in range(i, N):     # only visit the upper triangle (including the diagonal)
        dissim = ...          # your dissimilarity metric f(vectors[i], vectors[v])
        s[i, v] = dissim
        s[v, i] = dissim      # mirror into the lower triangle, since f(a, b) == f(b, a)
(You can avoid calculating your metric in the reflexive case f(a,a), because it's trivial. That doesn't change the asymptotics, though: the fraction of such work, N/N^2, tends to zero as N increases, so it is only a worthwhile optimization if you are working with a very small number of vectors.)
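For instance, if the trivial reflexive value is 0 (an assumption; use whatever f(a, a) actually is for your metric), you can start the inner loop above at i + 1 instead of i and set the diagonal in one call:

# assuming f(a, a) == 0: skip the diagonal in the loop (use `for v in range(i + 1, N)`)
# and then fill it in directly
np.fill_diagonal(s, 0.0)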
Whether you should optimize further depends on whether you need to. Such code should be able to easily handle millions of small vectors. Next steps:
- Is there something fishy that is making my code very slow? What could it be? We don't have enough information to comment.
- If there is nothing fishy of the sort, you can try rephrasing the computation as matrix operations, in order to stay inside numpy's optimized C routines as much as possible instead of bouncing back and forth into Python for every pair (a sketch follows after this list). This is ugly and I would avoid it unless you need the speed, because your code's readability will decrease.
- If you are dealing with hundreds of millions of vectors, perhaps consider a more cache-friendly approach where you do
  BLOCK = 10**6
  for blockI in range(N // BLOCK):
      for blockV in range(N // BLOCK):
          for i in range(blockI * BLOCK, (blockI + 1) * BLOCK):
              for v in range(blockV * BLOCK, (blockV + 1) * BLOCK):
                  ...  # compute s[i, v] as in the loop above
- If you're dealing with billions of vectors, then look into leveraging the GPU (GPGPU). This kind of embarrassingly parallel pairwise computation is quite ideal for the GPU and might give a speedup of a factor of thousands.
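As a sketch of the second bullet ("rephrasing as matrix operations"), here is one way to compute pairwise Euclidean distances (an assumed metric; yours may differ) entirely inside numpy, using the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a·b:

import numpy as np

vectors = np.random.rand(1000, 64)                  # stand-in data for illustration
sq = np.einsum('ij,ij->i', vectors, vectors)        # ||a||^2 for every row
d2 = sq[:, None] + sq[None, :] - 2.0 * (vectors @ vectors.T)
np.maximum(d2, 0.0, out=d2)                         # clamp tiny negatives from round-off
dist = np.sqrt(d2)                                  # dist[i, j] == ||vectors[i] - vectors[j]||

All of the per-pair arithmetic stays inside numpy's C/BLAS routines; Python is only involved a constant number of times rather than N^2 times, which is where the practical speedup comes from.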