Say I have a matrix `mat`, a 100 x 200 array.
My question is twofold:
1. How can I compute the cosine similarity of the first row against all the other rows? I tried using `sklearn`'s `cosine_similarity` function, but passing in a 100 x 200 matrix gives me a 100 x 100 array (instead of a 100 x 1 array).
2. If I wanted to compute the cosine similarities of all the rows against each other, i.e. all 100 C 2 = 4950 distinct pairs of rows, would it be fastest not to use something like `sklearn`, but to store the norm of each row via `np.linalg.norm` and then compute each similarity as `cos_sim = dot(a, b)/(norm(a)*norm(b))`?
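
To make the manual approach concrete, here is a minimal NumPy sketch of what I have in mind, covering both the first-row case and the all-pairs case. The random `mat` is placeholder data, the variable names (`norms`, `first_vs_all`, `unit`, `sim`, `pair_sims`) are just illustrative, and it assumes no row is all zeros (otherwise the division would fail).

```python
import numpy as np

# Placeholder data standing in for the real 100 x 200 matrix.
rng = np.random.default_rng(0)
mat = rng.standard_normal((100, 200))

# Store the Euclidean norm of every row once; shape (100,).
# Assumes no row is all zeros, otherwise we would divide by zero below.
norms = np.linalg.norm(mat, axis=1)

# Case 1: first row against every row.
# mat @ mat[0] is the dot product of each row with row 0; shape (100,).
first_vs_all = (mat @ mat[0]) / (norms * norms[0])

# Case 2: every row against every row.
# Normalise each row to unit length, then one matrix product gives
# the full 100 x 100 similarity matrix.
unit = mat / norms[:, None]
sim = unit @ unit.T

# The matrix is symmetric with ones on the diagonal, so the distinct
# pairs are the entries strictly above the diagonal: 100 C 2 = 4950 values.
i, j = np.triu_indices(mat.shape[0], k=1)
pair_sims = sim[i, j]
```

Here `first_vs_all` has shape `(100,)`; `first_vs_all[:, None]` would give the 100 x 1 layout I was expecting. Whether this is actually faster than `sklearn` is exactly what I am unsure about.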