
I've used sklearn's cosine_similarity function before, which receives a matrix and returns a matrix where m[i,j] represents the similarity of element i to element j. I need to compute the cosine similarity of a single vector to every row in a matrix. Is there an easy way to do this?

My desired output is a vector where each element, i, represents the similarity between the vector and matrix row i.

For additional context, my matrix has over 4 million rows, so when I tried cosine_similarity(matrix) it raised MemoryError: Unable to allocate 173. TiB for an array with shape (4872569, 4872569) and data type float64.
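To make the desired output concrete, here is a minimal NumPy sketch of the one-vector-against-every-row computation. The function name `cosine_to_rows` and the toy data are made up for illustration, and treating all-zero rows as similarity 0 is an assumption, not something stated above. The result has one float per matrix row, so for 4872569 rows it would be roughly 39 MB as float64 instead of the full pairwise matrix.

```python
import numpy as np


def cosine_to_rows(vector, matrix):
    """Cosine similarity between one vector and each row of a 2-D matrix.

    Hypothetical helper for illustration. Returns a 1-D array of length
    matrix.shape[0]. Rows with zero norm get similarity 0 (an assumption).
    """
    vector = np.asarray(vector, dtype=float)
    matrix = np.asarray(matrix, dtype=float)

    # Dot product of every row with the vector: shape (n_rows,)
    dots = matrix @ vector

    # Product of the row norms and the vector norm: shape (n_rows,)
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(vector)

    # Divide only where the norm product is non-zero; other entries stay 0
    sims = np.zeros_like(dots)
    np.divide(dots, norms, out=sims, where=norms > 0)
    return sims


# Toy data: 4 rows, 2 columns
toy_matrix = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
toy_vector = np.array([1.0, 0.0])
toy_sims = cosine_to_rows(toy_vector, toy_matrix)
print(toy_sims.shape)
```

sklearn's cosine_similarity also accepts a second argument, so cosine_similarity(vector.reshape(1, -1), matrix) should give the same values as a 1 x n array, which can be flattened with .ravel().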

jbuddy_13
  • This answer might help you: https://stackoverflow.com/questions/40900608/cosine-similarity-on-large-sparse-matrix-with-numpy – Carlos Melus Feb 05 '21 at 19:44

0 Answers