I have a very large CSV file of pairwise similarities between keywords: about 91 million rows covering about 50,000 unique keywords, so a for loop over it takes too long in R. When I read it into a data.frame, it looks like:
> df
kwd1 kwd2 similarity
a b 1
b a 1
c a 2
a c 2
The list is sparse, and I can convert it into a sparse matrix using sparseMatrix() from the Matrix package:
> myMatrix
a b c
a . 1 2
b 1 . .
c 2 . .
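For reference, a minimal sketch of that construction on the toy data, assuming the Matrix package. The index vectors i and j and the kwds vocabulary are illustrative names, not necessarily what is used on the full file:

```r
library(Matrix)

# Toy version of the edge list shown above
df <- data.frame(
  kwd1 = c("a", "b", "c", "a"),
  kwd2 = c("b", "a", "a", "c"),
  similarity = c(1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# One shared keyword vocabulary so rows and columns use the same index
kwds <- sort(unique(c(df$kwd1, df$kwd2)))
i <- match(df$kwd1, kwds)
j <- match(df$kwd2, kwds)

# Build the sparse matrix; unlisted pairs are stored implicitly as zero
myMatrix <- sparseMatrix(
  i = i, j = j, x = df$similarity,
  dims = c(length(kwds), length(kwds)),
  dimnames = list(kwds, kwds)
)
myMatrix
```

Note that sparseMatrix() sums the values of duplicated (i, j) pairs by default, so this sketch assumes each ordered keyword pair appears at most once in the file.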
However, I would now like to convert this into a dist object. I tried as.dist(myMatrix), but it failed with an error saying the 'problem was too large' for as.dist(). I also tried reducing the sparse matrix to its lower triangle first and then converting that to a dist object (thinking this might be better), using myMatrix = myMatrix * lower.tri(myMatrix), but I got the same error, this time from the lower.tri() call. Is there a way to build a dist object from a sparse matrix of this size?
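A small sketch of both attempts on a 3 x 3 toy matrix, where they work, followed by rough size arithmetic for the real case. This assumes the Matrix package. The explicit as.matrix() call and the tril() line are additions for illustration and not part of the attempts described above; tril() is only shown as a way to take the lower triangle without leaving sparse storage.

```r
library(Matrix)

# Toy matrix matching the 3 x 3 example in this question
kwds <- c("a", "b", "c")
myMatrix <- sparseMatrix(
  i = c(1, 2, 3, 1), j = c(2, 1, 1, 3), x = c(1, 1, 2, 2),
  dims = c(3, 3), dimnames = list(kwds, kwds)
)

# Attempt 1: works at this size, but goes through a dense n x n matrix
d <- as.dist(as.matrix(myMatrix))

# Attempt 2: lower.tri() itself builds a dense n x n logical mask
lowerDense <- myMatrix * lower.tri(myMatrix)

# Illustrative alternative (assumption): tril() keeps sparse storage
lowerSparse <- tril(myMatrix, k = -1)

# Rough sizes for the real problem with n = 50,000 keywords
n <- 50000
n_pairs <- n * (n - 1) / 2    # entries in a dist object: 1,249,975,000
gb <- n_pairs * 8 / 1e9       # stored as doubles: just under 10 GB
too_big <- n^2 > .Machine$integer.max  # dense n x n exceeds 2^31 - 1 cells
```

So even if the conversion itself succeeded, the resulting dist object would hold about 1.25 billion doubles, since a dist object stores every pair explicitly, including the zero entries.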
Thanks for any help!