Text similarity in R

Asked Apr 19 '15 at 16:10

Active Apr 19 '15 at 19:49


I'm wondering how to compute text similarity in R for words stored in a data frame. The code below works perfectly when I type the strings to compare directly into the call, but I'm struggling to get it to work with the words contained in the `Word` column of my data frame. I want to compare all pairs of words in that column. Any ideas? Thanks in advance.

library(textcat)

?textcat_xdist

round(
  textcat_xdist(
    list(
      text1 = "hello there",
      text2 = "why hello there",
      text3 = "totally different"
    ),
    method = "cosine"
  ),
  3
)


Data <- data.frame(
  X = sample(1:4),
  Word = sample(c("hello", "hellow", "hellloooo", "different"), 4, replace = TRUE)
)

edited Apr 19 '15 at 19:49

asked Apr 19 '15 at 16:10

user3132770

Please show us (a portion of) your data.frame. – r2evans Apr 19 '15 at 16:12

We need an example that [reproduces](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) the problem you are having, not one that works. What specifically do you mean when you say "can't get it to work"? – MrFlick Apr 19 '15 at 16:14
Thanks for the reply. I've added more details to my question. – user3132770 Apr 19 '15 at 19:46
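
One possible way to approach the question above, as a minimal sketch rather than a definitive answer: pull the `Word` column out as a character vector, name each element, and pass the result in the same named-list shape that the working call in the question uses. The sketch assumes that `textcat_xdist` treats each list element as a separate text, as it appears to in the question's own example. The variable names `words`, `lev` and `cos_dist` are made up for illustration, and the words are fixed rather than sampled so that the result is reproducible. Because character n-gram profiles of single short words may not be very informative, the sketch also computes plain edit distance with base R's `adist` as a swapped-in alternative; that is a different measure, not the cosine distance asked about.

```r
# Example data, modelled on the question but with fixed (unsampled) words.
Data <- data.frame(
  X = 1:4,
  Word = c("hello", "hellow", "hellloooo", "different"),
  stringsAsFactors = FALSE
)

# Character vector of the words, with each element named by the word itself.
words <- setNames(as.character(Data$Word), Data$Word)

# Alternative measure (base R): pairwise Levenshtein edit distance.
# adist() returns a square matrix when given a single vector.
lev <- adist(words)
dimnames(lev) <- list(words, words)
print(lev)

# Same call shape as in the question, but built from the data frame column.
# Only runs if the textcat package is installed.
if (requireNamespace("textcat", quietly = TRUE)) {
  cos_dist <- textcat::textcat_xdist(as.list(words), method = "cosine")
  print(round(cos_dist, 3))
}
```

Both results are square matrices with one row and one column per word, so the value for a given pair can be looked up by name, for example `lev["hello", "hellow"]`. If the column contains repeated words, as `sample(..., replace = TRUE)` in the question can produce, calling `unique()` on the column first would avoid duplicated row and column names.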
