I'm wondering how to compute text similarity in R for words stored in a data frame. The code below works perfectly when I pass the texts to compare directly, but I'm struggling to get it to work with the words contained in my data frame. I want to compare all pairs of words in the Word column of Data. Any ideas? Thanks in advance.

library(textcat)

?textcat_xdist

round(
  textcat_xdist(
    list(
      text1 = "hello there",
      text2 = "why hello there",
      text3 = "totally different"
    ),
    method = "cosine"
  ),
  3
)


Data <- data.frame(
  X = sample(1:4),
  Word = sample(c("hello", "hellow", "hellloooo", "different"), 4, replace = TRUE)
)
  • Please show us (a portion of) your data.frame. – r2evans Apr 19 '15 at 16:12
  • We need an example that [reproduces](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) the problem you are having, not one that works. What specifically do you mean when you say "can't get it to work"? – MrFlick Apr 19 '15 at 16:14
  • Thanks for the reply. I've added more details to my question. – user3132770 Apr 19 '15 at 19:46
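
For what it's worth, here is a minimal sketch of one way the pairwise comparison could be done. It assumes that `textcat_xdist()` accepts a plain character vector (its help page describes `x` as a character vector of texts or something coercible with `as.character()`), and that the result can be treated as a square numeric matrix. The names `words`, `d`, `idx` and `pairs` are just illustrative, and the example data frame uses fixed values instead of `sample()` so the output is reproducible.

```r
library(textcat)

# Fixed example data (no sample()) so the result is the same on every run.
Data <- data.frame(
  X = 1:4,
  Word = c("hello", "hellow", "hellloooo", "different"),
  stringsAsFactors = FALSE
)

# Compare each distinct word once; as.character() guards against factor columns.
words <- unique(as.character(Data$Word))

# Cross-distance matrix between all words (smaller distance = more similar).
# Assumption: a character vector is a valid first argument to textcat_xdist().
d <- as.matrix(textcat_xdist(words, method = "cosine"))
dimnames(d) <- list(words, words)
print(round(d, 3))

# Long format: one row per unordered pair, taken from the upper triangle.
idx <- which(upper.tri(d), arr.ind = TRUE)
pairs <- data.frame(
  word1 = words[idx[, "row"]],
  word2 = words[idx[, "col"]],
  distance = d[idx]
)
print(pairs[order(pairs$distance), ])
```

The `unique()` call matters when the column contains repeated words, as it can with `sample(..., replace = TRUE)`; without it the matrix would contain duplicate rows and columns. Note also that the profiles are built from character n-grams, so distances between very short single words may be coarse.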
