
Hello everyone, and thanks for your help. I am using R 3.1.3 to create a comparison cloud, and I would like to group certain words under one word. For example, "easier" and "ease" would be grouped under "easy". I am using the tm package. Here is my code, but it doesn't work: I still see the words I am trying to convert.

lines.corp <- Corpus(VectorSource(all)) #converting data frame to a Corpus#
########################################################################
lines.corp2 <- lines.corp
toString <- content_transformer(function(x, from, to) gsub(from, to, x))
lines.corp2 <- tm_map(lines.corp2, toString, "ease", "easy")
lines.corp2 <- tm_map(lines.corp2, toString, "easier", "easy")
lines.corp2 <- tm_map(lines.corp2, toString, "convenience", "convenient")
lines.corp2 <- tm_map(lines.corp, stripWhitespace)
lines.corp2 <- tm_map(lines.corp2, removeNumbers)
lines.corp2 <- tm_map(lines.corp2, removePunctuation)
lines.corp2 <- tm_map(lines.corp2, removeWords, stopwords('english'))
user3117087
  • Please create a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input so we can run and test the code ourselves. It seems like your problem is the line `lines.corp2 <- tm_map(lines.corp, stripWhitespace)` where you go back to use the `lines.corp` variable, ignoring all the changes you made to the `lines.corp2` variable. This is a silly typo. – MrFlick Jul 31 '15 at 18:54
  • Thanks for catching my typo, MrFlick. I reran my script and unfortunately that's not the issue. I am very new to R and Stack Overflow. The documentation on reproducible examples is confusing to me. Can't I just post my code and the .txt file I'm working with? – user3117087 Jul 31 '15 at 19:13
  • I think you're looking for stemming, which `tm` can do. Try adding `lines.corp2 <- tm_map(lines.corp2, stemDocument)` at the end of that series of lines and see if that does the trick. – ulfelder Jul 31 '15 at 19:19
  • If I set `all <- c("With great ease", "This test is easier")`, run this code (with the typo fixed), and then run `content(lines.corp2[[1]])`, I get `"With great easy"`, and `content(lines.corp2[[2]])` returns `"This test easy"`, so it appears to be working fine. – MrFlick Jul 31 '15 at 19:26
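
For anyone trying to reproduce this, here is a minimal base-R sketch of the substitution step on its own, without `tm`. It is only an illustration under stated assumptions, not the asker's actual data or a definitive fix: `docs`, `replace_word`, and `word_map` are hypothetical names, and the third sample sentence is made up to show one pitfall, namely that an unanchored `gsub("ease", ...)` also matches inside longer words such as "Please". Anchoring the pattern with `\\b` limits each replacement to whole words.

```r
# Sample input: the first two strings come from the comment above; the third
# is an invented example (assumption) that contains "ease" inside "Please".
docs <- c("With great ease", "This test is easier", "Please note the convenience")

# Hypothetical helper: replace `from` with `to`, matching whole words only.
replace_word <- function(x, from, to) {
  gsub(paste0("\\b", from, "\\b"), to, x, ignore.case = TRUE)
}

# Hypothetical lookup table: names are the variants, values are the target word.
word_map <- c(ease = "easy", easier = "easy", convenience = "convenient")

# Apply every mapping in turn to the whole character vector.
grouped <- docs
for (from in names(word_map)) {
  grouped <- replace_word(grouped, from, word_map[[from]])
}
print(grouped)

# For contrast: the unanchored pattern from the question also rewrites the
# inside of "Please", because "ease" occurs there as a substring.
unanchored <- gsub("ease", "easy", docs)
print(unanchored)
```

With this input, `grouped` leaves "Please" intact, while `unanchored` turns it into "Pleasy". The same helper can presumably be wrapped in `content_transformer()` and passed to `tm_map()` exactly as `toString` is in the question; that part is not run here because it depends on the `tm` package being installed.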
