
I'm using the R wordcloud and tm packages for the first time, following this:

rwordcloud

As you can see below, I'm getting two strange errors in my output: it's giving partial words sometimes (busi, peopl, everi), and it's counting contractions as their own words ('ll, 're).

Any suggestions on how I can resolve this?

[image: word cloud output]

bhantol
David Gerrard
    You should share the code you used to generate this plot. Did you stem the words? Unless you create a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) we can't really help you. – MrFlick Jun 30 '15 at 14:08

2 Answers


A reproducible example would really help. Nevertheless, here is a hint that is hopefully useful. If your corpus is stored in my_words, it could help to add

my_words <- tm_map(my_words,content_transformer(removePunctuation))

to the code. This may at least remove the occurrences of "'ll" and "'re", since stripping the apostrophe turns "we'll" into "well" before the words are counted.
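To see what that step does, here is a minimal sketch on a made-up sentence (the sample text is hypothetical, not the asker's data). It assumes the tm package is installed and calls removePunctuation directly, which tm_map also accepts:

```r
library(tm)

# Hypothetical sample text standing in for the real input
docs <- c("We'll see what they're doing.")
my_words <- VCorpus(VectorSource(docs))

# Lower-case first, then strip punctuation (apostrophes included)
my_words <- tm_map(my_words, content_transformer(tolower))
my_words <- tm_map(my_words, removePunctuation)

# Inspect the cleaned text of the first document
cleaned <- content(my_words[[1]])
print(cleaned)
# "well see what theyre doing"
```

Note that the contractions are not dropped, only merged into a single token ("well", "theyre"). If those should not appear in the cloud either, they would still need to be filtered out, for example with a custom stop word list.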

RHertel

Thank you for your assistance.

It looks like this was caused by stemming: once I removed the stemming steps from the code, everything worked fine.

This got me to where I want to be. Now I just need to look through the results and decide which words I do in fact want to stem.

Text-mining with the tm-package - word stemming
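For anyone else hitting this, a small sketch of where the partial words come from. It assumes the SnowballC package is installed (tm's stemDocument relies on it) and uses the three words from my plot as sample input:

```r
library(SnowballC)

# The full words behind the truncated forms seen in the word cloud
full_words <- c("business", "people", "every")

# Stemming reduces each word to its stem, which is often not a real word
stems <- wordStem(full_words, language = "english")
print(stems)
# "busi" "peopl" "everi"
```

So the truncated words are the stemmer working as designed, not a bug in wordcloud. Leaving out the stemDocument step keeps the full words, at the cost of counting variants such as "business" and "businesses" separately.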

Community
David Gerrard