I have been trying to run the TermDocumentMatrix function on my corpus of texts, but R and RStudio gave me the following error:
Error in tdm(txt, isTRUE(control$removePunctuation), isTRUE(control$removeNumbers), : function 'Rcpp_precious_remove' not provided by package 'Rcpp'.
After getting rid of this error with update.packages('Rcpp') and library(Rcpp), both R and RStudio stop responding at the same step and I have to end the R session, so I cannot go on to build the wordcloud that should follow the TermDocumentMatrix call.
I also googled and searched for this problem here and read several debugging threads, including "TermDocumentMatrix errors in R" and "R-Project no applicable method for 'meta' applied to an object of class "character"", but I am still stuck at this point in the code and cannot produce a visual representation, with wordcloud and hist, of the most frequent words in my corpus.
I would really appreciate any help with this.
Here is my entire code:
#load the tm package (this also loads the required package NLP)
library(tm)
#Create Corpus
docs <- Corpus( DirSource('C:/Users/x/Desktop/TextMiningR/Mix22'))
inspect(docs)
#start pre-processing
toSpace <- content_transformer(function(x, pattern) { return (gsub(pattern, " ",x))})
docs <- tm_map(docs, toSpace, "-")
docs <- tm_map(docs, toSpace, ":")
docs <- tm_map(docs, toSpace, "'")
docs <- tm_map(docs, toSpace, " -")
docs <- tm_map(docs, toSpace, "'")
#remove punctuation
docs <- tm_map(docs, removePunctuation)
#convert to lowercase
docs <- tm_map(docs, content_transformer(tolower))
#strip digits
docs <- tm_map(docs, removeNumbers)
#remove stopwords from standard stopword list
docs <- tm_map(docs, removeWords, stopwords("english"))
#strip whitespace
docs <- tm_map(docs, stripWhitespace)
#inspect output
inspect(docs)
library(Rcpp)
#create the document-term matrix; R stops responding on the following line:
dtm <- DocumentTermMatrix(docs)
#or
dtm <- as.matrix(TermDocumentMatrix(docs))
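
To help narrow this down, here is a minimal sketch of the same pipeline on a tiny in-memory corpus, so that it can be run without my files. The two sample sentences and the names sample_texts, docs_small, dtm_small and freq are made up for illustration, and it assumes the tm package is installed. If this small example also hangs, the problem is presumably in the installation rather than in the documents themselves.

```r
library(tm)

# Hypothetical sample texts standing in for the files in my folder
sample_texts <- c("Text mining is fun - really: it's fun",
                  "R makes text mining easy 123")

# Build an in-memory corpus instead of reading from DirSource()
docs_small <- VCorpus(VectorSource(sample_texts))

# Same pre-processing steps as in the code above
toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
docs_small <- tm_map(docs_small, toSpace, "-")
docs_small <- tm_map(docs_small, toSpace, ":")
docs_small <- tm_map(docs_small, removePunctuation)
docs_small <- tm_map(docs_small, content_transformer(tolower))
docs_small <- tm_map(docs_small, removeNumbers)
docs_small <- tm_map(docs_small, removeWords, stopwords("english"))
docs_small <- tm_map(docs_small, stripWhitespace)

# The step that stops responding on my real corpus
dtm_small <- DocumentTermMatrix(docs_small)

# Term frequencies, most frequent first (the input for wordcloud and hist)
freq <- sort(colSums(as.matrix(dtm_small)), decreasing = TRUE)
print(head(freq))
```

The sorted freq vector is the kind of input I would then pass on to wordcloud and hist.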