I am learning text mining in R. When I run the code reproduced below, I consistently get the following error: inherits(doc, "TextDocument") is not TRUE
I understand that the function DocumentTermMatrix requires text documents as input, and it seems that something goes wrong after the step where stemCompletion is called, namely the command
myCorpus <- tm_map(myCorpus, stemCompletion, dictionary = dictCorpus)
but I cannot tell what.
Any help would be appreciated. I have looked at similar questions and their answers, but none of them has led me to a solution.
> library(twitteR)
> library(tm)
> rdmTweets <- userTimeline("rdatamining", n=100)
>
> df <- do.call("rbind", lapply(rdmTweets, as.data.frame))
> myCorpus <- Corpus(VectorSource(df$text))
> myCorpus <- tm_map(myCorpus, content_transformer(tolower))
> myCorpus <- tm_map(myCorpus, removePunctuation)
> myCorpus <- tm_map(myCorpus, removeNumbers)
>
> myStopwords <- c(stopwords("english"), "available", "via")
> idx <- which(myStopwords == "r")
> myStopwords <- myStopwords[-idx]
> myCorpus <- tm_map(myCorpus, removeWords, myStopwords)
>
> dictCorpus <- myCorpus
> myCorpus <- tm_map(myCorpus, stemDocument)
> myCorpus <- tm_map(myCorpus, stemCompletion, dictionary = dictCorpus)
> DocumentTermMatrix(myCorpus)
> Error: inherits(doc, "TextDocument") is not TRUE
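One way to narrow down which step breaks the corpus is to test the same condition the error message names, inherits(doc, "TextDocument"), on every document. The snippet below is only a minimal sketch: non_text_docs is a hypothetical helper written for illustration (it is not part of tm), and the two-element list is a toy stand-in for a corpus's content, not the Twitter data above.

```r
library(tm)

# Hypothetical helper (not part of tm): returns the positions of the
# elements that do not inherit from "TextDocument".
non_text_docs <- function(docs) {
  which(!vapply(docs, inherits, logical(1), what = "TextDocument"))
}

# Toy stand-in for a corpus's content: one proper text document and one
# bare character string, as an illustration of a "broken" element.
docs <- list(PlainTextDocument("text mining in r"), "text mine r")

# Reports the position of the element that is not a TextDocument.
non_text_docs(docs)
```

On the real corpus, this could be called after each tm_map step, for example as non_text_docs(content(myCorpus)), assuming content() returns the list of documents for the corpus class in use. The first step after which the result is non-empty would be the one that replaces the documents with something else.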