Error while using stemCompletion in tm package

Question

I am learning text mining in R, and when I run the code reproduced below I persistently get the following error: inherits(doc, "TextDocument") is not TRUE

I recognize that the function DocumentTermMatrix requires text/character data and that somehow, after the step where stemCompletion is called, viz. the command

myCorpus <- tm_map(myCorpus, stemCompletion, dictionary = dictCorpus)

something is going wrong somewhere.

Any help would be appreciated. I have looked at similar questions and the answers posted for them, but I have not found the right solution.

> library(twitteR) 
> library(tm)

> rdmTweets <- userTimeline("rdatamining", n=100)
> 
> df <- do.call("rbind", lapply(rdmTweets, as.data.frame))

> myCorpus <- Corpus(VectorSource(df$text)) 
> myCorpus <- tm_map(myCorpus, content_transformer(tolower))
> myCorpus <- tm_map(myCorpus, removePunctuation)
> myCorpus <- tm_map(myCorpus, removeNumbers)
> 
> myStopwords <- c(stopwords("english"), "available", "via") 
> idx <- which(myStopwords == "r") 
> myStopwords <- myStopwords[-idx] 
> myCorpus <- tm_map(myCorpus, removeWords, myStopwords)
> 
> dictCorpus <- myCorpus 
> myCorpus <- tm_map(myCorpus, stemDocument)
> myCorpus <- tm_map(myCorpus, stemCompletion, dictionary = dictCorpus)

> DocumentTermMatrix(myCorpus)

Error: inherits(doc, "TextDocument") is not TRUE

I am using Windows 7 (64-bit) and R 3.1.0 with RStudio 0.98.507 — Stats_Lover, Jul 09 '14 at 17:44
See http://stackoverflow.com/questions/24191728/documenttermmatrix-error-on-corpus-argument — , Aug 05 '14 at 15:01

score 0 · Answer 1 · answered Feb 10 '16 at 06:24

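This should solve your problem. Wrapping stemCompletion in content_transformer makes tm_map return text documents instead of plain character vectors, which is what DocumentTermMatrix expects. Here myCorpusCopy is presumably the unstemmed copy of the corpus (dictCorpus in the question).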

myCorpus <- tm_map(myCorpus, content_transformer(stemCompletion), dictionary = myCorpusCopy, lazy=TRUE)
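One caveat with the line above: stemCompletion matches each element of its input against the dictionary, so a whole document passed as a single string may not be completed word by word. A minimal sketch of a per-word variant follows, assuming a tm version that provides content_transformer and that SnowballC is installed for stemDocument. The toy documents and the helper name completeStems are hypothetical and not part of the original code; type = "first" is chosen here only because unmatched stems then come back as NA, which makes the fallback easy to write.

```r
library(tm)  # stemDocument() additionally needs the SnowballC package installed

# Toy documents standing in for df$text from the question (hypothetical data).
docs <- c("text mining examples", "mining tweets", "examples of text analysis")

myCorpus   <- VCorpus(VectorSource(docs))
dictCorpus <- myCorpus                      # unstemmed copy used as the dictionary
myCorpus   <- tm_map(myCorpus, stemDocument)

# Hypothetical helper: complete each stem separately, then rebuild one string.
completeStems <- function(text, dictionary) {
  stems <- unlist(strsplit(text, "\\s+"))
  stems <- stems[nzchar(stems)]
  if (length(stems) == 0) return("")
  completed <- unname(stemCompletion(stems, dictionary = dictionary,
                                     type = "first"))
  # Fall back to the stem itself when the dictionary offers no completion.
  missing <- is.na(completed)
  completed[missing] <- stems[missing]
  paste(completed, collapse = " ")
}

# content_transformer keeps every element a TextDocument, so the corpus
# remains valid input for DocumentTermMatrix.
myCorpus <- tm_map(myCorpus, content_transformer(completeStems),
                   dictionary = dictCorpus)

dtm <- DocumentTermMatrix(myCorpus)
inspect(dtm)
```

Stems with no match in the dictionary are kept as they are rather than dropped, so the matrix may still contain a few uncompleted stems.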

Partha Roy
