I already have a term-document matrix and want to add a new document to it.
In other words, I want to join two term-document matrices.
My current term-document matrix is:
        Docs
Term    1
eat     7
food    2
run     2
sick    3
The new document is: "watch football match and eat food".
After adding it, I want my term-document matrix to be:
          Docs
Term      1  2
eat       7  1
food      2  1
run       2  0
sick      3  0
watch     0  1
football  0  1
match     0  1
and       0  1
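To make the target concrete, here is a minimal base R sketch that builds exactly that matrix by hand. It does not use tm at all. It assumes plain whitespace tokenization with no stemming or stop word removal, and the names doc1, doc2, tokens and tdm are only illustrative.

```r
# Counts for the existing document, copied from the matrix above
doc1 <- c(eat = 7, food = 2, run = 2, sick = 3)

# Tokenize the new document on whitespace (assumption: no stemming, no stop words)
text2 <- "watch football match and eat food"
tokens <- strsplit(tolower(text2), "\\s+")[[1]]

# Count terms, keeping them in order of first appearance
doc2 <- table(factor(tokens, levels = unique(tokens)))

# Union of both vocabularies: existing terms first, then the new ones
terms <- union(names(doc1), names(doc2))

# Start from an all-zero matrix and fill in each document's counts by term name
tdm <- matrix(0, nrow = length(terms), ncol = 2,
              dimnames = list(Term = terms, Docs = c("1", "2")))
tdm[names(doc1), "1"] <- doc1
tdm[names(doc2), "2"] <- as.vector(doc2)

print(tdm)
```

The point of the sketch is that the two documents need distinct column names ("1" and "2") and that terms missing from one document get a count of 0.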
With tm, I've tried this:
library("SnowballC")
library("NLP")
library("tm")
library("lsa")
# mytermdm is the term-document matrix I already have
text2 <- "watch football match and eat food"
myCorpus <- Corpus(VectorSource(text2))
tdm2 <- TermDocumentMatrix(myCorpus, control = list(
  removeNumbers = TRUE,
  removePunctuation = TRUE,
  stopwords = stopwords_en,
  stemming = TRUE
))
mytdm3 <- c(mytermdm,tdm2)
inspect(mytdm3)
I get this:
TermDocumentMatrix (terms: 7, document:2)
Error in `[.simple_triplet_matrix`(x,terms,doc) :
  Repeated indices currently not allowed.