I am trying to use inspect(TermDocumentMatrix()) to get a list of word/term frequencies across text documents (in R), using the example code from ?TermDocumentMatrix:
library(tm)
data("crude")
tdm <- TermDocumentMatrix(crude, control = list(removePunctuation = TRUE,
                                                stopwords = TRUE))
dtm <- DocumentTermMatrix(crude, control = list(weighting = function(x)
                                                weightTfIdf(x, normalize = FALSE),
                                                stopwords = TRUE))
Now, I can inspect these:
inspect(tdm[1:1000, 1:5])
Results in:
<<TermDocumentMatrix (terms: 1000, documents: 5)>>
Non-/sparse entries: 322/4678
Sparsity           : 94%
Maximal term length: 16
Weighting          : term frequency (tf)
Sample             :
            Docs
Terms        127 144 191 194 211
  crude        2   0   2   3   0
  demand       0   5   0   0   0
  dlrs         2   0   1   2   2
  mln          0   4   0   0   2
  oil          5  12   2   1   1
  opec         0  13   0   0   0
  price        2   1   2   2   0
  prices       3   5   0   0   0
  production   0   6   0   0   0
  said         3  11   1   1   3
However, inspect() only prints a sample of 10 terms, and I want a longer list. How can I get this?
I've tried myinspection = inspect(tdm[1:1000, 1:5]), but it doesn't get me anywhere.
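For reference, here is a minimal sketch of the kind of result I am after. It assumes the slice is small enough to hold as a dense matrix (true for crude, probably not for a large corpus) and uses as.matrix() instead of inspect(), since the output above is only a sample. The names m, term_freq and freq_df are just placeholders I made up.

```r
library(tm)

data("crude")
tdm <- TermDocumentMatrix(crude,
                          control = list(removePunctuation = TRUE,
                                         stopwords = TRUE))

# Convert the first five documents to an ordinary dense matrix
# (terms in rows, documents in columns). Assumption: the corpus is small
# enough for this to fit in memory.
m <- as.matrix(tdm[, 1:5])

# Total frequency of each term across those documents, most frequent first.
term_freq <- sort(rowSums(m), decreasing = TRUE)

# Print as many terms as wanted, here the top 30.
print(head(term_freq, 30))

# The same information as a data frame, easier to filter or save.
freq_df <- data.frame(term = names(term_freq),
                      freq = unname(term_freq),
                      stringsAsFactors = FALSE)
```

If only the terms above some count are needed, findFreqTerms(tdm, lowfreq = 5) from tm might be enough on its own, though it returns the term names rather than the counts.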