I am trying to use inspect(TermDocumentMatrix()) in R to get a list of word/term frequencies across text documents.

Using the example code from ?TermDocumentMatrix:

library(tm)
data("crude")
tdm <- TermDocumentMatrix(crude, control = list(removePunctuation = TRUE, 
    stopwords = TRUE))
dtm <- DocumentTermMatrix(crude, control = list(weighting = function(x) 
    weightTfIdf(x, normalize = FALSE), stopwords = TRUE))

Now, I can inspect these:

inspect(tdm[1:1000, 1:5])

Results in:

<<TermDocumentMatrix (terms: 1000, documents: 5)>>
Non-/sparse entries: 322/4678
Sparsity           : 94%
Maximal term length: 16
Weighting          : term frequency (tf)
Sample             :
            Docs
Terms        127 144 191 194 211
  crude        2   0   2   3   0
  demand       0   5   0   0   0
  dlrs         2   0   1   2   2
  mln          0   4   0   0   2
  oil          5  12   2   1   1
  opec         0  13   0   0   0
  price        2   1   2   2   0
  prices       3   5   0   0   0
  production   0   6   0   0   0
  said         3  11   1   1   3

However, I want a longer list of terms... How can I get this?

I've tried myinspection = inspect(tdm[1:1000, 1:5]), but it doesn't get me anywhere.

b_g
  • `inspect` gives you a sample as indicated. One way to get what you want: convert the subject to a matrix: `View(as.matrix(tdm[1:1000, 1:5]))`. – lukeA May 03 '17 at 00:03
  • I mean the subset* – lukeA May 03 '17 at 00:10
  • I just wanted to point out that the examples for LDAvis use inspect(dtm) in the JSON creation function. Using @lukeA's answer corrects that function. – ChristinaP Sep 02 '17 at 19:15
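
For reference, the matrix conversion suggested in the comments can be sketched as below. This is only a minimal sketch, assuming the tm package and its bundled crude corpus; the names n_terms, m and totals are made up for illustration, and the cut-off of 25 terms is arbitrary. The row range is clamped with nTerms() so the subset stays valid even if the vocabulary turns out to be smaller than 1000 terms.

```r
library(tm)

data("crude")
tdm <- TermDocumentMatrix(crude, control = list(removePunctuation = TRUE,
                                                stopwords = TRUE))

# Clamp the row range so the subset is valid whatever the vocabulary size.
n_terms <- min(1000, nTerms(tdm))

# Dense terms x documents matrix for the first five documents.
m <- as.matrix(tdm[seq_len(n_terms), 1:5])

# Total count of each term across the five documents, most frequent first.
totals <- sort(rowSums(m), decreasing = TRUE)

head(totals, 25)            # the 25 most frequent terms with their totals
m[names(totals)[1:25], ]    # per-document counts for those same terms
```

Because m is an ordinary matrix, it can be printed in full, passed to View(), or written out with write.csv(m, "term_counts.csv") (the file name is just an example). Note that as.matrix() creates a dense copy, so this may use a lot of memory on a large corpus.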

0 Answers