
I'm having issues with my LDA model in R. Every time I try to run the tidy() function on my LDA_VEM object, I get the error "Error: binding not found: 'Var1'". Could you please explain how to remedy this? My code is below:

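# Packages the code below relies on (the original post did not show its library() calls)
library(dplyr)
library(tidyr)
library(stringr)
library(tidytext)
library(topicmodels)

why <- read.csv("FakeDoc.csv", header = FALSE, na.strings = "")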
why.char <- data_frame(text=as.character(why$V1))
why.char <- why.char %>%
  mutate(document = row_number())
why.tidy <- why.char %>%
  unnest_tokens(word, text)
why.tidy <- why.tidy %>%
  anti_join(stop_words)
why.tidy <- why.tidy %>%
  filter(!str_detect(word,"[0-9]"))

  #Frequency Table
why.doc <- why.tidy %>%
  count(document, word, sort = TRUE) %>%
  ungroup()
why.words <- why.doc %>%
  group_by(document) %>%
  summarize(total = sum(n))
why.ft <- left_join(why.doc, why.words)
grams1_united <- why.ft[c("document", "word", "total")] 

  #N-grams
tidy.n2 <- why.char %>%
  unnest_tokens(ngram, text, token = "ngrams", n=2)
tidy.n3 <- why.char %>%
  unnest_tokens(ngram, text, token = "ngrams", n=3)

tidy.n2 <- tidy.n2 %>%
  filter(!str_detect(ngram, "[0-9]"))
tidy.n3 <- tidy.n3 %>%
  filter(!str_detect(ngram, "[0-9]"))

tidy.n2 %>%
  count(ngram, sort = TRUE)
tidy.n3 %>%
  count(ngram, sort = TRUE)

grams2_seperated <- tidy.n2 %>%
  separate(ngram, c("word1", "word2"), sep = " ")
grams2_filtered <- grams2_seperated %>%
  filter(!word1 %in% stop_words$word) %>%
  filter(!word2 %in% stop_words$word)
gram2_counts <- grams2_filtered %>%
  count(word1, word2, sort = TRUE)
grams2_united <- grams2_filtered %>%
  unite(ngram, word1, word2, sep = " ")
grams2_united <- grams2_united %>%
  group_by(document) %>%
  count(ngram, sort = TRUE)
grams2_united

grams3_seperated <- tidy.n3 %>%
  separate(ngram, c("word1", "word2", "word3"), sep = " ")
grams3_filtered <- grams3_seperated %>%
  filter(!word1 %in% stop_words$word) %>%
  filter(!word2 %in% stop_words$word) %>%
  filter(!word3 %in% stop_words$word)
gram3_counts <- grams3_filtered %>%
  count(word1, word2, word3, sort = TRUE)
grams3_united <- grams3_filtered %>%
  unite(ngram, word1, word2, word3, sep = " ")
grams3_united <- grams3_united %>%
  group_by(document) %>%
  count(ngram, sort = TRUE)

colnames(grams2_united) <- c("document", "word", "total")
colnames(grams3_united) <- c("document", "word", "total")

  #DTM
grams1_united
grams2_united
grams3_united
detractorwhy.tots <- rbind.data.frame(grams1_united, grams2_united, grams3_united)
dtwtots <- as.data.frame(detractorwhy.tots)
dtw.dtm <- dtwtots %>%
  cast_dtm(document, word, total)
dtw_5lda <- LDA(dtw.dtm,control = list(alpha = 0.05), k = 5)
topics <- tidy(dtw_5lda)
  • That particular error ("Error: binding not found...") means that a column the `tidy()` function is expecting is not there. I cannot say I've seen that before and can't tell exactly what is going on from this much information. – Julia Silge Apr 23 '17 at 17:16
  • Can you make a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? I know this is tough in this situation because the data for topic modeling tends to be somewhat complicated, but it would help us get to the bottom of why you are seeing this. If you want, you could use the [data we have for the topic modeling vignette](https://github.com/juliasilge/tidytext/blob/master/vignettes/topic_modeling.Rmd) and see if you see the error using that data as well. Or compare the two document-term matrices, or the LDA output, etc etc etc. – Julia Silge Apr 23 '17 at 17:17
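One way to follow the suggestion above is to fit a small model on a public dataset and see whether `tidy()` fails there too. The sketch below is only an illustration: it assumes the tidytext and topicmodels packages are installed, and it uses the `AssociatedPress` document-term matrix that ships with topicmodels. The names `ap_lda` and `ap_topics` are made up for this example.

```r
# Minimal reproducibility check on data that ships with topicmodels.
library(topicmodels)
library(tidytext)

# AssociatedPress is a DocumentTermMatrix bundled with topicmodels.
data("AssociatedPress", package = "topicmodels")

# Fit a small two-topic model; the seed only makes the run repeatable.
ap_lda <- LDA(AssociatedPress, k = 2, control = list(seed = 1234))
print(class(ap_lda))

# If this call succeeds here but fails on your own model, the problem is
# more likely in your data or in your session than in tidy() itself.
ap_topics <- tidy(ap_lda, matrix = "beta")
print(head(ap_topics))
```

If the call works in a fresh R session but not in the session where the error appeared, comparing the output of `search()` in both sessions may show which attached package differs.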

1 Answer


I had the same error when running tidy() on an LDA object. I eventually figured out that there was some kind of conflict between the libraries I was loading: topicmodels and reshape2. The error disappeared after I stopped loading the reshape2 library.
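For anyone who wants to try the same thing without restarting R, here is a rough sketch that uses base R only. It checks whether reshape2 is attached, detaches it if so, and then calls `tidy()` with an explicit `tidytext::` prefix. It reuses the object name `dtw_5lda` from the question, and the guard around that last call is only there so the snippet runs even when that object does not exist. Whether detaching actually removes the error in your session is an assumption based on the experience described above, not something guaranteed.

```r
# Check whether reshape2 is currently attached to the search path.
reshape2_attached <- "package:reshape2" %in% search()
print(reshape2_attached)

# Detach it for this session if it is attached.
if (reshape2_attached) {
  detach("package:reshape2", character.only = TRUE)
}

# Show which object names are defined in more than one attached package.
print(conflicts(detail = TRUE))

# Re-run the failing call, naming the package explicitly.
# dtw_5lda is the fitted model from the question; skipped if it is absent.
if (exists("dtw_5lda") && requireNamespace("tidytext", quietly = TRUE)) {
  topics <- tidytext::tidy(dtw_5lda, matrix = "beta")
  print(head(topics))
}
```

Starting a fresh session and loading only the packages the script needs is the simpler route if detaching does not help.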

Mike