I have an R data frame with two columns, id and text, and I want to turn it into a co-occurrence matrix counting word pairs that appear together under the same id. So, this data frame:

df <- data.frame(id = c(1, 1, 1, 2, 2, 2), text = c("but", "the", "and", "but", "a", "the"))

should be turned into something like this:

    but the and a
but   2   2   1 1
the   2   2   1 1
and   1   1   1 0
a     1   1   0 1

The real data is much larger, but a solution for this toy example should carry over. I'm not sure where to even start here; tidyverse solutions are preferred.

nlplearner

1 Answer


Following this answer:

dat <- crossprod(table(df))
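
To see why that one-liner works, here is a minimal sketch on the toy data from the question. The names counts, cooc and words are only illustrative, and the sketch assumes the text column holds quoted strings. table(df) builds an id-by-word count table, and crossprod(X) computes t(X) %*% X, which yields a word-by-word matrix.

```r
# Toy data from the question, with the words quoted as strings
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  text = c("but", "the", "and", "but", "a", "the")
)

# Rows are ids, columns are words, cells are occurrence counts
counts <- table(df)

# t(counts) %*% counts: entry [i, j] sums, over ids, the product of
# the counts of word i and word j
cooc <- crossprod(counts)

# table() sorts the words alphabetically; reorder to match the question
words <- c("but", "the", "and", "a")
cooc <- cooc[words, words]
print(cooc)
```

The diagonal holds each word's co-occurrence with itself. If a word can appear more than once under the same id, the entries become products of counts rather than the number of shared ids, so the counts may need to be reduced to 0/1 first, depending on what is wanted.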