I have a vector of sentences, say:
x = c("I like donut", "I like pizza", "I like donut and pizza")
I want to count how often each pair of words occurs together in the same sentence. The ideal output is a data frame with 3 columns (word1, word2 and frequency), and would look something like this:
I like 3
I donut 2
I pizza 2
like donut 2
like pizza 2
donut pizza 1
donut and 1
pizza and 1
In the first row of the output, frequency = 3 because "I" and "like" occur together 3 times: in x[1], x[2] and x[3].
Any advice is appreciated :)
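To make the counting rule concrete, here is a minimal base R sketch of one possible reading of it, not a definitive solution. It assumes that words are separated by whitespace, that matching is case-sensitive, that a pair is counted at most once per sentence, and that word order within a pair does not matter. The function name `count_pairs` is hypothetical and used only for illustration.

```r
# Hypothetical helper: counts, for every unordered word pair,
# the number of sentences in which both words appear.
count_pairs <- function(sentences) {
  # Split on whitespace and keep each word once per sentence.
  words <- lapply(strsplit(sentences, "\\s+"), unique)

  # Build all two-word combinations per sentence. Sorting the words
  # first makes ("donut", "pizza") and ("pizza", "donut") the same pair.
  pairs <- do.call(rbind, lapply(words, function(w) {
    if (length(w) < 2) return(NULL)  # no pair possible
    t(combn(sort(w), 2))
  }))
  if (is.null(pairs)) {
    return(data.frame(word1 = character(0), word2 = character(0),
                      frequency = integer(0)))
  }

  # Cross-tabulate the pairs and drop combinations that never occur.
  res <- as.data.frame(table(word1 = pairs[, 1], word2 = pairs[, 2]),
                       responseName = "frequency",
                       stringsAsFactors = FALSE)
  res <- res[res$frequency > 0, ]
  res <- res[order(-res$frequency), ]
  rownames(res) <- NULL
  res
}

x <- c("I like donut", "I like pizza", "I like donut and pizza")
res <- count_pairs(x)
print(res)
```

Under these assumptions the result also contains the pairs "I"/"and" and "like"/"and" from x[3], which the sample output above does not list. The order of the two words within a row comes from `sort()`, so a row may read "donut", "I" rather than "I", "donut".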