I am using the pairwise_cor function to generate word correlations in text. The problem is, the output contains duplicate rows when taking into account the combination of two columns. I want to filter out the duplicated rows.
Here is an extremely simplified example:
item1 <- c("dog", "cat", "horse", "cow", "chicken", "sheep")
item2 <- c("cat", "dog", "cow", "horse", "sheep", "chicken")
correlation <- c(2,2,3,3,4,4)
df <- cbind.data.frame(item1, item2, correlation)
df
item1 item2 correlation
1 dog cat 2
2 cat dog 2
3 horse cow 3
4 cow horse 3
5 chicken sheep 4
6 sheep chicken 4
So essentially, rows 1 and 2 are the same, just with item1 and item2 flip-flopped. And the same for rows 3 and 4, and rows 5 and 6.
I want the output data frame, after filtering to be:
item1 item2 correlation
1 dog cat 2
2 horse cow 3
3 chicken sheep 4
It is not just a simple matter of using unique(), at least that I can figure.