I have a question about comparing columns in a data frame. Say I have some data that looks like this:
Unique <- c("apple", "orange", "melon", "car", "mouse", "headphones", "light")
a1 <- c("apple", "tomato", "banana", "dog", "cat", "headphones", "future")
a2 <- c("apple", "orange", "pear", "monkey", "dog", "cat", "river")
a3 <- c("tomato", "pineapple", "cherry", "car", "space", "mars", "rocket")
df <- data.frame(Unique, a1, a2, a3)
df
##       Unique         a1     a2        a3
## 1      apple      apple  apple    tomato
## 2     orange     tomato orange pineapple
## 3      melon     banana   pear    cherry
## 4        car        dog monkey       car
## 5      mouse        cat    dog     space
## 6 headphones headphones    cat      mars
## 7      light     future  river    rocket
The question I am trying to answer is: how many times does each value in the "Unique" column appear in the rest of the data frame, that is, in every column except "Unique" itself?
I would like an output that looks something like this:
apple 2
orange 1
melon 0
car 1
mouse 0
headphones 1
light 0
because in the entire data frame except the "Unique" column, apple appears 2 times, orange appears 1 time, melon appears 0 times, and so on.
How would you go about getting this?
Also, how would I sort the result by frequency, say from highest to lowest?
I have been trying to figure this out for a couple of days now, and I just can't crack it. Any help would be greatly appreciated!
P.S. In R, it seems that a "cell" of a data frame is not actually called a cell. Is that correct? What is it called instead, an element?
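For the counting and sorting described above, here is a minimal base R sketch of one possible approach, not the only way to do it. It assumes the columns are plain character vectors (hence stringsAsFactors = FALSE) and that a match means exact string equality. The names others, freq, result and result_sorted are made up for illustration.

```r
# Rebuild the example data from the question.
Unique <- c("apple", "orange", "melon", "car", "mouse", "headphones", "light")
a1 <- c("apple", "tomato", "banana", "dog", "cat", "headphones", "future")
a2 <- c("apple", "orange", "pear", "monkey", "dog", "cat", "river")
a3 <- c("tomato", "pineapple", "cherry", "car", "space", "mars", "rocket")
df <- data.frame(Unique, a1, a2, a3, stringsAsFactors = FALSE)

# Flatten every column except "Unique" into one character vector.
others <- unlist(df[setdiff(names(df), "Unique")], use.names = FALSE)

# For each value in "Unique", count its exact matches in that vector.
# Note: "headphones" occurs once in a1, so it is counted once here.
freq <- vapply(df$Unique, function(x) sum(others == x), integer(1))

# Collect the counts next to the values they belong to.
result <- data.frame(Unique = df$Unique, freq = unname(freq))

# Sort from highest to lowest count; ties keep their original order.
result_sorted <- result[order(-result$freq), ]
print(result_sorted)
```

Flattening the other columns first keeps the count independent of how many columns there are, so the same sketch should carry over if more columns like a4 or a5 are added.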