I have the following sample data frame:
df <- data.frame("Alpha" = c(NA, NA, 6, 5, 4, 6, 5, 3), "Beta" = c(3, 3, 4, 2, 6, NA, NA, NA), "Gamma" =c(6, 2, 3, 1, NA, NA, 5, 4))
From this data, I would like to get a count of all values between 0 and 6 for each column. The data frame does not contain all values between 0 and 6, so the final output would look something like this:
result <- data.frame("value"=c(0, 1, 2, 3, 4, 5, 6),
"Alpha"=c(0, 0, 0, 1, 1, 2, 2),
"Beta"=c(0, 0, 1, 2, 1, 0, 1),
"Gamma"=c(0, 1, 1, 1, 1, 1, 1))
value Alpha Beta Gamma
0 0 0 0
1 0 0 1
2 0 1 1
3 1 2 1
4 1 1 1
5 2 0 1
6 2 1 1
My first inclination was to reiterate the distinct() function in dplyr. I was thinking of using something like this:
df.alpha <- df %>% distinct(Alpha)
df.beta <- df %>% distinct(Beta)
df.gamma <- df %>% distinct(Gamma)
Afterward, I would bind them together. However, I encounter three issues:
- There's a lot of copy and pasting here (there are more columns in my real data frame)
- The results do not have the same length, which makes binding difficult; and
- "0" is not a value in the original table, so it does not get counted in the results.
I found a similar question in this stackoverflow post on counting a specific value in multiple columns at once. However, unlike that post, the issue I have here is that there is no variable to "group by".
Do folks have any suggestions on how I can produce a count of values between a range of integers for all columns? Thanks so much!