This can be done with dplyr. The latest version of dplyr (0.8.0.1 as of this writing) has a .drop=FALSE argument to group_by that preserves empty groups. For the preservation of empty groups to work, the grouping columns must all be factor class:
library(dplyr)
library(tidyr)

df %>%
  # Convert grouping columns to factor if they aren't already
  mutate_if(is.character, factor) %>%
  group_by(sample, drug, .drop=FALSE) %>%
  tally() %>%
  spread(drug, n)
  sample drug1 drug2
1 s1         0     1
2 s2         1     1
Or, to keep the output in "long" format for further processing, stop before the spread step:
df %>%
  mutate_if(is.character, factor) %>%
  group_by(sample, drug, .drop=FALSE) %>%
  tally()
  sample drug      n
1 s1     drug1     0
2 s1     drug2     1
3 s2     drug1     1
4 s2     drug2     1
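To see why the factor conversion matters, here is a minimal self-contained sketch. The df below is an assumption, reconstructed from the printed output (three rows, with no s1/drug1 observation); the asker's actual data may differ.

```r
library(dplyr)

# Hypothetical input, reconstructed from the output shown above
df <- data.frame(
  sample = c("s1", "s2", "s2"),
  drug   = c("drug2", "drug1", "drug2"),
  stringsAsFactors = FALSE
)

# Character columns carry no levels, so .drop=FALSE has nothing to expand
# and the empty s1/drug1 group is lost
chr_counts <- df %>%
  group_by(sample, drug, .drop = FALSE) %>%
  tally()

# Factor columns: every combination of levels is kept, including s1/drug1
fct_counts <- df %>%
  mutate_if(is.character, factor) %>%
  group_by(sample, drug, .drop = FALSE) %>%
  tally()

nrow(chr_counts)  # 3
nrow(fct_counts)  # 4
```

The only difference between the two pipelines is the mutate_if step, which is what gives group_by a full set of levels to cross.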
The code above ensures that all empty group combinations are preserved. However, if you're going to spread the data to a "wide" format table, you can take care of the missing groups in the spread step without worrying about whether group_by preserves empty groups:
df %>%
  group_by(sample, drug) %>%
  tally() %>%
  spread(drug, n, fill=0)
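In tidyr 1.0.0 and later, spread is superseded by pivot_wider. A rough sketch of the same fill-on-reshape approach with the newer function follows. The df here is hypothetical sample data, not the asker's, and count is used as a shorthand for group_by plus tally.

```r
library(dplyr)
library(tidyr)

# Hypothetical input; not the asker's actual data
df <- data.frame(
  sample = c("s1", "s2", "s2"),
  drug   = c("drug2", "drug1", "drug2"),
  stringsAsFactors = FALSE
)

wide <- df %>%
  count(sample, drug) %>%          # one row per observed sample/drug pair
  pivot_wider(
    names_from  = drug,
    values_from = n,
    values_fill = list(n = 0)      # fill combinations that never occurred
  )
```

As with spread(fill=0), this only fills combinations of values that each appear somewhere in the data; a drug that never occurs in any sample still needs the factor-based approach to show up as a column.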