Include 0 counts as frequency and create new columns

Question

Suppose I have a data.frame:

df <- data.frame(
    sample = c("s1", "s2", "s2"),
    drug = c("drug2", "drug1", "drug2")
)

  sample  drug
1     s1 drug2
2     s2 drug1
3     s2 drug2

Is there any easy way to create a table counting all instances of drugs including zero hits?

Ideally, something like this:

  sample drug1 drug2
1     s1     0     1
2     s2     1     1

Maurits Evers · Answer 1 · 2019-03-20T23:29:49.910

What about base R's good old table?

table(df)
#      drug
#sample drug1 drug2
#s1     0     1
#s2     1     1

Or, to get a data.frame with one column per drug:

as.data.frame.matrix(table(df))
#   drug1 drug2
#s1     0     1
#s2     1     1
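One caveat worth sketching: table only creates columns for values it knows about. If a drug never appears in any sample, it will not show up unless the column is a factor that lists it as a level. The example below assumes a hypothetical extra drug, "drug3", which is not part of the original question and is used only for illustration.

```r
# Same data as in the question
df <- data.frame(
  sample = c("s1", "s2", "s2"),
  drug   = c("drug2", "drug1", "drug2")
)

# Hypothetical: "drug3" is a drug of interest that no sample received
all_drugs <- c("drug1", "drug2", "drug3")
df$drug <- factor(df$drug, levels = all_drugs)

# table() counts every factor level, so drug3 becomes a column of zeros
counts <- as.data.frame.matrix(table(df))
counts
```

The same idea applies to samples: declaring the full set of expected samples as factor levels should add all-zero rows for samples with no recorded drugs.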

eipi10 · Answer 2 · 2019-03-20T23:24:23.077

This can be done with dplyr. The latest version of dplyr (0.8.0.1 as of this writing) has a .drop=FALSE option in group_by that preserves empty groups. For empty groups to be preserved, all grouping columns must be factors:

library(dplyr)
library(tidyr)

df %>% 
  # Convert grouping columns to factor if they aren't already
  mutate_if(is.character, factor) %>% 
  group_by(sample, drug, .drop=FALSE) %>% 
  tally() %>% 
  spread(drug, n)

  sample drug1 drug2
1 s1         0     1
2 s2         1     1

Or, to keep the output in "long" format for further processing, stop before the spread:

df %>% 
  mutate_if(is.character, factor) %>% 
  group_by(sample, drug, .drop=FALSE) %>% 
  tally()

  sample drug      n
1 s1     drug1     0
2 s1     drug2     1
3 s2     drug1     1
4 s2     drug2     1

The code above ensures that all empty group combinations are preserved. However, if you're going to spread the data to a "wide" table anyway, you can fill in the missing combinations in the spread step with fill=0, without worrying about whether group_by preserves empty groups. Note that this only covers samples and drugs that each appear at least once in the data:

df %>% 
  group_by(sample, drug) %>% 
  tally() %>% 
  spread(drug, n, fill=0)
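For readers on a newer tidyr, the same fill-on-reshape idea can be sketched with pivot_wider, which supersedes spread. This is an assumption about your installed versions rather than part of the original answer: pivot_wider needs tidyr 1.0.0 or later, and the scalar values_fill and the names_sort argument need tidyr 1.1.0 or later.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df <- data.frame(
  sample = c("s1", "s2", "s2"),
  drug   = c("drug2", "drug1", "drug2")
)

wide <- df %>%
  count(sample, drug) %>%      # one row per observed sample/drug pair
  pivot_wider(
    names_from  = drug,        # one column per drug
    values_from = n,           # cell values are the counts
    values_fill = 0,           # combinations never observed become 0
    names_sort  = TRUE         # order drug columns alphabetically
  )

wide
```

As with spread(fill=0), this only creates columns for drugs that occur at least once, so it does not replace the factor-based .drop=FALSE approach when entirely unobserved levels must appear.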
