Adding values for missing rows based on a column value

Question

I have a data frame:

things <- data.frame( category = c("A","B","A","B","B","A","B"),
               things2do = c("ball","ball","bat","bat","hockey","volley ball","foos ball"),
                  number = c(12,5,4,1,2,1,1))

Now I want to add rows with number = 0 wherever a particular combination of category and things2do is missing. For example, a new row with "A", "hockey" and 0 should be added, and likewise for the missing "volley ball" and "foos ball" combinations.

I hope I can get some help here.

score 3 · Accepted Answer · answered Jun 08 '17 at 05:58

tidyr's complete() function does this:

library(tidyr)

things %>%
    complete(category, things2do, fill = list(number = 0))

Output:

# A tibble: 10 x 3
   category   things2do number
     <fctr>      <fctr>  <dbl>
 1        A        ball     12
 2        A         bat      4
 3        A   foos ball      0
 4        A      hockey      0
 5        A volley ball      1
 6        B        ball      5
 7        B         bat      1
 8        B   foos ball      1
 9        B      hockey      2
10        B volley ball      0

akrun · Answer 2 · 2017-06-08T06:30:30.540

We can do this with expand.grid and merge from base R:

d1 <- merge(expand.grid(category = unique(things$category), 
        things2do = unique(things$things2do)), things, all.x = TRUE)

d1$number[is.na(d1$number)] <- 0
d1
#   category   things2do number
#1         A        ball     12
#2         A         bat      4
#3         A   foos ball      0
#4         A      hockey      0
#5         A volley ball      1
#6         B        ball      5
#7         B         bat      1
#8         B   foos ball      1
#9         B      hockey      2
#10        B volley ball      0

NOTE: This does not use any external packages.
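If this pattern comes up often, the expand.grid and merge steps above could be wrapped in a small function. The following is only a sketch: fill_missing_pairs() and its argument names are hypothetical, not part of base R or any package, and it assumes exactly two key columns and one value column.

```r
# Sample data from the question
things <- data.frame(
  category  = c("A", "B", "A", "B", "B", "A", "B"),
  things2do = c("ball", "ball", "bat", "bat", "hockey", "volley ball", "foos ball"),
  number    = c(12, 5, 4, 1, 2, 1, 1)
)

# Hypothetical helper (illustrative name, not an existing function)
fill_missing_pairs <- function(df, key1, key2, value, fill = 0) {
  # Every combination of the distinct values in the two key columns
  grid <- expand.grid(unique(df[[key1]]), unique(df[[key2]]),
                      stringsAsFactors = FALSE)
  names(grid) <- c(key1, key2)
  # Left join: keep every combination, NA where df had no matching row
  out <- merge(grid, df, by = c(key1, key2), all.x = TRUE)
  # Note: this also overwrites any NA already present in the value column
  out[[value]][is.na(out[[value]])] <- fill
  out
}

d2 <- fill_missing_pairs(things, "category", "things2do", "number")
d2
```

Passing the column names as strings keeps the sketch free of non-standard evaluation; the fill argument defaults to 0 to match the question but can be any value of a compatible type.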
