With the following data frame, I would like to create one new column per value of the "Type" column using 'mutate', where each column holds the number of times that value appears. The counts should be grouped by "Group" and "Choice".

Over time, new values that aren't already listed will be added to the "Type" column, so the code should handle them without being changed.

Is this possible using the dplyr library?

library(dplyr)

df <- data.frame(Group = c("A","A","A","B","B","C","C","D","D","D","D","D"),
             Choice = c("Yes","Yes","No","No","Yes","Yes","Yes","Yes","No","No","No","No"),
             Type = c("Fruit","Construction","Fruit","Planes","Fruit","Trips","Construction","Cars","Trips","Fruit","Planes","Trips"))

The desired result should be the following:

result <- data.frame(Group = c("A","A","B","B","C","D","D"),
                 Choice = c("No","Yes","No","Yes","Yes","Yes","No"),
                 Fruit = c(1,1,0,1,0,0,1),
                 Construction = c(0,1,0,0,1,0,0),
                 Planes = c(0,0,1,0,0,0,1),
                 Trips = c(0,0,0,0,1,0,2),
                 Cars = c(0,0,0,0,0,1,0))
Dfeld

1 Answer

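We can count the rows for each Group/Choice/Type combination and then spread the counts into one column per Type, filling the missing combinations with 0. Because the columns are generated from whatever values are present in Type, new values get their own column without any change to the code: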

library(tidyverse)
df %>% 
   count(Group, Choice, Type) %>%
   spread(Type, n, fill = 0)
# A tibble: 7 x 7
#  Group Choice  Cars Construction Fruit Planes Trips
#  <fct> <fct>  <dbl>        <dbl> <dbl>  <dbl> <dbl>
#1 A     No         0            0     1      0     0
#2 A     Yes        0            1     1      0     0
#3 B     No         0            0     0      1     0
#4 B     Yes        0            0     1      0     0
#5 C     Yes        0            1     0      0     1
#6 D     No         0            0     1      1     2
#7 D     Yes        1            0     0      0     0
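
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The following is a minimal sketch of the same count-then-widen approach, assuming a tidyr version that provides pivot_wider(); the name `wide` is only illustrative. The column order may differ from the spread() output above, since pivot_wider() orders the new columns by first appearance rather than alphabetically.

```r
library(dplyr)
library(tidyr)

df <- data.frame(Group = c("A","A","A","B","B","C","C","D","D","D","D","D"),
                 Choice = c("Yes","Yes","No","No","Yes","Yes","Yes","Yes","No","No","No","No"),
                 Type = c("Fruit","Construction","Fruit","Planes","Fruit","Trips","Construction","Cars","Trips","Fruit","Planes","Trips"))

wide <- df %>%
  # one row per Group/Choice/Type combination, with the count in column n
  count(Group, Choice, Type) %>%
  # one column per distinct Type value; combinations that never occur become 0
  pivot_wider(names_from = Type, values_from = n, values_fill = list(n = 0L))

print(wide)
```

As with spread(), nothing in the pipeline names the individual Type values, so a value that shows up later in the data should simply produce an additional column.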
akrun