mydata <- data.frame(id = c(1,1,1,2,2,3,4),
                     hobby = c("music", "sports", "science", "science", "lifestyle", 
                               "party", "sports"),
                     x = c(10, 10, 10, 23, 23, 11, 0),
                     y = c(78, 78, 78, 55, 55, 22, 9))
> mydata
  id     hobby  x  y
1  1     music 10 78
2  1    sports 10 78
3  1   science 10 78
4  2   science 23 55
5  2 lifestyle 23 55
6  3     party 11 22
7  4    sports  0  9

I would like to reshape the above data.frame to a wide format, with one indicator column per hobby (1 if the id has that hobby, 0 otherwise):

  id music sports science lifestyle party  x  y
1  1     1      1       1         0     0 10 78
2  2     0      0       1         1     0 23 55
3  3     0      0       0         0     1 11 22
4  4     0      1       0         0     0  0  9

What's an efficient way of doing this in R if there are many different categories of hobby?

Adrian

1 Answer

library(dplyr)
library(tidyr)
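mydata %>% 
  # add a constant column to serve as the indicator value
  mutate(value = 1) %>%
  # one column per hobby; id/hobby combinations that do not occur are filled with 0
  pivot_wider(names_from = "hobby", values_from = value, values_fill = 0)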
# A tibble: 4 × 8
#      id     x     y music sports science lifestyle party
#   <dbl> <dbl> <dbl> <dbl>  <dbl>   <dbl>     <dbl> <dbl>
# 1     1    10    78     1      1       1         0     0
# 2     2    23    55     0      0       1         1     0
# 3     3    11    22     0      0       0         0     1
# 4     4     0     9     0      1       0         0     0
Gregor Thomas
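
For comparison, the same reshape can probably be done without extra packages. The following is only a base R sketch, not part of the answer above. It assumes that x and y are constant within each id (as they are in the example data), and the names `indicators`, `id_vars` and `wide` are made up for illustration. Note that `table()` counts occurrences, so a repeated id/hobby pair would give a value greater than 1, and the hobby columns come out in alphabetical order rather than in order of appearance.

```r
mydata <- data.frame(id = c(1,1,1,2,2,3,4),
                     hobby = c("music", "sports", "science", "science", "lifestyle",
                               "party", "sports"),
                     x = c(10, 10, 10, 23, 23, 11, 0),
                     y = c(78, 78, 78, 55, 55, 22, 9))

# Cross-tabulate id against hobby; each cell counts how often the pair occurs
indicators <- as.data.frame.matrix(table(mydata$id, mydata$hobby))

# The ids end up as row names; turn them back into a regular column
indicators$id <- as.numeric(rownames(indicators))

# One row per id with its x and y (assumes x and y do not vary within an id)
id_vars <- unique(mydata[, c("id", "x", "y")])

# Join the indicator columns back onto the per-id rows
wide <- merge(id_vars, indicators, by = "id")
wide
```

With many hobby categories this should still be a single `table()` call, since the columns are derived from the values of `hobby` rather than listed by hand.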