0

I would like to reshape my long dataframe into wide format with tidyverse::spread. What should I do to include all the variables for transformation.

My input:

df <- tibble::tribble(
  ~respid, ~member_id, ~gender, ~edu,  ~dob,
     100L,  1L,      1L,   3L, 1978L,
     100L,  2L,      1L,   3L, 1980L,
     200L,  1L,      1L,   4L, 1974L,
     200L,  2L,      2L,   5L, 1955L,
     300L,  1L,      2L,   3L, 1998L,
     300L,  2L,      1L,   4L, 1999L,
     300L,  3L,      2L,   3L, 2001L
  )

desire output:

output <- tibble::tribble(
  ~respid, ~gender_1, ~edu_1, ~dob_1, ~gender_2, ~edu_2, ~dob_2, ~gender_3, ~edu_3, ~dob_3,
     100L,        1L,     3L,  1978L,        1L,     3L,  1980L,        NA,     NA,     NA,
     200L,        1L,     4L,  1974L,        2L,     5L,  1955L,        NA,     NA,     NA,
     300L,        2L,     3L,  1998L,        1L,     4L,  1999L,        2L,     3L,  2001L
  )

here I tried to make it, but row_number() dose not look right.

df %>%
  group_by(member_id) %>% 
  mutate(t1 = paste0("gender_" , row_number())) %>%
  spread(t1, gender)
Ann
  • 328
  • 1
  • 4
  • 14

1 Answers1

4

You can do:

df %>%
 gather(var, val, -c(respid, member_id)) %>%
 mutate(var = paste(var, member_id, sep = "_")) %>%
 select(-member_id) %>%
 spread(var, val)

  respid dob_1 dob_2 dob_3 edu_1 edu_2 edu_3 gender_1 gender_2 gender_3
   <int> <int> <int> <int> <int> <int> <int>    <int>    <int>    <int>
1    100  1978  1980    NA     3     3    NA        1        1       NA
2    200  1974  1955    NA     4     5    NA        1        2       NA
3    300  1998  1999  2001     3     4     3        2        1        2

First, it is transforming the data from wide to long format. Second, it creates the new variable names. Finally, it returns it back to wide format.

Or using reshape2:

dcast(melt(df, id.vars = c("respid", "member_id")), respid~variable+member_id, value.var = "value")

  respid gender_1 gender_2 gender_3 edu_1 edu_2 edu_3 dob_1 dob_2 dob_3
1    100        1        1       NA     3     3    NA  1978  1980    NA
2    200        1        2       NA     4     5    NA  1974  1955    NA
3    300        2        1        2     3     4     3  1998  1999  2001
tmfmnk
  • 38,881
  • 4
  • 47
  • 67