
I have the following data frame:

 STUDYID USUBJID        IDVAR     IDVARVAL                                    
  <chr>   <chr>             <chr>     <chr>                                       
1 study1  1                 DSSEQ     3                                           
2 study1  1                 DSSTINV   N                                           
3 study1  1                 DSDECOD1  SCREEN FAILURE                              
4 study1  2                 DSSEQ     1                                           
5 study1  2                 DSDECOD2  ADVERSE EVENT 

And I want to transpose it to the following format:

 STUDYID USUBJID    DSSEQ   DSSTINV   DSDECOD1    DSDECOD2                                    
1 study1  1           3       N       SCREEN FAILURE                              
4 study1  2           1                            ADVERSE EVENT

I used:

supp_ds <- dcast(suppdsT, STUDYID + USUBJID ~ IDVAR, value.var="IDVARVAL")

but it gave me something like:

  STUDYID USUBJID DSDECOD1 DSDECOD2 DSDECOD3 DSDECOD4 DSDECOD7
1 study1   1        1        0        0        0        0
2 study2   2        0        0        0        0
ThomasIsCoding
sunflower
  • Does this answer your question? [How to reshape data from long to wide format](https://stackoverflow.com/questions/5890584/how-to-reshape-data-from-long-to-wide-format) – camille Mar 19 '20 at 22:26

1 Answer


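There could be duplicates in the dataset, i.e. more than one row for the same combination of STUDYID, USUBJID and IDVAR. In that case dcast cannot place a single value in each cell, so by default it uses length as the fun.aggregate and returns counts instead of the values. Instead, we can create a sequence column so that each identifier is unique: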

library(data.table)
dcast(setDT(suppdsT), STUDYID + USUBJID ~ IDVAR + rowid(USUBJID), 
                 value.var = 'IDVARVAL')
#   STUDYID USUBJID     DSDECOD1_3    DSDECOD2_2 DSSEQ_1 DSSTINV_2
#1:  study1       1 SCREEN FAILURE          <NA>       3         N
#2:  study1       2           <NA> ADVERSE EVENT       1      <NA>
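
If the goal is to keep the original IDVAR names as column names even when a key repeats, one possible variation is to number the occurrences within each STUDYID/USUBJID/IDVAR key and put that counter on the left-hand side of the formula. This is only a sketch: the data below is made up to contain a repeated key, and the column name occ is a hypothetical helper, not something from the question.

```r
library(data.table)

# Hypothetical data with a repeated STUDYID/USUBJID/IDVAR key (not from the question)
dt <- data.table(
  STUDYID  = "study1",
  USUBJID  = c(1L, 1L, 1L, 2L),
  IDVAR    = c("DSSEQ", "DSDECOD1", "DSDECOD1", "DSSEQ"),
  IDVARVAL = c("3", "SCREEN FAILURE", "ADVERSE EVENT", "1")
)

# Number the occurrences within each key: 1 for the first row, 2 for a repeat
dt[, occ := rowid(STUDYID, USUBJID, IDVAR)]

# Every cell now maps to at most one row, so no aggregation function is applied
wide <- dcast(dt, STUDYID + USUBJID + occ ~ IDVAR, value.var = "IDVARVAL")
print(wide)
```

Here a repeated value ends up on an extra row for the same subject rather than in a suffixed column, which may or may not be the layout you want.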

data

suppdsT <- structure(list(STUDYID = c("study1", "study1", "study1", "study1", 
"study1"), USUBJID = c(1L, 1L, 1L, 2L, 2L), IDVAR = c("DSSEQ", 
"DSSTINV", "DSDECOD1", "DSSEQ", "DSDECOD2"), IDVARVAL = c("3", 
"N", "SCREEN FAILURE", "1", "ADVERSE EVENT")), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5"))
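
To see whether duplicates are actually the cause, it may help to count the rows per key before reshaping. The sketch below rebuilds the five example rows from the question as a data.table (the names dups and wide are just illustrative). On this sample no key repeats, so a plain dcast keeps the IDVAR values as column names; on the full dataset, any rows left in dups would explain the counts.

```r
library(data.table)

# Example rows copied from the question
suppdsT <- data.table(
  STUDYID  = "study1",
  USUBJID  = c(1L, 1L, 1L, 2L, 2L),
  IDVAR    = c("DSSEQ", "DSSTINV", "DSDECOD1", "DSSEQ", "DSDECOD2"),
  IDVARVAL = c("3", "N", "SCREEN FAILURE", "1", "ADVERSE EVENT")
)

# Count rows per STUDYID/USUBJID/IDVAR; a count above 1 marks a duplicated key
dups <- suppdsT[, .N, by = .(STUDYID, USUBJID, IDVAR)][N > 1]
print(dups)

# With unique keys, dcast fills each cell with the value itself (NA where missing)
wide <- dcast(suppdsT, STUDYID + USUBJID ~ IDVAR, value.var = "IDVARVAL")
print(wide)
```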
akrun