
I am very new to R and am trying to reshape a large data frame (several million records) from wide to long format for analysis.

The dataframe looks like:

ID   Dx1   Dx2   Dx3 .....Dx33
1    12    11    0 
2    3     1     2
3    44    9     6  

I need to convert it to look like this:

ID   Dx
1    12
1    11
1    0
2    3
2    1

etc.

Assistance with this would be greatly appreciated. Thanks!

In case anyone is interested: the data frame has a subject ID number and 1 to 33 associated ICD-10 codes (diagnosis codes) per subject. I plan to use the comorbidity package (https://github.com/ellessenne/comorbidity) to calculate Elixhauser scores for each subject. The package requires the data to be in this long format.

RROBINSON
  • Try `library(tidyr); library(dplyr); gather(df1, key, Dx, -ID) %>% select(-key)` – akrun Apr 22 '18 at 15:56
  • Some optimization and tidiness can be achieved by knowing that wide and long diagnostic data have particular characteristics. The CRAN package [icd](https://cran.r-project.org/package=icd) of which I'm the author, offers `icd::wide_to_long` and `icd::long_to_wide`. They actually use base R internally, but do the job quickly, accounting for the typical structure of diagnostic data. – Jack Wasey Jan 13 '19 at 19:00
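
For reference, the reshape asked about in the question can also be sketched in base R with no extra packages. This is only a minimal illustration, not the method used by either package mentioned above. It assumes the data frame is called `df1` (the name used in the first comment), that the diagnosis columns are named `Dx1`, `Dx2`, and so on, and it uses a three-column toy version of the example data in place of the real 33 columns.

```r
# Toy data mirroring the layout in the question (3 Dx columns instead of 33).
# The name df1 and the Dx<number> column pattern are assumptions.
df1 <- data.frame(
  ID  = c(1, 2, 3),
  Dx1 = c(12, 3, 44),
  Dx2 = c(11, 1, 9),
  Dx3 = c(0, 2, 6)
)

# Pick out the diagnosis columns by name.
dx_cols <- grep("^Dx[0-9]+$", names(df1), value = TRUE)

# Transposing the Dx matrix and flattening it reads the values subject by
# subject, so each ID is simply repeated once per diagnosis column.
long <- data.frame(
  ID = rep(df1$ID, each = length(dx_cols)),
  Dx = as.vector(t(as.matrix(df1[dx_cols])))
)

print(long)
```

Each subject contributes one row per diagnosis column, in the original column order. On several million records the intermediate matrix can use a lot of memory, so the package-based approaches from the comments may be a better fit at that scale.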
