-1

I have a dataframe where all information is in one row. see picture below

untidy data

I need to change it to something like this tidy data

so the first value (suffix_name) in the row should be changed to a variable and the second value (none) should be first value of new variable (suffix_name) please see images

shabuya
  • 11
  • 1
  • 1
    It doesn't look like it is a data.frame yet, but a filtered excel file. Why don't you `saveas` your excel file as a .csv and read it with read.csv? – Chris Oct 07 '22 at 21:52
  • Please make a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) or [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) with a sample input and your expected output. [Please do not upload images of code/data/errors when asking a question.](//meta.stackoverflow.com/q/285551) Based on your input we can understand your problem and think about a possible solution and verify it compared to your expected output. – Martin Gal Oct 07 '22 at 22:41

1 Answers1

0

with the following code you can split the information of the row. You need to use the library data.table to subset the vars or values columns.

library(data.table)
data_raw.dt <- data.table(
       V2_00011 = "'SUFFIX_NAME'",
       V2_00012 = 'NONE}}',
       V2_00013 = "'PATIENT_ID'",
       V2_00014 = "'CZMIl1844982497'",
       V2_00015 = "'BIRTH_DATE'",
       V2_00016 = "'1987-01-01'",
       V2_00017 = "'GENDER'",
       V2_00018 = "'Unknown'",
       V2_00019 = "'OBSCURITY_LEVEL'",
       V2_00020 = "'Normal'")

vars <- seq(1, ncol(data_raw.dt), by = 2)
vals <- seq(2, ncol(data_raw.dt), by = 2)

data_ref.dt <- data.table(matrix(data_raw.dt[, ..vals], ncol = length(vals)))
names(data_ref.dt) <- paste(data_raw.dt[, ..vars])

Here you can see the results.

print(data_raw.dt)
V2_00011 V2_00012     V2_00013          V2_00014     V2_00015     V2_00016 V2_00017  V2_00018          V2_00019 V2_00020
'SUFFIX_NAME'   NONE}} 'PATIENT_ID' 'CZMIl1844982497' 'BIRTH_DATE' '1987-01-01' 'GENDER' 'Unknown' 'OBSCURITY_LEVEL' 'Normal'

print(data_ref.dt)
'SUFFIX_NAME'      'PATIENT_ID' 'BIRTH_DATE'  'GENDER' 'OBSCURITY_LEVEL'
NONE}} 'CZMIl1844982497' '1987-01-01' 'Unknown'          'Normal'