Cleaning wide data by rearranging columns by variable and then ordering by time

Question

I have a wide data set that I want to rearrange. It looks something like this:

patientid <- c(100,101,102)
variablea_1 <- c(1,1,1)
variableb_1 <- c(1,1,1)
variablec_1 <- c(1,1,1)
varibaled_1 <- c(1,1,1)
variablea_2 <- c(1,1,1)
variableb_2 <- c(1,1,1)
variablec_2 <- c(1,1,1)
varibaled_2 <- c(1,1,1)
variablea_3 <- c(1,1,1)
variableb_3 <- c(1,1,1)
variablec_3 <- c(1,1,1)
varibaled_3 <- c(1,1,1)


Data <- data.frame(patientid = patientid,
                   variablea_1 = variablea_1, variableb_1 = variableb_1, variablec_1 = variablec_1, varibaled_1 = varibaled_1,
                   variablea_2 = variablea_2, variableb_2 = variableb_2, variablec_2 = variablec_2, varibaled_2 = varibaled_2,
                   variablea_3 = variablea_3, variableb_3 = variableb_3, variablec_3 = variablec_3, varibaled_3 = varibaled_3)

The data itself is unimportant. I am looking for a way to rearrange the columns so that the columns for the same variable are grouped together and, within each group, ordered by time point. (In my actual dataset I have 80 variables at three time points, so 240 columns in total.) The desired output would look like this:

Data <- data.frame(patientid = patientid,
                   variablea_1 = variablea_1, variablea_2 = variablea_2, variablea_3 = variablea_3,
                   variableb_1 = variableb_1, variableb_2 = variableb_2, variableb_3 = variableb_3,
                   variablec_1 = variablec_1, variablec_2 = variablec_2, variablec_3 = variablec_3,
                   varibaled_1 = varibaled_1, varibaled_2 = varibaled_2, varibaled_3 = varibaled_3)

FYI, code blocks use *either* of: (1) "code-fence" of three backticks (`\`\`\``) before and after the block of code, *each on a line of its own* (no code); the first code-fence can optionally include a language hint, R's would be `\`\`\`lang-r\nsomecode\n\`\`\``; or (2) indent (no code hint). There are times to use both indentation and three-backticks, but in general it is not needed. Never are three backticks on the same line as code. See https://stackoverflow.com/editing-help for reference. — r2evans, Dec 17 '20 at 17:13
`Data[, c(1, order(names(Data)[-1], nchar(names(Data)[-1])) + 1)]` perhaps? — A5C1D2H2I1M1N2O1R2T1, Dec 17 '20 at 17:18
I'd consider this a duplicate of https://stackoverflow.com/q/17531403/1270695, with the exception of wanting to use the result of sorting to reorder a `data.frame`. The only other difference is the ` + 1` adjustment that's being used to adjust for the id column, which is not included in the ordering. — A5C1D2H2I1M1N2O1R2T1, Dec 17 '20 at 17:22

score 0 · Answer 1 · answered Dec 17 '20 at 17:10

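Here is an option: strip everything up to the underscore from the column names to get the time point, use `ave` to number the columns within each time point, and then `order` on that sequence to reorder the columns. This relies on the variables appearing in the same order within every time point, as they do in the example.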

v1 <- as.numeric(sub(".*_", "", names(Data)[-1]))
Data1 <- Data[c(1, order(ave(v1, v1, FUN = seq_along)) + 1)]
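
To see why this works, it may help to look at the intermediate values on the example data. The following is only a sketch: it rebuilds the data frame with the question's column names (the values are placeholders), and the names `pos` and `idx` are introduced here purely for illustration. It assumes, as in the question, that the variables appear in the same order within every time point.

```r
# Rebuild the example data; the values are placeholders, as in the question
Data <- data.frame(
  patientid   = c(100, 101, 102),
  variablea_1 = 1, variableb_1 = 1, variablec_1 = 1, varibaled_1 = 1,
  variablea_2 = 1, variableb_2 = 1, variablec_2 = 1, varibaled_2 = 1,
  variablea_3 = 1, variableb_3 = 1, variablec_3 = 1, varibaled_3 = 1
)

# Time point of each non-id column: 1 1 1 1 2 2 2 2 3 3 3 3
v1 <- as.numeric(sub(".*_", "", names(Data)[-1]))

# Position of each column within its time point: 1 2 3 4 1 2 3 4 1 2 3 4
# (pos is a hypothetical helper name, not part of the original answer)
pos <- ave(v1, v1, FUN = seq_along)

# order() leaves ties in their original order, so columns that share a
# position (i.e. the same variable) stay sorted by time point
idx <- order(pos) + 1   # + 1 skips the id column
Data1 <- Data[c(1, idx)]
names(Data1)
```

If the columns are not laid out in the same order for every time point, this positional trick would group the wrong columns together; in that case ordering on the names themselves, as suggested in the comments, is probably the safer route.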
