First up - I don´t think this is a duplicate post. I have found several great posts and webpages on transforming multiple columns from wide to long format, but none of them are similar to my data since they deal with several columns that are measures of the same thing (in my example that would be A1, A2, A3, A4), and also they do not already contain a long format variable (frame in my case).
Here is my question:
I am working with a dataset containing many variabels measured by two different motion capture systems. Currently my dataset is in wide format, but I´ve realized ggplot works much better with long format, and I therefore wish to transform my data.
Here is a very simplified version of my data:
id <- (rep(1:3, each = 3))
frame <- (rep(1:3, 3))
A1 <- runif(9, min =1, max =100)
B1 <- runif(9, min =1, max =10)
C1 <- runif(9, min =-10, max =10)
A2 <- rnorm(9, mean = A1, sd=1)
B2 <- rnorm(9, mean = B1, sd=1)
C2 <- rnorm(9, mean = C1, sd=1)
df_wide <- as.data.frame.matrix(cbind(id, frame, A1, B1, C1, A2, B2, C2))
rm(id, frame, A1, A2, B1, B2, C1, C2)
df_wide$id <- as.factor(df_wide$id)
df_wide$frame <- as.factor(df_wide$frame)
head(df_wide)
id frame A1 B1 C1 A2 B2 C2
1 1 1 50.940395 4.141713 -1.294736 51.324398 4.271260 0.6174782
2 1 2 33.117691 5.044080 1.820367 32.977860 5.506677 0.8811504
3 1 3 50.000625 8.584148 -1.294245 50.603195 8.099262 0.6418580
4 2 1 61.675927 5.269216 -6.002856 61.996378 6.186417 -6.5428624
5 2 2 5.514353 6.570010 5.199728 4.798275 4.955662 5.1502535
6 2 3 51.580086 5.683788 9.831663 50.717459 5.430070 10.9601541
A1 and A2, B1 and B2, are measures of the same type of movements (A and B) made by system 1 and system 2. Each patient has been measured several times as indicated by the frame variable.
I would like my data.frame to look like this:
id frame system A B C
1 1 1 1
2 1 1 2
3 1 2 1
4 1 2 2
5 1 3 1
6 1 3 2
I have two problems keeping me from solving this:
1) measurements between the two systems are not placed next to each other. Therefore I cant use code like this:
library(tidyr)
df_long <- gather(df_wide, System, A, A1:A2, factor_key=TRUE)
2) My dataset contains close to 120 variabels, so I would like a solution that wouldn't require me to write the code for each variable. I am considering making a loop to solve this, but any help in this regard will be much appreciated as well.