I need to convert a wide dataset to long format: 16 columns must collapse into 4. Each group of 4 columns holds related information, and that relationship must not be "lost" in the transformation.
I have data from a ranking task with four blocks, which has given me a dataset where the information is divided into four groups in wide format, i.e. first_image, first_sex, first_score, second_image, second_sex, second_score...
I have tried various combinations of group_by and gather() but I'm nowhere close.
I've already read "Reshaping multiple sets of measurement columns (wide format) into single columns (long format)", but I'm none the wiser, I'm afraid.
I've made some sample data showing what one participant's data looks like, and a sample of how I would like the result to look.
library(tidyverse)
sample_dat <- data.frame(subject_id = rep("sj1", 4),
                         first_pick = rep(1, 4),
                         first_image_pick = c("a", "b", "c", "d"),
                         first_pick_neuro = rep("TD", 4),
                         first_pick_sex = rep("F", 4),
                         second_pick = rep(2, 4),
                         second_image_pick = c("e", "f", "g", "h"),
                         second_pick_neuro = rep("TD", 4),
                         second_pick_sex = rep("M", 4),
                         third_pick = rep(3, 4),
                         third_image_pick = c("i", "j", "k", "l"),
                         third_pick_neuro = rep("DS", 4),
                         third_pick_sex = rep("F", 4),
                         fourth_pick = rep(4, 4),
                         fourth_image_pick = c("m", "n", "o", "p"),
                         fourth_pick_neuro = rep("DS", 4),
                         fourth_pick_sex = rep("M", 4))
Expected output:
final_data <- data.frame(subject_id = rep("sj1", 16),
                         image = c("a", "b", "c", "d",
                                   "e", "f", "g", "h",
                                   "i", "j", "k", "l",
                                   "m", "n", "o", "p"),
                         rank = rep(c(1, 2, 3, 4), each = 4), # from the numbers in first_pick, second_pick, etc.
                         neuro = rep(c("TD", "DS"), each = 8),
                         sex = rep(c("F", "M", "F", "M"), each = 4))
So far I've tried this, but it only duplicates all the other information:
sample_dat_long <- sample_dat %>%
  group_by(subject_id) %>%
  gather(Pick, Image,
         first_image_pick,
         second_image_pick,
         third_image_pick,
         fourth_image_pick)
So essentially I don't want to lose the information for each image (pick, sex, neuro) when I gather my data.
Any help would be amazing!
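For reference, here is a minimal sketch of one direction that might work. It is not a definitive answer. It assumes tidyr >= 1.0.0 (for pivot_longer() and the ".value" sentinel) and dplyr >= 1.0.0 (for rename_with()). The helper tidy_names() and the intermediate column name "position" are hypothetical names made up for this illustration. The idea is to first rename the columns to a regular "<position>_<variable>" pattern and then reshape all four groups in one step, so that image, rank, neuro and sex stay together row by row.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (one participant, four trials).
sample_dat <- data.frame(
  subject_id = rep("sj1", 4),
  first_pick = rep(1, 4),
  first_image_pick = c("a", "b", "c", "d"),
  first_pick_neuro = rep("TD", 4),
  first_pick_sex = rep("F", 4),
  second_pick = rep(2, 4),
  second_image_pick = c("e", "f", "g", "h"),
  second_pick_neuro = rep("TD", 4),
  second_pick_sex = rep("M", 4),
  third_pick = rep(3, 4),
  third_image_pick = c("i", "j", "k", "l"),
  third_pick_neuro = rep("DS", 4),
  third_pick_sex = rep("F", 4),
  fourth_pick = rep(4, 4),
  fourth_image_pick = c("m", "n", "o", "p"),
  fourth_pick_neuro = rep("DS", 4),
  fourth_pick_sex = rep("M", 4)
)

# Hypothetical helper: turn the irregular names into "<position>_<variable>",
# e.g. first_pick -> first_rank, first_image_pick -> first_image,
# first_pick_neuro -> first_neuro, first_pick_sex -> first_sex.
tidy_names <- function(x) {
  x <- sub("^(first|second|third|fourth)_pick$", "\\1_rank", x)
  x <- sub("_image_pick$", "_image", x)
  x <- sub("_pick_(neuro|sex)$", "_\\1", x)
  x
}

long <- sample_dat %>%
  rename_with(tidy_names, .cols = -subject_id) %>%
  # ".value" tells pivot_longer() to use the second name piece as the
  # output column name, so each group of four columns becomes one row.
  pivot_longer(-subject_id,
               names_to = c("position", ".value"),
               names_sep = "_") %>%
  arrange(rank) %>%
  select(subject_id, image, rank, neuro, sex)
```

With the sample data this should give 16 rows and the five columns subject_id, image, rank, neuro and sex. The gather() attempt only stacks the image columns and leaves the other twelve columns wide, which is why their values end up repeated on every row.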