I'm trying to tidy my data in my R Script so that I can run some statistical analyses on the tidied data set.
One of the columns lists pairs (six possible values), each of which determines the order of types across three separate "blocks" of output values. A minimal reproducible dataset is below.
dput(head(data, 6))
structure(list(pairs = c("ABC", "ACB", "BAC", "BCA", "CBA", "CAB"), block1vals = c(1, 3, 5, 7, 9, 10), block2vals = c(4, 66, 34, 66, 21, 21), block3vals = c(53, 22, 12, 65, 21, 22)), .Names = c("pairs", "block1vals", "block2vals", "block3vals"), row.names = c(NA, 6L), class = "data.frame")
My code takes the pairs and labels each participant's type (A, B or C) for each block, with one column per block. This part works:
Block 1:
data$block1types <- sapply(data$pairs, function(x) {
  if (x == "ABC") { return("Type A") }
  if (x == "ACB") { return("Type A") }
  if (x == "BAC") { return("Type B") }
  if (x == "BCA") { return("Type B") }
  if (x == "CBA") { return("Type C") }
  if (x == "CAB") { return("Type C") }
})
Block 2:
data$block2types <- sapply(data$pairs, function(x) {
  if (x == "ABC") { return("Type B") }
  if (x == "ACB") { return("Type C") }
  if (x == "BAC") { return("Type A") }
  if (x == "BCA") { return("Type C") }
  if (x == "CBA") { return("Type B") }
  if (x == "CAB") { return("Type A") }
})
Block 3:
data$block3types <- sapply(data$pairs, function(x) {
  if (x == "ABC") { return("Type C") }
  if (x == "ACB") { return("Type B") }
  if (x == "BAC") { return("Type C") }
  if (x == "BCA") { return("Type A") }
  if (x == "CBA") { return("Type A") }
  if (x == "CAB") { return("Type B") }
})
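As a side note, in the three lookups above the type for block k is simply the k-th letter of pairs, so the same columns could probably be built more compactly. A minimal base R sketch, assuming every value of pairs is a three-letter ordering of A, B and C (the data frame is rebuilt here only so the snippet runs on its own):

```r
# Sample data copied from the dput() output above.
data <- data.frame(
  pairs = c("ABC", "ACB", "BAC", "BCA", "CBA", "CAB"),
  block1vals = c(1, 3, 5, 7, 9, 10),
  block2vals = c(4, 66, 34, 66, 21, 21),
  block3vals = c(53, 22, 12, 65, 21, 22),
  stringsAsFactors = FALSE
)

# Assumption: the k-th letter of `pairs` is the type seen in block k.
for (k in 1:3) {
  letter_k <- substr(data$pairs, k, k)
  data[[paste0("block", k, "types")]] <- paste("Type", letter_k)
}
```

Unlike the sapply() version, this returns plain (unnamed) character vectors, which should not matter for the reshaping step.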
What I am trying to do is to now reorganize the data so that there is a column with all "Type A" participant values (doesn't matter which block A was in) as well as one for "Type B" and one for "Type C."
So the ideal output is:
data$TypeA <- c(1, 3, 34, 65, 21, 21)
data$TypeB <- c(4, 22, 5, 7, 21, 22)
data$TypeC <- c(53, 66, 12, 66, 9, 10)
I cannot figure out how to do this without running into problems. My attempt was to create two vectors outside the data set, which I hoped I could then spread:
BlockTypes<- combine(data$block1types, data$block2types, data$block3types, .id = NULL)
BlockTotals<- combine(data$block1vals, data$block2vals, data$block3vals, .id = NULL)
I then tried to do this:
spread(data, key= BlockTypes, value=BlockTotals, fill = 0)
This failed with the error "var must evaluate to a single number or a column name, not a character vector." I think the bigger problem, though, is that I created those two vectors outside the data set, so spread could not use them as columns. So I am a bit stuck on how to do this if the combine function cannot be used with a tibble.
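One possible way to keep everything inside the data frame is to reshape to long format first and then spread back out by type. The following is only a sketch, not a confirmed solution. It assumes tidyr 1.0.0 or later (pivot_longer() and pivot_wider() take the place of gather() and spread()), it assumes the type for block k is the k-th letter of pairs, and the id column plus the names long and wide are introduced purely for illustration:

```r
library(dplyr)
library(tidyr)

# Sample data copied from the dput() output above.
data <- data.frame(
  pairs = c("ABC", "ACB", "BAC", "BCA", "CBA", "CAB"),
  block1vals = c(1, 3, 5, 7, 9, 10),
  block2vals = c(4, 66, 34, 66, 21, 21),
  block3vals = c(53, 22, 12, 65, 21, 22),
  stringsAsFactors = FALSE
)

# Long format: one row per participant and block.
long <- data %>%
  mutate(id = row_number()) %>%            # hypothetical participant id
  pivot_longer(
    cols = starts_with("block"),
    names_to = "block",
    names_pattern = "block(\\d)vals",      # keep only the block number
    values_to = "value"
  ) %>%
  mutate(
    block = as.integer(block),
    # Assumption: the letter at position `block` in `pairs` is the type.
    type = paste0("Type", substr(pairs, block, block))
  )

# Wide format: one column per type, one row per participant.
wide <- long %>%
  select(id, pairs, type, value) %>%
  pivot_wider(names_from = type, values_from = value)
```

The id column is there because pairs alone may not identify a participant once the full data set has more than one row per ordering.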