I have some data that is not tidy. It has two nested repeated measures (Q1/Q2 nested within Constructs). I'd like to move it from wide to long format.

##    id time Q1..Ask Q2..Ask Q1..Tell Q2..Tell Q1..Respond Q2..Respond
## 1   1  pre       1       1        1        1           0           0
## 2   2  pre       0       1        1        0           0           1
## 3   3  pre       0       0        1        0           0           0
## 4   4  pre       1       1        0        1           1           0
## 5   5  pre       0       0        0        0           0           0
## 6   1 post       0       0        1        1           0           1
## 7   2 post       0       0        1        1           0           0
## 8   3 post       0       0        0        1           0           0
## 9   4 post       1       0        1        1           0           0
## 10  5 post       0       1        0        1           1           1

Here question 1 and question 2 (Q1 & Q2) are two different questions aimed at the same construct, so `Q1..Ask` and `Q2..Ask` are the scores for questions 1 and 2 targeting the Ask construct. Using tidyr, how can I turn the Q1/Q2 prefix into a Question column and the latter part of each column header into a Construct column, with the values in a Score column?

# MWE

if (!require("pacman")) install.packages("pacman")
pacman::p_load(dplyr, tidyr)

set.seed(10)
dat <- tibble(  # tibble() replaces the deprecated data_frame(); dplyr re-exports it
    id = c(1:5, 1:5),
    time = rep(c("pre", "post"), each = 5),
    Q1..Ask = sample(0:1, 10, TRUE),
    Q2..Ask = sample(0:1, 10, TRUE),
    Q1..Tell = sample(0:1, 10, TRUE),
    Q2..Tell = sample(0:1, 10, TRUE),
    Q1..Respond = sample(0:1, 10, TRUE),
    Q2..Respond = sample(0:1, 10, TRUE)
)

# Code to make it long format without tidyr

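# x and y are the column indices of the Q1 and Q2 columns for one construct
Map(function(x, y) {

    tibble(
        ID = rep(dat[["id"]], 2),
        Time = rep(dat[["time"]], 2),
        Question = rep(c("Q1", "Q2"), each = 10),
        # strip the "Q1.." / "Q2.." prefix to get the construct name
        Construct = rep(gsub("Q[12]\\.+", "", colnames(dat)[x]), 20),
        Score = c(dat[[x]], dat[[y]])
    )

}, c(3, 5, 7), c(4, 6, 8)) %>%
    bind_rows()  # bind_rows() replaces the removed rbind_all()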

# Desired Output

##    ID Time Question Construct Score
## 1   1  pre       Q1       Ask     1
## 2   2  pre       Q1       Ask     0
## 3   3  pre       Q1       Ask     0
## 4   4  pre       Q1       Ask     1
## 5   5  pre       Q1       Ask     0
## 6   1 post       Q1       Ask     0
## 7   2 post       Q1       Ask     0
## 8   3 post       Q1       Ask     0
## 9   4 post       Q1       Ask     1
## 10  5 post       Q1       Ask     0
## 11  1  pre       Q2       Ask     1
## 12  2  pre       Q2       Ask     1
## 13  3  pre       Q2       Ask     0
## 14  4  pre       Q2       Ask     1
## 15  5  pre       Q2       Ask     0
## 16  1 post       Q2       Ask     0
## 17  2 post       Q2       Ask     0
## 18  3 post       Q2       Ask     0
## 19  4 post       Q2       Ask     0
## 20  5 post       Q2       Ask     1
## 21  1  pre       Q1      Tell     1
## 22  2  pre       Q1      Tell     1
## 23  3  pre       Q1      Tell     1
## 24  4  pre       Q1      Tell     0
## 25  5  pre       Q1      Tell     0
## 26  1 post       Q1      Tell     1
## 27  2 post       Q1      Tell     1
## 28  3 post       Q1      Tell     0
## 29  4 post       Q1      Tell     1
## 30  5 post       Q1      Tell     0
## 31  1  pre       Q2      Tell     1
## 32  2  pre       Q2      Tell     0
## 33  3  pre       Q2      Tell     0
## 34  4  pre       Q2      Tell     1
## 35  5  pre       Q2      Tell     0
## 36  1 post       Q2      Tell     1
## 37  2 post       Q2      Tell     1
## 38  3 post       Q2      Tell     1
## 39  4 post       Q2      Tell     1
## 40  5 post       Q2      Tell     1
## 41  1  pre       Q1   Respond     0
## 42  2  pre       Q1   Respond     0
## 43  3  pre       Q1   Respond     0
## 44  4  pre       Q1   Respond     1
## 45  5  pre       Q1   Respond     0
## 46  1 post       Q1   Respond     0
## 47  2 post       Q1   Respond     0
## 48  3 post       Q1   Respond     0
## 49  4 post       Q1   Respond     0
## 50  5 post       Q1   Respond     1
## 51  1  pre       Q2   Respond     0
## 52  2  pre       Q2   Respond     1
## 53  3  pre       Q2   Respond     0
## 54  4  pre       Q2   Respond     0
## 55  5  pre       Q2   Respond     0
## 56  1 post       Q2   Respond     1
## 57  2 post       Q2   Respond     0
## 58  3 post       Q2   Respond     0
## 59  4 post       Q2   Respond     0
## 60  5 post       Q2   Respond     1
Tyler Rinker
  • Why do you want to / need to use tidyr? Also tidy probably isn't an appropriate tag. – Dason Apr 06 '15 at 14:39
  • @Dason fixed the tag and I am trying to streamline packages in workflow. I was hoping there was a `tidyr` approach that was comparable to `reshape` in base. – Tyler Rinker Apr 06 '15 at 14:55

1 Answer

Try

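library(tidyr)
# stack every column except id and time into Var/Score pairs
gather(dat, Var, Score, -id, -time) %>%
    # split Var at the literal ".." into Question and Construct
    extract(Var, c('Question', 'Construct'),
            '([^.]+)\\.\\.([^.]+)')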
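
In tidyr 1.0.0 and later, `gather()` is superseded by `pivot_longer()`, which can split the column names in the same step. The following is only a sketch, assuming the same `dat` as in the question and a tidyr version that provides `pivot_longer()`; the name `long` is just an illustrative choice, and `names_sep` is given as a regular expression matching the literal `..` separator.

```r
library(tidyr)
library(tibble)

# Same example data as in the question
set.seed(10)
dat <- tibble(
    id = c(1:5, 1:5),
    time = rep(c("pre", "post"), each = 5),
    Q1..Ask = sample(0:1, 10, TRUE),
    Q2..Ask = sample(0:1, 10, TRUE),
    Q1..Tell = sample(0:1, 10, TRUE),
    Q2..Tell = sample(0:1, 10, TRUE),
    Q1..Respond = sample(0:1, 10, TRUE),
    Q2..Respond = sample(0:1, 10, TRUE)
)

# Reshape every column except id and time; each column name is split
# at the literal ".." into a Question part and a Construct part
long <- pivot_longer(
    dat,
    cols = -c(id, time),
    names_to = c("Question", "Construct"),
    names_sep = "\\.\\.",
    values_to = "Score"
)

long
```

The rows come out grouped by the original row rather than by construct, so the order differs from the desired output in the question; sorting afterwards (for example with `dplyr::arrange()`) should give the same layout if the order matters.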
akrun