2

I have run a survey using Google Forms. I downloaded the response dataset as a spreadsheet, but unfortunately when it comes to multiple choice, multiple anwsers responses, the data looks something like this:

Q1   Q2        Q3
1   "A, B ,C"  S 
2   "C, D"     T
1   "A, C, E"  U
3   "D"        V
2   "B, E"     Z

I would like to have it in a form similar to the below:

Q1  Q2        Q2A Q2B Q2C Q2D Q2E   Q3
1   "A, B, C"  1   1   1   0   0     S
2   "C, D"     0   0   1   1   0     T
1   "A, C, E"  1   0   1   0   1     U
3   "D"        0   0   0   1   0     V
2   "B, E"     0   1   0   0   1     Z

Is there a clever way to do this? I have several multiple choice, multiple answers questions and more than 250 respondents, so I'd like to be able to do it easily.

Thanks in advance.

SiKiHe
  • 439
  • 6
  • 16
  • 2
    I just tried the solution in the question you marked my question as an "exact replica of". It doesn't work in my situation, so I guess you're wrong about it being an exact replica. Could you please unblock it so that I can get my problem solved? – SiKiHe May 25 '17 at 14:18

1 Answers1

2

Using dplyr and tidyr packages:

 dat %>% 
    separate(Q2, paste0("v", 1:5), remove=F) %>% 
    gather(q2, val, v1:v5) %>% 
    na.exclude %>% 
    mutate(val=paste0("Q2", val), q2=1) %>% 
    spread(val, q2) %>%
    select(Q1:Q2, Q2A:Q2E, Q3) %>%
    mutate_at(vars(Q2A:Q2E), .funs=funs(replace(., is.na(.), 0)))


  Q1      Q2 Q2A Q2B Q2C Q2D Q2E Q3
1  1 A, B ,C   1   1   1   0   0  S
2  1 A, C, E   1   0   1   0   1  U
3  2    B, E   0   1   0   0   1  Z
4  2    C, D   0   0   1   1   0  T
5  3       D   0   0   0   1   0  V

Input data:

dat <- structure(list(Q1 = c(1L, 2L, 1L, 3L, 2L), Q2 = structure(c(1L, 
4L, 2L, 5L, 3L), .Label = c("A, B ,C", "A, C, E", "B, E", "C, D", 
"D"), class = "factor"), Q3 = structure(1:5, .Label = c("S", 
"T", "U", "V", "Z"), class = "factor")), .Names = c("Q1", "Q2", 
"Q3"), class = "data.frame", row.names = c(NA, -5L))
Adam Quek
  • 6,973
  • 1
  • 17
  • 23
  • with this method, 5 observations (rows) become 11...something were wrong? – S.F. Yeh Jun 06 '22 at 14:50
  • @S.F.Yeh I'm not sure what you mean. I manage to reproduce the same output. I would however recommend you looking for newer answers with tidyverse methodology though. The functions I'd used (e.g. `gather` and `spread`) are no longer in development. – Adam Quek Jun 07 '22 at 01:12
  • oh, that's right. I tried to understand what each line can do by separating each step in the pipe, and some error happened. I'll figure it out by myself. Sorry for bothering you, Thank you! – S.F. Yeh Jun 07 '22 at 01:33