Let's start with:
df <- structure(list(user.id = c(2L, 3L, 1L, 3L, 1L, 4L), questions = structure(c(3L,
3L, 3L, 1L, 3L, 2L), .Label = c("Do you own an xbox?", "How many game consoles do you own?",
"which game did you buy recently?"), class = "factor"), answers = structure(c(2L,
5L, 3L, 6L, 4L, 1L), .Label = c("3", "DOOM", "Fallout 3", "Ghost Recon",
"Mario", "yes"), class = "factor")), .Names = c("user.id", "questions",
"answers"), row.names = c(NA, -6L), class = "data.frame")
This gives us the data.frame:
> df
  user.id                          questions     answers
1       2   which game did you buy recently?        DOOM
2       3   which game did you buy recently?       Mario
3       1   which game did you buy recently?   Fallout 3
4       3                Do you own an xbox?         yes
5       1   which game did you buy recently? Ghost Recon
6       4 How many game consoles do you own?           3
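I'd like to transform this into a wide data.frame (or equivalent) with one row per user and one column per question, where q_1, q_2, q_3 follow the sorted order of the questions: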
> matrixed
  user.id q_1 q_2         q_3
1       1         Ghost Recon
2       2                DOOM
3       3 yes           Mario
4       4       3
Right now I'm using this primitive piece of code:
questions <- sort(unique(df$questions))
user.id <- sort(unique(df$user.id))
matrixed <- data.frame(user.id)
# add one empty character column per question: q_1, q_2, ...
for (i in seq_along(questions)) matrixed[[paste0("q_", i)]] <- rep("", length(user.id))
# copy each answer into its cell; match() locates the row, so user ids need not be 1..n
for (j in seq_len(nrow(df))) {
  matrixed[match(df$user.id[j], user.id), paste0("q_", match(df$questions[j], questions))] <- as.character(df$answers[j])
}
Are there more elegant ways to do this -- perhaps libraries that help handle this type of data?
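For comparison, the same reshaping can probably be done without explicit loops by using matrix indexing in base R. This is only a minimal sketch under a few assumptions: it rebuilds `df` with `data.frame()` so that it runs on its own, the helper names `d`, `users`, `qs` and `m` are made up for illustration, and when a user answered the same question more than once it keeps the last answer (which is what the desired output above shows for user 1).

```r
# Rebuild the example data so this sketch runs on its own.
df <- data.frame(
  user.id   = c(2L, 3L, 1L, 3L, 1L, 4L),
  questions = factor(c("which game did you buy recently?",
                       "which game did you buy recently?",
                       "which game did you buy recently?",
                       "Do you own an xbox?",
                       "which game did you buy recently?",
                       "How many game consoles do you own?")),
  answers   = factor(c("DOOM", "Mario", "Fallout 3", "yes", "Ghost Recon", "3"))
)

# Assumption: if a user answered a question more than once, keep the last answer.
d <- df[!duplicated(df[c("user.id", "questions")], fromLast = TRUE), ]

users <- sort(unique(d$user.id))   # one row per user
qs    <- levels(d$questions)       # one column per question, in factor-level order

# Start from an empty character matrix, then fill every
# (user, question) cell in a single indexed assignment.
m <- matrix("", nrow = length(users), ncol = length(qs),
            dimnames = list(NULL, paste0("q_", seq_along(qs))))
m[cbind(match(d$user.id, users), as.integer(d$questions))] <- as.character(d$answers)

matrixed <- data.frame(user.id = users, m, stringsAsFactors = FALSE)
matrixed
```

Here `qs` also records which question each `q_i` column stands for, so the column names can be mapped back to the question text if needed. Note that `levels()` would also create an empty column for any unused factor level, which does not occur in this example.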