I need a data frame df_wide
with the following columns:
userID SAT GRE task_conf task_chall active_conf active_chall sleep_conf sleep_chall morn_conf morn_chall
30798 A 1400 2 3 5 2 6 1 4 2
30895 A 1200 6 2 5 3 5 2 5 3
32678 B 1000 5 3 6 3 6 2 5 2
34679 A 1300 4 3 4 2 6 1 6 3
35999 A 1400 2 2 2 2 2 2 2 2
Some information about the features:
The variables ending in '_conf' and '_chall' contain integer values between 1 and 6
'userID' can be a factor or an integer; it is an identifier, not a continuous number
SAT represents the grade of that 'userID'
GRE represents the score of that 'userID'
SAT and GRE always stay the same for a given 'userID'
My original data, df_long,
is currently in the following form:
userID SAT GRE action ConfChall vals
30798 A 1400 task conf 2
30798 A 1400 task chall 3
30798 A 1400 active conf 5
30798 A 1400 active chall 2
30798 A 1400 sleep conf 6
30798 A 1400 sleep chall 1
30798 A 1400 morn conf 4
30798 A 1400 morn chall 2
30895 A 1200 task conf 6
30895 A 1200 task chall 2
30895 A 1200 active conf 5
30895 A 1200 active chall 3
30895 A 1200 sleep conf 5
30895 A 1200 sleep chall 2
30895 A 1200 morn conf 5
30895 A 1200 morn chall 3
32678 B 1000 task conf 5
32678 B 1000 task chall 3
32678 B 1000 active conf 6
32678 B 1000 active chall 3
32678 B 1000 sleep conf 6
32678 B 1000 sleep chall 2
32678 B 1000 morn conf 5
32678 B 1000 morn chall 2
34679 A 1300 task conf 4
34679 A 1300 task chall 3
34679 A 1300 active conf 4
34679 A 1300 active chall 2
34679 A 1300 sleep conf 6
34679 A 1300 sleep chall 1
34679 A 1300 morn conf 6
34679 A 1300 morn chall 3
35999 A 1400 task conf 2
35999 A 1400 task chall 2
35999 A 1400 active conf 2
35999 A 1400 active chall 2
35999 A 1400 sleep conf 2
35999 A 1400 sleep chall 2
35999 A 1400 morn conf 2
35999 A 1400 morn chall 2
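For a reproducible example, the listing above can be rebuilt in R roughly as follows. This is only a sketch: the helper names users, actions and kinds are made up for illustration, and the column types (numeric userID and GRE, character SAT) are an assumption, since the real data may store them as factors or integers.

```r
# Assumed per-user attributes; SAT and GRE are constant within a userID
users <- data.frame(
  userID = c(30798, 30895, 32678, 34679, 35999),
  SAT    = c("A", "A", "B", "A", "A"),
  GRE    = c(1400, 1200, 1000, 1300, 1400),
  stringsAsFactors = FALSE
)
actions <- c("task", "active", "sleep", "morn")  # illustrative helper
kinds   <- c("conf", "chall")                    # illustrative helper

# One row per user x action x conf/chall, in the same order as the listing
df_long <- data.frame(
  userID    = rep(users$userID, each = 8),
  SAT       = rep(users$SAT, each = 8),
  GRE       = rep(users$GRE, each = 8),
  action    = rep(rep(actions, each = 2), times = 5),
  ConfChall = rep(kinds, times = 20),
  vals      = c(2, 3, 5, 2, 6, 1, 4, 2,
                6, 2, 5, 3, 5, 2, 5, 3,
                5, 3, 6, 3, 6, 2, 5, 2,
                4, 3, 4, 2, 6, 1, 6, 3,
                2, 2, 2, 2, 2, 2, 2, 2),
  stringsAsFactors = FALSE
)
```

Running head(df_long, 8) should reproduce the eight rows shown for userID 30798.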
I tried the following two calls, but the output is incorrect in both cases.
library(reshape2)
df_wide = recast(df_long, userID ~ c('action','confChall','vals'),
                 id.var = c("userID", "SAT", "GRE"))
df_wide = dcast(df_long, userID + SAT + GRE ~ c(action + ConfChall), value.var = "vals")
I tried to follow the example code from the following page, but I am having difficulty applying it to my problem. Any advice or suggestions would be greatly appreciated.
Reshape data from long to wide format - more than one variable
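One direction that might work, sketched here on a two-user subset of the data above: write the right-hand side of the dcast formula as plain column names (action + ConfChall, without c()), so that each action/ConfChall pair becomes its own column. To my understanding reshape2 joins such pairs with an underscore, which would give names like task_conf. The vector name wanted is made up for this illustration, and the column types are assumed; this is not a confirmed solution.

```r
library(reshape2)

# Two-user subset of df_long, rebuilt here so the sketch is self-contained
df_long <- data.frame(
  userID    = rep(c(30798, 30895), each = 8),
  SAT       = "A",
  GRE       = rep(c(1400, 1200), each = 8),
  action    = rep(rep(c("task", "active", "sleep", "morn"), each = 2), times = 2),
  ConfChall = rep(c("conf", "chall"), times = 8),
  vals      = c(2, 3, 5, 2, 6, 1, 4, 2,
                6, 2, 5, 3, 5, 2, 5, 3),
  stringsAsFactors = FALSE
)

# Id columns on the left, the two key columns on the right;
# each action/ConfChall combination becomes one output column
df_wide <- dcast(df_long, userID + SAT + GRE ~ action + ConfChall,
                 value.var = "vals")

# The resulting columns are not necessarily in the desired order,
# so select them explicitly (hypothetical helper vector)
wanted <- c("userID", "SAT", "GRE",
            "task_conf", "task_chall", "active_conf", "active_chall",
            "sleep_conf", "sleep_chall", "morn_conf", "morn_chall")
df_wide <- df_wide[, wanted]
```

If each userID/action/ConfChall combination occurs exactly once, as in the listing above, no aggregation function should be needed; with duplicates, dcast would fall back to counting rows and the values would no longer be the original scores.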