4

Here's the code to get a sample data set:

set.seed(0)
practice <- matrix(sample(1:100, 20), ncol = 2)
data <- as.data.frame(practice)
data <- cbind( lob = sprintf("objective%d", rep(1:2,each=5)), data)
data <- cbind( student = sprintf("student%d", rep(1:5,2)), data)
names(data) <- c("student", "learning objective","attempt", "score")
data[-8,]

The data looks like this:

    student learning objective attempt score
1  student1         objective1      90     6
2  student2         objective1      27    19
3  student3         objective1      37    16
4  student4         objective1      56    60
5  student5         objective1      88    34
6  student1         objective2      20    66
7  student2         objective2      85    42
9  student4         objective2      61    82
10 student5         objective2      58    31

What I want is:

    student       objective1         objective2 
                 attempt  score     attempt score
1  student1         90     6          20      66
2  student2         27    19          85      42
3  student3         ...                0       0
4  student4         ...                  ...
5  student5         ...                  ...

There are 70 learning objectives, so it's going to be tedious to just copy and paste the attempts and scores, so I wonder whether there's a better way to clean the data.

R: I was trying to use the melt function in R to get the new data, but it doesn't work well. There are missing scores for some of the students and the student name is not listed, for example student3 in this case, so I cannot just cbind the score.

Excel: There are 70 learning objectives, and because of missing names, I have to check all corresponding rows for all those 70 objectives for VLOOKUP:

(=VLOOKUP($C7,'0learning.csv'!$B$372:$G$395,5,0)
(=VLOOKUP($C7,'0learning.csv'!$B$372:$G$395,6,0)

Is there a better way?

honk
  • 9,137
  • 11
  • 75
  • 83
SongTianyang
  • 139
  • 3
  • 9

1 Answers1

4

We could use the devel version of data.table i.e. v1.9.5 which can take take multiple value.var columns and reshape the 'long' form to 'wide'. Instructions to install are here.

 library(data.table)#v1.9.5+
 names(data)[2] <- 'objective'
 dcast(setDT(data), student~objective, value.var=c('attempt', 'score'))
 #    student attempt_objective1 attempt_objective2 score_objective1
 #1: student1                 90                 20                6
 #2: student2                 27                 85               19
 #3: student3                 37                 96               16
 #4: student4                 56                 61               60
 #5: student5                 88                 58               34
 #    score_objective2
 #1:               66
 #2:               42
 #3:               87
 #4:               82
 #5:               31

Or using reshape from base R

 reshape(data, idvar='student', timevar='objective', direction='wide')
 #  student attempt.objective1 score.objective1 attempt.objective2
 #  1 student1                 90                6                 20
 #  2 student2                 27               19                 85
 #  3 student3                 37               16                 96
 #  4 student4                 56               60                 61
 #  5 student5                 88               34                 58
 #    score.objective2
 #  1               66
 #  2               42
 #  3               87
 #  4               82
 #  5               31
Arun
  • 116,683
  • 26
  • 284
  • 387
akrun
  • 874,273
  • 37
  • 540
  • 662
  • Thank you, but it seems there are errors for both of the two lines of codes.. 1:> names(data)[2] <- 'objective' Warning message:...2:> dcast(setDT(data), student~objective, value.var=c('attempt', 'score')) Error in .subset2(x, i, exact = exact) : subscript out of bounds – SongTianyang Jun 16 '15 at 16:59
  • @SongTianyang Are you using the devel version of `data.table`? – akrun Jun 16 '15 at 17:00
  • @SongTianyang I added a `base R` version, it should work if you dont have the devel version of data.table – akrun Jun 16 '15 at 17:02
  • Thank you! I downloaded the package, but the default version is 1.9.4. – SongTianyang Jun 16 '15 at 17:03
  • The reshape one works! Thank you so much! – SongTianyang Jun 16 '15 at 17:04
  • @SongTianyang In the CRAN, it is 1.9.4. If you look at the link I mentioned in the post, it gives you access to download and install the devel versin – akrun Jun 16 '15 at 17:04