2

I have a dataset of students' report card marks that range from D- to A+. I'd like to recode them onto a scale of 1-12 (i.e. D- = 1, D = 2 ... A = 11, A+ = 12). Right now I'm using the revalue function in plyr. I have several columns that I'd like to recode. Is there a shorter way to do this than running revalue on each column?

Some data:

student  <- c("StudentA","StudentB","StudentC","StudentD","StudentE","StudentF","StudentG","StudentH","StudentI","StudentJ")
read <- c("A", "A+", "B", "B-", "A", "C", "C+", "D", "C", "B+")
write <- c("A-", "B", "C", "C+", "D", "B-", "B", "C", "D+", "B")
math <- c("C", "C", "D+", "A", "A+", "B", "B-", "C-", "D+", "C")

df <- data.frame (student, read, write, math)

Right now I'm recoding them like this:

df$read.r <- as.numeric (revalue (df$read, c("D-" = "1",
                             "D" = "2",
                             "D+" = "3",
                             "C-" = "4",
                             "C" = "5",
                             "C+" = "6",
                             "B-" = "7",
                             "B" = "8",
                             "B+" = "9",
                             "A-" = "10",
                             "A" = "11",
                             "A+" = "12"
                             )))

Instead of running this 3 times (or more), is there a better way? All of the columns use the same set of values.

GregRousell

3 Answers

1

You can use match(). First, put all the grades in a vector ordered from worst to best:

marks <- c("D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+")

Then

df$read.mark <- match(df$read, marks)

To avoid writing it three times, either wrap it in a function or apply it column-wise with apply(), e.g. apply(df[2:4], 2, match, marks). This returns a matrix with one column of scores per grade column.
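A minimal sketch of the column-wise version, assuming the grade columns are named read, write and math as in the question. The `grade_cols` name and the `.r` suffix are illustrative choices, not part of the original answer, and lapply() is used here in place of apply() so the data frame is not coerced to a matrix.

```r
# Example data: the first three students from the question
df <- data.frame(
  student = c("StudentA", "StudentB", "StudentC"),
  read    = c("A", "A+", "B"),
  write   = c("A-", "B", "C"),
  math    = c("C", "C", "D+"),
  stringsAsFactors = FALSE
)

# Grades ordered from worst to best; a grade's position in this vector is its score
marks <- c("D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+")

# Hypothetical helper: names of the columns to recode
grade_cols <- c("read", "write", "math")

# lapply() returns one integer vector per column; store them in new ".r" columns
df[paste0(grade_cols, ".r")] <- lapply(df[grade_cols], match, table = marks)
```

Any grade that does not appear in `marks` comes back as NA, which makes typos in the data easy to spot.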

konvas
1
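# Work on the grade columns only (drop the student column)
df1 <- df[,-1]

# Flatten all grade columns, convert to a factor with levels ordered D- ... A+,
# and take the integer codes (1-12); df1[] keeps the data frame shape
df1[] <- as.numeric(factor(unlist(df[,-1]),
         levels=paste0(rep(LETTERS[4:1], each=3), c("-", "", "+"))))
# Append the recoded columns with an ".r" suffix
cbind(df, setNames(df1, paste(colnames(df1), "r", sep=".")))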
#       student read write math read.r write.r math.r
#1  StudentA    A    A-    C     11      10      5
#2  StudentB   A+     B    C     12       8      5
#3  StudentC    B     C   D+      8       5      3
#4  StudentD   B-    C+    A      7       6     11
#5  StudentE    A     D   A+     11       2     12
#6  StudentF    C    B-    B      5       7      8
#7  StudentG   C+     B   B-      6       8      7
#8  StudentH    D     C   C-      2       5      4
#9  StudentI    C    D+   D+      5       3      3
#10 StudentJ   B+     B    C      9       8      5
akrun
0

Not super efficient, but will at least solve your 3x problem:

df[,2:4] <- revalue(as.matrix(df[,2:4]), c("B"=9))
eddi
  • This doesn't work for me. Even with as.matrix, I get the error "Error in sort.list(y) : 'x' must be atomic for 'sort.list' Have you called 'sort' on a list?", which suggests using lapply with revalue? – dieHellste Nov 29 '17 at 09:04
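One way to follow the commenter's suggestion of recoding column by column with lapply() is sketched below. It swaps revalue() for a base R named lookup vector so that it runs without plyr. The names `lookup` and `grade_cols` are hypothetical, and the data are the first three students from the question.

```r
# Example data: the first three students from the question
df <- data.frame(
  student = c("StudentA", "StudentB", "StudentC"),
  read    = c("A", "A+", "B"),
  write   = c("A-", "B", "C"),
  math    = c("C", "C", "D+"),
  stringsAsFactors = FALSE
)

# Named vector: names are the grades, values are the scores 1..12
lookup <- setNames(1:12, c("D-", "D", "D+", "C-", "C", "C+",
                           "B-", "B", "B+", "A-", "A", "A+"))

# Hypothetical helper: names of the columns to recode
grade_cols <- c("read", "write", "math")

# Recode each column in place; as.character() guards against factor columns,
# which would otherwise index lookup by integer code instead of by label
df[grade_cols] <- lapply(df[grade_cols],
                         function(x) unname(lookup[as.character(x)]))
```

Because each column is passed to the function as a plain vector, nothing list-like reaches the lookup step. Note that this version overwrites the original grade columns rather than adding new ones.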