1

I'm new to R and I've done my best googling for the answer to the question below, but nothing has come up so far.

In Excel you can keep a specific column or row constant when using a reference by putting $ before the row number or column letter. This is handy when performing operations across many cells when all cells are referring to something in a single other cell. For example, take a dataset with grades in a course: Row 1 has the total number of points per class assignment (each column is an assignment), and Rows 2:31 are the raw scores for each of 30 students. In Excel, to calculate percentage correct, I take each student's score for that assignment and refer it to the first row, holding row constant in the reference so I can drag down and apply that operation to all 30 rows below Row 1. Most importantly, in Excel I can also drag right to do this across all columns, without having to type a new operation.

What is the most efficient way to perform this operation--holding a reference row constant while performing an operation to all other rows, then applying this across columns while still holding the reference row constant--in R? So far I had to slice the reference row to a new dataframe, remove that row from the original dataframe, then type one operation per column while manually going back to the new dataframe to look up the reference number to apply for that column's operation. See my super-tedious code below.

For reference, each column is an assignment, and Row 1 had the number of points possible for that assignment. All subsequent rows were individual students and their grades.

# Extract number of points possible (slice() comes from dplyr)
library(dplyr)
outof <- slice(grades, 1)

# Now remove that row (Row 1)
grades <- grades[-c(1),]

# Turn number correct into a percentage. The divisor
# is from the sliced Row 1, which I had to
# look up and type one-by-one. I'm hoping there is
# code to do this automatically in R.
grades$ExamFinal <- (grades$ExamFinal / 34) * 100
grades$Exam3 <- (grades$Exam3 / 26) * 100
grades$Exam4 <- (grades$Exam4 / 31) * 100
grades$q1.1 <- grades$q1.1 / 6
grades$q1.2 <- grades$q1.2 / 10
grades$q1.3 <- grades$q1.3 / 6
grades$q2.2 <- grades$q2.2 / 3
grades$q2.4 <- grades$q2.4 / 12
grades$q3.1 <- grades$q3.1 / 9
grades$q3.2 <- grades$q3.2 / 8
grades$q3.3 <- grades$q3.3 / 12
grades$q4.1 <- grades$q4.1 / 13
grades$q4.2 <- grades$q4.2 / 5
grades$q6.1 <- grades$q6.1 / 5
grades$q6.2 <- grades$q6.2 / 6
grades$q6.3 <- grades$q6.3 / 11
grades$q7.1 <- grades$q7.1 / 7
grades$q7.2 <- grades$q7.2 / 8
grades$q8.1 <- grades$q8.1 / 7
grades$q8.3 <- grades$q8.3 / 13
grades$q9.2 <- grades$q9.2 / 13
grades$q10.1 <- grades$q10.1 / 8
grades$q12.1 <- grades$q12.1 / 12
Christina
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Jul 12 '20 at 22:06

4 Answers

3

You can use `sweep`:

100*sweep(grades, 2, outof, "/")

#  ExamFinal  EXam3 EXam4
#1    100.00  76.92 32.26
#2     88.24  84.62 64.52
#3     29.41 100.00 96.77

Data:

grades
  ExamFinal EXam3 EXam4
1        34    20    10
2        30    22    20
3        10    26    30

outof
[1] 34 26 31

grades <- data.frame(ExamFinal=c(34,30,10),
                     EXam3=c(20,22,26),
                     EXam4=c(10,20,30))
outof <- c(34,26,31)
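
If `outof` comes from `slice(grades, 1)` as in the question, it is a one-row data frame rather than a plain vector. Here is a minimal sketch, assuming the same three-column example with the points-possible row still in place, that converts that row with `unlist()` before calling `sweep()`. The names `scores` and `pct` are made up for illustration.

```r
# Example data: row 1 holds the points possible, rows 2-4 are students
grades <- data.frame(ExamFinal = c(34, 34, 30, 10),
                     EXam3     = c(26, 20, 22, 26),
                     EXam4     = c(31, 10, 20, 30))

outof  <- grades[1, ]    # one-row data frame, like slice(grades, 1)
scores <- grades[-1, ]   # student rows only

# unlist() turns the one-row data frame into a numeric vector, so
# sweep() divides column k by the k-th element of that vector
pct <- round(100 * sweep(scores, 2, unlist(outof), "/"), 2)
pct
```

Rounding is optional here; it only makes the printed percentages easier to read.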
Edward
1

You can use mapply on the original grades dataframe (don't remove the first row) to divide rows by the first row. Then convert the result back to a dataframe.

as.data.frame(mapply("/", grades[2:31, ], grades[1, ]))
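
To see what this produces, here is a small sketch on made-up data (three students instead of 30, with hypothetical columns `q1` and `q2`). It uses `grades[-1, ]` rather than `grades[2:31, ]` so that the row count does not have to be typed in.

```r
# Hypothetical data: row 1 = points possible, rows 2-4 = three students
grades <- data.frame(q1 = c(6, 3, 6, 5),
                     q2 = c(10, 5, 8, 10))

# mapply() pairs column k of the student rows with column k of row 1,
# so every score is divided by the total for its own column
prop <- as.data.frame(mapply("/", grades[-1, ], grades[1, ]))
prop
```

The result holds proportions; multiply by 100 if percentages are wanted.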
neilfws
0

The easiest way is to use some type of loop. In this case I am using the sapply function to divide all of the elements in each column by the corresponding total score.

#Example data
outof<-data.frame(q1=c(3), q2=c(5))
grades<-data.frame(q1=c(1,2,3), q2=c(4,4, 5))

answermatrix <-sapply(1:ncol(grades), function(i) {
   #grades[,i]/outof[i]   #use this if "outof" is a vector
   grades[,i]/outof[ ,i]   
})
answermatrix
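
Looping over column positions returns a matrix without column names. A possible variation, sketched on the same example data, loops over the column names instead so the assignment labels are kept. The name `answer` is only illustrative.

```r
# Same example data as above
outof  <- data.frame(q1 = 3, q2 = 5)
grades <- data.frame(q1 = c(1, 2, 3), q2 = c(4, 4, 5))

# Iterating over names(grades) makes sapply() label each result column
answer <- sapply(names(grades), function(nm) grades[[nm]] / outof[[nm]])
answer <- as.data.frame(answer)
answer
```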
Dave2e
-1

A loop would probably be your best bet.

First, extract the maximum number of points possible, which is listed in the first row, then use that number to calculate the percentage for the remaining rows in each column:

    for (i in 1:ncol(df)) {
      a <- df[1, i] # this pulls the total points for column i into a
      j <- 2 # start at row 2 (the first student) for every column
      # then we compute using that number
      while (j <= nrow(df)) {
        b <- df[j, i] # gets the score for this student (row j) in column i
        df[j, i] <- (b / a) * 100 # replaces that row,column with that percentage
        j <- j + 1 # goes to next row
      }
    }

If you wrap this loop in a function, the data frame modified inside it isn't copied to the global environment. That can be handled by having the function assign the result globally, like so:

    # x: the data frame; y: the (unquoted) name you want the completed df to be called
    f1 <- function(x, y) {
      for (i in 1:ncol(x)) {
        a <- x[1, i] # total points possible for column i
        j <- 2 # restart at the first student row for each column
        while (j <= nrow(x)) {
          b <- x[j, i] # this student's raw score
          x[j, i] <- (b / a) * 100
          j <- j + 1
        }
      }
      arg_name <- deparse(substitute(y)) #gets argument name
      var_name <- paste(arg_name) #construct the name
      assign(var_name, x, envir = .GlobalEnv) #produces global dataframe
    }
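
A function can also simply return the modified data frame and leave the assignment to the caller, which avoids `assign()` altogether. A minimal sketch, assuming row 1 holds the points possible; the function name `to_percent` and the sample data are hypothetical:

```r
# to_percent is a hypothetical helper; row 1 of x holds the points possible
to_percent <- function(x) {
  for (i in seq_len(ncol(x))) {
    total <- x[1, i]                      # points possible for column i
    for (j in seq_len(nrow(x))[-1]) {     # every row except the first
      x[j, i] <- x[j, i] / total * 100    # raw score -> percentage
    }
  }
  x                                       # return the modified copy
}

df <- data.frame(q1 = c(6, 3, 6), q2 = c(10, 5, 8))
result <- to_percent(df)                  # the caller picks the name
result
```

Row 1 is left untouched in the result, so it still shows the points possible for each column.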
Justin Cocco
  • Yikes. These look like poor suggestions for someone who is new to R. Often when writing R code you don't need explicit loops and you really should avoid `get/assign`. Those are usually signs you're doing things in a very un-R-like way. – MrFlick Jul 13 '20 at 03:47
  • Sorry! I’ve only been coding for a month now so I don’t know how else to do it – Justin Cocco Jul 14 '20 at 19:46