Using R, I'm trying to find a more efficient way to calculate the difference between the largest value in a column and each value in that same column. I was able to do this, but the code looks bulky (I created a vector where each value is the max value of the column). I'm hoping someone can demonstrate a more efficient method, perhaps using the apply function?

a<-data.frame("Group Name"=c('Group 1','Group 2', 'Group 3', 'Group 4','Group 5', 'Group 6'),
          "app 1"=c(28,28,27,28,29,28),
          "app 2"=c(32,31,29,33,35,32),
          "app 3"=c(44,43,42,45,46,44),
          "app 4"=c(48,48,47,48,49,48),
          "app 5"=c(38,36,35,39,41,38),
          "app 6"=c(26,26,25,26,27,26))

a$Avg_score=apply(a[,-1],1,mean)
a$max_mean_diff<-c(max(a$Avg_score),max(a$Avg_score),max(a$Avg_score),
             max(a$Avg_score),max(a$Avg_score),max(a$Avg_score))-a$Avg_score
View(a)

1 Answer

You don't need apply here, since the rowMeans function returns the mean of every row. You can then subtract Avg_score from the max of Avg_score. You also don't need to repeat max(a$Avg_score) to make its length match a$Avg_score: R recycles the shorter object to match the length of the longer one.

a$Avg_score <- rowMeans(a[-1], na.rm = TRUE)
a$max_diff <- max(a$Avg_score) - a$Avg_score
a
#  Group.Name app.1 app.2 app.3 app.4 app.5 app.6 Avg_score max_diff
#1    Group 1    28    32    44    48    38    26  36.00000 1.833333
#2    Group 2    28    31    43    48    36    26  35.33333 2.500000
#3    Group 3    27    29    42    47    35    25  34.16667 3.666667
#4    Group 4    28    33    45    48    39    26  36.50000 1.333333
#5    Group 5    29    35    46    49    41    27  37.83333 0.000000
#6    Group 6    28    32    44    48    38    26  36.00000 1.833333
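
To see the recycling behaviour in isolation, here is a minimal sketch comparing the explicit "repeat the max" approach with the recycled one. The vector avg_score and its values are made up for illustration (a stand-in for a$Avg_score), not taken from the data above.

```r
# Illustrative values only; a hypothetical stand-in for a$Avg_score
avg_score <- c(36, 35.5, 34, 36.5, 38, 36)

# Explicit version: repeat the maximum to the full length before subtracting
explicit <- rep(max(avg_score), length(avg_score)) - avg_score

# Recycled version: the length-1 maximum is reused for every element
recycled <- max(avg_score) - avg_score

# Both approaches give the same vector
identical(explicit, recycled)
# [1] TRUE
```

Recycling also applies when the shorter vector has more than one element, which can hide mistakes, so it is safest to rely on it when one operand has length 1, as it does here.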
Ronak Shah