
I have a data frame of student IDs and the scores they obtained on various tests. Some students took the same test more than once, and I would like to keep only the highest score for each student and test.

For example, I would like to turn

Student Subject Score
1 Math 96
1 Math 97
1 English 82
2 Math 85
2 English 72
2 English 75

into

Student Subject Score
1 Math 97
1 English 82
2 Math 85
2 English 75

I have tried

df[!duplicated(df[,c(1,2)]),]

but that just keeps the first of the repeated observations. How could I adjust this to keep the maximum?

Remy M
  • Are there other columns you might need to keep as well? You could just use `dplyr`'s group by if not: `library(dplyr); df %>% group_by(Student, Subject) %>% summarize(Score = max(Score))` – zack Jul 18 '18 at 18:39
  • `aggregate(Score ~ Student + Subject, data, max)`. – Rui Barradas Jul 18 '18 at 18:41
  • I don't have other columns I need to keep but I might in the future. Thank you! – Remy M Jul 18 '18 at 18:44
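
The suggestions in the comments can be tried on the example data with a short base R sketch. It rebuilds the question's data frame as `df`, then shows the `aggregate()` call from the comments next to a variation on the original `duplicated()` attempt that sorts by score first. The names `best_agg`, `ordered`, and `best_dup` are made up for illustration, and the sort-then-deduplicate variant is one possible adjustment, not the only one.

```r
# Example data from the question.
df <- data.frame(
  Student = c(1, 1, 1, 2, 2, 2),
  Subject = c("Math", "Math", "English", "Math", "English", "English"),
  Score   = c(96, 97, 82, 85, 72, 75)
)

# Option 1: one row per Student/Subject pair, holding the maximum Score.
# Only the columns named in the formula appear in the result.
best_agg <- aggregate(Score ~ Student + Subject, data = df, FUN = max)

# Option 2: sort so the highest Score comes first within each pair,
# then drop the later duplicates. Whole rows are kept, so any extra
# columns added to df later would survive as well.
ordered  <- df[order(df$Student, df$Subject, -df$Score), ]
best_dup <- ordered[!duplicated(ordered[, c("Student", "Subject")]), ]
```

Both results contain the same four Student/Subject/Score combinations as the desired output, though the row order may differ from the table in the question. Option 2 is probably the better fit if other columns need to be carried along.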
