I have a data frame that has both a list of scores and names corresponding to the scores. Some individuals appear more than once, but I only want to take the best score from each individual. A sample of the data frame is provided below.

      V1     Names
1574  98.76  Lebron James
1587  98.33  Lebron James
1588  97.32  Lebron James
1713  65.97  Dwyane Wade
1730  100.4  Chris Paul
1734  98.38  Chris Paul

So, in the final form of my data frame, all rows would be deleted except for rows 1574, 1713, and 1730 (keeping the highest score for each individual). What is the best way to go about coding a problem like this one?

missuse
K.M

1 Answer

An approach using tidyverse would be:

library(tidyverse)

df %>%
  group_by(Names) %>%
  summarise(maxd = max(V1))

After grouping by the Names variable, summarise each group by applying the function max to V1, and store the result in a new variable called maxd. Note that the result contains only Names and maxd, one row per individual.
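If the whole row should be kept rather than just the maximum, one possible variation is slice_max() from dplyr (available since dplyr 1.0.0). This is only a sketch: the data frame df below is rebuilt by hand from the sample in the question, and with_ties = FALSE is an assumption that a single row per individual is wanted even when scores tie.

```r
library(dplyr)

# Sample data rebuilt from the question (assumed to match the real df)
df <- data.frame(
  V1 = c(98.76, 98.33, 97.32, 65.97, 100.4, 98.38),
  Names = c("Lebron James", "Lebron James", "Lebron James",
            "Dwyane Wade", "Chris Paul", "Chris Paul")
)

best <- df %>%
  group_by(Names) %>%
  # keep the single row with the largest V1 in each group
  slice_max(V1, n = 1, with_ties = FALSE) %>%
  ungroup()

best
```

Unlike summarise(), this keeps every column of the original row, which matters if df has more columns than the two shown.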

In base R:

aggregate(V1 ~ Names, data = df, max)
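aggregate() also returns only Names and the maximum V1. To delete rows from the original data frame instead, as the question describes, a common base R idiom is to sort by name and descending score and then drop repeated names. The sketch below assumes the sample data from the question, with the row labels stored as row names.

```r
# Sample data rebuilt from the question, keeping the original row labels
df <- data.frame(
  V1 = c(98.76, 98.33, 97.32, 65.97, 100.4, 98.38),
  Names = c("Lebron James", "Lebron James", "Lebron James",
            "Dwyane Wade", "Chris Paul", "Chris Paul"),
  row.names = c("1574", "1587", "1588", "1713", "1730", "1734")
)

# Sort so that each individual's highest score comes first
ordered <- df[order(df$Names, -df$V1), ]

# duplicated() flags every occurrence of a name after the first,
# so negating it keeps only the top-scoring row per individual
best <- ordered[!duplicated(ordered$Names), ]

best
```

For the sample data this leaves rows 1574, 1713, and 1730. If scores tie, only the first of the tied rows is kept.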
missuse