0
aggregate(df$count, list(df$sport, df$color), sum)

Aggregate returns the below data frame, but I just want to return the maximum count for each color:

Sport      Color     Count
Baseball   Blue      5
Football   Blue      10
Basketball Blue      7
Baseball   Red       6
Football   Red       9
Basketball Red       13

I would want the code to return:

Sport      Color     Count
Football   Blue      10
Basketball Red       13

Because Football has the highest count in blue group and basketball has highest count in red group.

bschaible
  • 11
  • 2

2 Answers2

1

Maybe you can try subset + ave

> subset(df,ave(Count,Color,FUN = max)==Count)
       Sport Color Count
2   Football  Blue    10
6 Basketball   Red    13

Data

structure(list(Sport = c("Baseball", "Football", "Basketball", 
"Baseball", "Football", "Basketball"), Color = c("Blue", "Blue",
"Blue", "Red", "Red", "Red"), Count = c(5L, 10L, 7L, 6L, 9L,
13L)), class = "data.frame", row.names = c(NA, -6L))
ThomasIsCoding
  • 96,636
  • 9
  • 24
  • 81
1

Another base alternative:

do.call(rbind, by(df, df$Color, function(z) z[which.max(z$Count),]))
#           Sport Color Count
# Blue   Football  Blue    10
# Red  Basketball   Red    13

Or a tidy version:

library(dplyr)
df %>%
  group_by(Color) %>%
  slice(which.max(Count)) %>%
  ungroup()
# # A tibble: 2 x 3
#   Sport      Color Count
#   <chr>      <chr> <int>
# 1 Football   Blue     10
# 2 Basketball Red      13
r2evans
  • 141,215
  • 6
  • 77
  • 149