
I have a data frame with a grouping column, a value column, and a second categorical column. I need to find the max value within each group (group) and find out which value of the second column (dist) that max corresponds to.

# example
df<-data.frame(group = rep(c("a", "b"), each = 5),
               val = 1:10,
               dist = rep(c("NR", "b1"), 5))


> df
   group val dist
1      a   1   NR
2      a   2   b1
3      a   3   NR
4      a   4   b1
5      a   5   NR
6      b   6   b1
7      b   7   NR
8      b   8   b1
9      b   9   NR
10     b  10   b1

I can get the max values by group:

aggregate(val ~ group, df, max)

  group val
1     a   5
2     b  10

or by tapply:

tapply(df$val, df$group, max)

but I also need to know which "dist" each max belongs to. The desired output is:

  group val dist
1     a   5   NR
2     b  10   b1

How can I accomplish this?

maycca

2 Answers


We can slice the row that has the max 'val' for each 'group':

library(dplyr)
df %>%
  group_by(group) %>%
  slice(which.max(val))

which.max returns only the first maximum, so if there are ties for the max value, compare 'val' with the group max and filter the rows instead; this keeps every tied row:

df %>%
  group_by(group) %>%
  filter(val == max(val))
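
To make the tie behaviour concrete, here is a minimal base R sketch of the same two ideas applied to a single group. The data frame df_ties and the variable names are made up for illustration and are not part of the question:

```r
# Hypothetical data with a tie for the max in group "a"
df_ties <- data.frame(group = c("a", "a", "a", "b", "b"),
                      val   = c(1, 5, 5, 2, 7),
                      dist  = c("NR", "b1", "NR", "b1", "NR"))

# Look at group "a" only
grp_a <- df_ties[df_ties$group == "a", ]

# which.max() gives the index of the first maximum only
first_only <- grp_a[which.max(grp_a$val), ]

# comparing against max() keeps every row that reaches the maximum
all_ties <- grp_a[grp_a$val == max(grp_a$val), ]

nrow(first_only)  # 1
nrow(all_ties)    # 2
```

Which variant is appropriate depends on whether exactly one row per group is wanted or all rows that share the max.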

Or with ave from base R

df[with(df, val == ave(val, group, FUN = max)), ]
#    group val dist
#5      a   5   NR
#10     b  10   b1
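
A short sketch of why the ave line works, using the data frame from the question: ave returns a vector as long as 'val' in which every element is replaced by the max of its own group, so comparing it with 'val' marks the rows that hold a group max. The intermediate names group_max and keep are introduced here only for illustration:

```r
df <- data.frame(group = rep(c("a", "b"), each = 5),
                 val = 1:10,
                 dist = rep(c("NR", "b1"), 5))

# One entry per row: the max of that row's group
group_max <- ave(df$val, df$group, FUN = max)
group_max
# 5 5 5 5 5 10 10 10 10 10

# TRUE where the row's value equals its group max
keep <- df$val == group_max
which(keep)
# 5 10

df[keep, ]
```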
akrun
df<-data.frame(group = rep(c("a", "b"), each = 5),
               val = 1:10,
               dist = rep(c("NR", "b1"), 5))

# Split into one data frame per group, keep the max-'val' row(s) of each, then recombine
df1 <- split(df, df$group)
df2 <- lapply(df1, function(i) i[which(i$val == max(i$val)), ])
df3 <- do.call(rbind, df2)
df3
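
One thing to be aware of with this approach: because the list passed to rbind is named by group, the row names of the combined result are derived from those names rather than being a plain 1, 2, ... sequence. If that matters, they can be reset. A minimal sketch, repeating the steps above so it runs on its own:

```r
df <- data.frame(group = rep(c("a", "b"), each = 5),
                 val = 1:10,
                 dist = rep(c("NR", "b1"), 5))

df1 <- split(df, df$group)
df2 <- lapply(df1, function(i) i[which(i$val == max(i$val)), ])
df3 <- do.call(rbind, df2)

# Drop the group-derived row names and fall back to 1, 2, ...
rownames(df3) <- NULL
df3
```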