R Subset data.frame from max value of one vector and grouped by another

Question

ID <- c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C')
WK <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5)
NumSuccess <- c(0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 3)
Data <- data.frame(ID, WK, NumSuccess)

I am trying to create a subset data.frame "Data2" that keeps, for each "ID", the row with the maximum value of "WK", so that I get the "NumSuccess" value corresponding to that week. The resulting data.frame should look like this:

ID <- c('A', 'B', 'C')
WK <- c(3, 3, 5)
NumSuccess <- c(2, 1, 3)
Data2 <- data.frame(ID, WK, NumSuccess)

akrun · Accepted Answer · 2015-04-12T10:32:40.783

This can be done in more than one way. If 'WK' can be tied at its maximum within an 'ID' and you want all the rows at that maximum, group by 'ID' and filter with the logical condition WK == max(WK).

library(dplyr)
Data %>% 
      group_by(ID) %>% 
      filter(WK==max(WK))
#   ID WK NumSuccess
#1  A  3          2
#2  B  3          1
#3  C  5          3

If there is a single maximum 'WK' per 'ID', we can use which.max, which returns the position of the first maximum, or use arrange to sort the dataset by 'WK' in descending order and take the first row for each 'ID'.

Data %>%
      group_by(ID) %>%
      slice(which.max(WK))
#   ID WK NumSuccess
#1  A  3          2
#2  B  3          1
#3  C  5          3

Data %>%
      group_by(ID) %>%
      arrange(desc(WK)) %>%
      slice(1)
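
The difference between the filter and which.max approaches only shows up when a group has ties. The following is a minimal sketch of that difference, assuming a small made-up data.frame named Tied (not part of the question's data) in which 'A' reaches its maximum 'WK' twice:

```r
library(dplyr)

# Hypothetical data with a tie: ID 'A' has WK == 3 on two rows
Tied <- data.frame(ID = c('A', 'A', 'A', 'B', 'B'),
                   WK = c(1, 3, 3, 1, 2),
                   NumSuccess = c(0, 2, 5, 0, 1))

# filter() keeps every row that attains the group maximum
all_max <- Tied %>%
      group_by(ID) %>%
      filter(WK == max(WK)) %>%
      ungroup()

# slice(which.max()) keeps only the first such row per group
first_max <- Tied %>%
      group_by(ID) %>%
      slice(which.max(WK)) %>%
      ungroup()

nrow(all_max)    # 3 (two rows for 'A', one for 'B')
nrow(first_max)  # 2 (one row per ID)
```

Which of the two is appropriate depends on whether tied rows should all be kept or only one row per 'ID' is wanted.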

Similar approaches in data.table are

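library(data.table)
# setDT() converts Data to a data.table by reference
setDT(Data)
Data[, .SD[WK == max(WK)], by = ID]     # all rows at the maximum 'WK' (keeps ties)
Data[, .SD[which.max(WK)], by = ID]     # first row at the maximum 'WK'
setkey(Data, WK)[, .SD[.N], by = ID]    # sort by 'WK', then take the last row per 'ID'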

Or we can use ave from base R

 Data[with(Data, ave(WK, ID, FUN=max)==WK),]
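
To see why the ave one-liner works, it may help to break it into steps. This is only an illustrative sketch on the question's data; the intermediate names grp_max and keep are introduced here and are not part of the original answer:

```r
# Data from the question
Data <- data.frame(ID = c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'),
                   WK = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5),
                   NumSuccess = c(0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 3))

# ave() returns a vector as long as WK, holding each row's group maximum
grp_max <- ave(Data$WK, Data$ID, FUN = max)
grp_max
# 3 3 3 3 3 3 5 5 5 5 5

# Rows whose WK equals their group's maximum are kept (ties included)
keep <- Data$WK == grp_max
Data2 <- Data[keep, ]
rownames(Data2) <- NULL   # drop the original row names (3, 6, 11)
Data2
#   ID WK NumSuccess
# 1  A  3          2
# 2  B  3          1
# 3  C  5          3
```

Like the filter version, this keeps all tied rows, because the comparison is made row by row against the group maximum.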

@user3594490 No problem. Glad that it works for you. – akrun, Apr 12 '15 at 15:06
