How to group by and get the value for column Y having X max?

Question

I have a use case that I have not come across before. I have the following data frame and would like to select the values of "y" where "x" achieves its minimum and maximum, respectively, for each level of the condition "i".

> library(plyr)
> df <- data.frame(i=c(1,1,2,2),x=c(1.0,2.0,3.0,4.0),y=c('a','b','c','d'))
> ddply(df, .(i), summarise, Min=min(x), Max=max(x))
  i Min Max
  1   1   2
  2   3   4

This is correct, but I'd like to have instead the y whose x is the Min or the Max:

  i Min Max
  1   a   b
  2   c   d

How can I do that?

score 4 · Answer 1 · answered Dec 29 '17 at 16:30

We can use slice to keep, within each group, the row where x is smallest (or largest):

library(dplyr)
df %>% 
   group_by(i) %>% 
   slice(which.min(x)) %>%
   #or
   #slice(which.max(x)) %>%
   select(-x)
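
The slice approach returns one extreme at a time. If both are wanted side by side, as in the desired output of the question, a minimal sketch with summarise might look like the following. The column names Min and Max are taken from the question, and res is only an illustrative name; note that which.min and which.max return the first matching position, so ties within a group resolve to the first row.

```r
library(dplyr)

# Example data from the question
df <- data.frame(i = c(1, 1, 2, 2),
                 x = c(1.0, 2.0, 3.0, 4.0),
                 y = c("a", "b", "c", "d"))

# For each level of i, pick the y at the position of the smallest
# and of the largest x
res <- df %>%
  group_by(i) %>%
  summarise(Min = y[which.min(x)],
            Max = y[which.max(x)])

print(res)
```

This keeps one row per level of i, with Min and Max holding values of y rather than values of x.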

akrun

score 3 · Answer 2 · answered Dec 29 '17 at 16:32

Another option, if you are willing to go outside of the tidyverse, is data.table:

library(data.table)
setDT(df)[, list(min = y[which.min(x)],
                 max = y[which.max(x)]), by = i]

#   i min max
#1: 1   a   b
#2: 2   c   d

Mike H.

score 3 · Accepted Answer · answered Dec 29 '17 at 16:33

     library(plyr)
     df <- data.frame(i=c(1,1,2,2),x=c(1.0,2.0,3.0,4.0),y=c('a','b','c','d'))
     ddply(df, .(i), summarise, Min=y[which.min(x)], Max=y[which.max(x)])

jrlewi

I liked this one because it is the easiest/closest to my OP use-case in terms of dependency and simplicity. – SkyWalker Dec 29 '17 at 18:31

score 1 · Answer 4 · answered Dec 29 '17 at 17:00

A solution in base R:

output <- by(df, df[, "i"], with, {
  data.frame(i=i[1], min=y[which.min(x)], max=y[which.max(x)])
})

Gives

> output
df[, "i"]: 1
  i min max
1 1   a   b
------------------------------------------------------------ 
df[, "i"]: 2
  i min max
1 2   c   d

(The data.frame call is necessary to preserve the factor structure of "y", I believe.)

The output can be concatenated with do.call(rbind, output):

> do.call(rbind, output)
  i min max
1 1   a   b
2 2   c   d
