Here is an example of what I want:

library(data.table)
set.seed(123)
data <- data.frame(X = rep(letters[1:3], each = 4), Y = sample(1:12, 12), Z = sample(1:100, 12))
setDT(data)

For each value of X, I would like to select the row with the minimum Y and the row with the next-smallest Y.

Desired output

>data
a 4 68
a 5 11
b 1 4
b 10 89
c 2 64
c 3 82

Selecting the row with the minimum value per group is already answered in the post "How to select rows by group with the minimum value and containing NAs in R":

data[, .SD[which.min(Y)], by=X]

But how can I select both the minimum and the next-smallest value in each group?

user2380782
  • Assuming your data frame is a data table, what about `data[rank(Y) %in% 1:2 , ]` or, for a regular data frame `data[rank(data$Y) %in% 1:2, ]`? – eipi10 Jun 14 '16 at 16:39
  • Thanks @eipi10, I have used `data[, .SD[rank(Y) %in% 1:2] , by=X]`, and it worked. If you answer my question I'll give you the credit :-) – user2380782 Jun 14 '16 at 16:47
  • Ah, sorry, I missed the fact that you were also grouping by `X`. Feel free to answer the question yourself. There's no problem with answering your own question. – eipi10 Jun 14 '16 at 16:53
  • You deserve it, please write a brief answer :-) – user2380782 Jun 14 '16 at 17:08

1 Answer

For the ungrouped case with a data.table, you can do:

data[rank(Y) %in% 1:2, ]

For the grouped case, you can do:

data[ , .SD[rank(Y) %in% 1:2] , by=X]
   X  Y  Z
1: a  4 68
2: a  5 11
3: b  1  4
4: b 10 89
5: c  3 82
6: c  2 64
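
One caveat worth sketching: `rank()` defaults to `ties.method = "average"`, so when two rows in a group tie for the smallest `Y`, neither gets rank 1 or 2 and the filter can silently drop them. The toy table `dt` below is hypothetical (it is not the data from the question) and only illustrates the effect, along with two possible workarounds: breaking ties by position, or sorting and taking the first two rows per group.

```r
library(data.table)

# Hypothetical toy data: group "a" has a tie for the minimum Y
dt <- data.table(X = rep(c("a", "b"), each = 3),
                 Y = c(1, 1, 3, 5, 4, 6),
                 Z = 1:6)

# With the default ties.method = "average", tied values share a fractional rank
rank(c(1, 1, 3))
# 1.5 1.5 3.0

# So no row of group "a" has rank 1 or 2, and the group drops out of the result
res_avg <- dt[, .SD[rank(Y) %in% 1:2], by = X]

# Breaking ties by position gives integer ranks 1, 2, 3 within each group
res_first <- dt[, .SD[rank(Y, ties.method = "first") %in% 1:2], by = X]

# Alternative: sort by Y, then keep the first two rows of each group
res_head <- dt[order(Y), head(.SD, 2), by = X]
```

Which variant fits depends on how ties should be treated; if all rows tied at the two smallest values are wanted, `ties.method = "min"` may be closer to the intent.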
eipi10
  • Shameless plug for my eponym: data.table also has a `frank()` function. Here's the standard reference for the grouped case: http://stackoverflow.com/a/16574176/ – Frank Jun 14 '16 at 18:49
  • @Frank, I didn't realize you pronounced your name "eff-rank". – eipi10 Jun 14 '16 at 19:35
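
For readers who want to try the `frank()` suggestion from the comment above, here is a minimal sketch. The table `dt` is hypothetical, and the choice of `ties.method` is an assumption about how ties should be handled rather than something stated in the thread.

```r
library(data.table)

# Hypothetical toy data: group "b" has two rows tied at Y = 4
dt <- data.table(X = rep(c("a", "b"), each = 3),
                 Y = c(7, 2, 9, 4, 4, 1),
                 Z = 1:6)

# frank() is data.table's ranking function; "first" breaks ties by position,
# so exactly two rows are kept per group
res_first <- dt[, .SD[frank(Y, ties.method = "first") %in% 1:2], by = X]

# "dense" ranks distinct values 1, 2, 3, ..., so every row holding one of the
# two smallest distinct Y values in its group is kept
res_dense <- dt[, .SD[frank(Y, ties.method = "dense") %in% 1:2], by = X]
```

With `"first"` the result always has at most two rows per group; with `"dense"` a group can return more than two rows when values are tied.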