Select the 3 rows with the smallest values in column p

Question

I have a data frame df. I would like to select the 3 rows with the smallest values in column p.

df

     p      b
as   0.6    ab
yu   0.3    bc
hy   0.05   ak
get  0.7    ka

result

     p      b
as   0.6    ab
yu   0.3    bc
hy   0.05   ak

r2evans · Accepted Answer · 2020-07-24T17:48:57.320

Two approaches:

df[df$p <= sort(df$p)[3],]
#       p  b
# as 0.60 ab
# yu 0.30 bc
# hy 0.05 ak

One problem with this is that when there are ties for third place in p, you will get more than 3 rows. It also breaks when there are fewer than 3 rows: sort(df$p)[3] is then NA, so every comparison is NA and the result consists of all-NA rows.
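Both caveats can be reproduced with a small sketch. The data frames df_tie and df_small are made up for illustration (the question's df plus one extra row tied for third place, and a two-row subset); they are not part of the original question.

```r
# Hypothetical data: the question's df plus a row "new" tied for third place
df_tie <- data.frame(
  p = c(0.6, 0.3, 0.05, 0.7, 0.6),
  b = c("ab", "bc", "ak", "ka", "zz"),
  row.names = c("as", "yu", "hy", "get", "new")
)

# Ties for third: the threshold sort(p)[3] is 0.6 and two rows equal it,
# so more than 3 rows come back
tied <- df_tie[df_tie$p <= sort(df_tie$p)[3], ]
nrow(tied)

# Fewer than 3 rows: sort(p)[3] is NA, so the logical index is all NA
df_small <- df_tie[c("as", "yu"), ]
small <- df_small[df_small$p <= sort(df_small$p)[3], ]
small
```

Indexing a data frame with an NA logical returns a row of NAs for each NA, which is why the second result is not simply empty.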

Another approach, if you don't care about the order:

head(df[order(df$p),], n = 3)

which has the advantage that it always returns min(3, nrow(df)) rows. One problem with this is that it will not tell you that there is a tie; it just caps the number of rows.

(One could undo the re-ordering by adding a column that records the original row order, then sorting on that column after the head.)
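A minimal sketch of that idea, assuming the df from the question; the helper column name orig_pos is my own choice, not anything from the original answer.

```r
df <- data.frame(
  p = c(0.6, 0.3, 0.05, 0.7),
  b = c("ab", "bc", "ak", "ka"),
  row.names = c("as", "yu", "hy", "get")
)

# Work on a copy and record each row's original position
# (orig_pos is a hypothetical helper column)
tmp <- df
tmp$orig_pos <- seq_len(nrow(tmp))

# Take the 3 smallest p values; this reorders the rows by p
smallest <- head(tmp[order(tmp$p), ], n = 3)

# Sort back on the helper column, then drop it
smallest <- smallest[order(smallest$orig_pos), ]
smallest$orig_pos <- NULL
smallest

# Same result without a helper column: sort the selected row indices
smallest2 <- df[sort(head(order(df$p), n = 3)), ]
```

The index-based variant at the end is probably simpler if nothing else needs the helper column.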

It's up to you which approach makes more sense.

Edit: an option that preserves order:

df[ rank(df$p) < 4,]

(inspired by @NotThatKindODr's suggested use of the ordered row_number() %in% 1:3)
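One thing to watch with rank(): its default ties.method is "average", so tied values can get fractional or shifted ranks and the row count may differ from 3 in either direction. A sketch with a made-up three-way tie for third place (df_tie3 is hypothetical):

```r
# Hypothetical data: three rows tied at p = 0.6 for third place
df_tie3 <- data.frame(
  p = c(0.6, 0.3, 0.05, 0.7, 0.6, 0.6),
  b = c("ab", "bc", "ak", "ka", "zz", "yy"),
  row.names = c("as", "yu", "hy", "get", "new1", "new2")
)

# Default "average": the three tied rows all get rank 4, so none pass < 4
by_average <- df_tie3[rank(df_tie3$p) < 4, ]

# "first": ties are broken by position, so exactly 3 rows are kept
by_first <- df_tie3[rank(df_tie3$p, ties.method = "first") < 4, ]

# "min": every tied row gets rank 3, so all ties are kept
by_min <- df_tie3[rank(df_tie3$p, ties.method = "min") < 4, ]

c(nrow(by_average), nrow(by_first), nrow(by_min))
```

Which ties.method fits depends on whether you want exactly 3 rows or all rows tied with the third smallest.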

nniloc · Answer 2 · 2021-09-18T02:44:57.363

Another option, using dplyr::slice_min():

library(dplyr)

df %>% slice_min(p, n = 3)
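Note that slice_min() keeps ties by default (with_ties = TRUE), so it can return more than n rows. A small sketch, using a made-up df_tie with a tie for third place, of how with_ties = FALSE caps the result at exactly n:

```r
library(dplyr)

# Hypothetical data: the question's values plus one row tied for third place
df_tie <- data.frame(
  p = c(0.6, 0.3, 0.05, 0.7, 0.6),
  b = c("ab", "bc", "ak", "ka", "zz")
)

# Default: rows tied with the third smallest value are all kept
kept_ties <- df_tie %>% slice_min(p, n = 3)

# with_ties = FALSE: exactly n rows are returned
exact <- df_tie %>% slice_min(p, n = 3, with_ties = FALSE)

c(nrow(kept_ties), nrow(exact))
```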

edited Sep 18 '21 at 02:44

answered Jul 24 '20 at 17:36


score 0 · Answer 3 · answered Jul 24 '20 at 17:30

You can sort your data on p and then filter for the row numbers in 1:n (here 1:3):

library(tidyverse)
df %>% 
  arrange(p) %>% 
  filter(row_number() %in% 1:3)

answered Jul 24 '20 at 17:30

NotThatKindODr

