Subset a dataset to leave the largest 2 values

Question

I have a data set:

How can I subset it so as to "leave" the top 2 col1 values? My desired output is this:

col1  col2
A     3
A     3
A     3
D     5
A     3

I have viewed this question, but it didn't answer my question.

*"subset it so as to "leave" the top 2 col1 values"* is confusing me. Do you mean "keep the rows that have the top 2 `col2` values?" — Gregor Thomas, Nov 17 '20 at 17:19

score 2 · Accepted Answer · answered Nov 17 '20 at 17:20

Try this, though I'm not sure why you only have one D:

# Keep the rows whose col2 is one of the two largest distinct values
newdf <- df[df$col2 %in% sort(unique(df$col2), decreasing = TRUE)[1:2], ]
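
The question's input data was not preserved, so here is a minimal sketch on a made-up data frame (the values in `df` are an assumption, chosen only for illustration) showing what the one-liner keeps:

```r
# Hypothetical data; the original data set from the question is not shown
df <- data.frame(
  col1 = c("A", "B", "A", "C", "D", "D"),
  col2 = c(3, 1, 3, 2, 5, 5),
  stringsAsFactors = FALSE
)

# The two largest distinct values of col2
top2 <- sort(unique(df$col2), decreasing = TRUE)[1:2]

# Keep every row whose col2 equals one of those values
newdf <- df[df$col2 %in% top2, ]
print(newdf)
```

Every row that matches one of the top values is kept, so with this sample data both `D` rows survive alongside the two `A` rows.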

Duck


score 2 · Answer 2 · answered Nov 17 '20 at 17:21

I assume that your data is in a data.frame.

First, you need the top 2 values of col2. To get them, take the unique values of col2, sort them in decreasing order, and keep the first two elements:

# Distinct values of col2
col2Values <- unique(df$col2)
# The two largest of them
top2Elements <- sort(col2Values, decreasing = TRUE)[1:2]

Now that you know the top 2 values, you just need to find the rows where they appear in col2. This can be done via:

df[df$col2 %in% top2Elements,]

Update: It should work now; I had some typos in there.
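
The two steps above could also be wrapped in a small function. This is only a sketch: `keep_top_n` and its arguments are hypothetical names, and the sample data is made up. It uses `head()` instead of `[1:2]` because indexing past the end of a vector pads it with `NA`, which `%in%` would then match against any `NA` rows in the column.

```r
# Hypothetical helper (not from the answers above): keep rows whose value in
# `column` is among the n largest distinct values.
keep_top_n <- function(data, column, n = 2) {
  # sort() drops NA by default; head() returns at most n values, so there is
  # no NA padding when fewer than n distinct values exist
  top_values <- head(sort(unique(data[[column]]), decreasing = TRUE), n)
  data[data[[column]] %in% top_values, , drop = FALSE]
}

# Made-up data for illustration
df <- data.frame(col1 = c("A", "B", "D", "C"), col2 = c(3, 1, 5, NA))
result <- keep_top_n(df, "col2", n = 2)
print(result)
```

Asking for more values than exist, for example `keep_top_n(df, "col2", n = 5)`, simply returns all rows with a non-missing `col2`.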
