Subset a data frame to the top value per category in R

Question

I have a data frame like this:

Type    value
cellA   2.02
cellA   2.56
cellB   1.24
cellB   2.34
cellB   4.56
cellC   3.55
cellC   2.36
cellC   6.78
cellC   3.56

and I want to subset it to the largest value for each type, so the output would be:

Type    value
cellA   2.56
cellB   4.56
cellC   6.78

How can I achieve this in R? Can the unique command be of any help? I am a bit stuck. Thanks for the suggestions.

M

You can do `library(data.table); setDT(df)[, list(val = max(value)), by = Type]` - Colonel Beauvel, Jul 23 '15 at 12:02
Or `aggregate(value ~ Type, df, max)` with just base R. With `data.table` I would go with `unique(setDT(df)[order(-value)], by = "Type")` - David Arenburg, Jul 23 '15 at 12:03
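
For anyone who wants to try the two `data.table` suggestions from the comments, here is a minimal sketch. It assumes the question's data is rebuilt by hand in a data frame called `df`; the names `dt`, `per_type_max` and `top_rows` are made up for illustration. `as.data.table()` is used instead of `setDT()` so that `df` itself is left unmodified.

```r
library(data.table)

# Hypothetical reconstruction of the data shown in the question
df <- data.frame(
  Type  = c("cellA", "cellA", "cellB", "cellB", "cellB",
            "cellC", "cellC", "cellC", "cellC"),
  value = c(2.02, 2.56, 1.24, 2.34, 4.56, 3.55, 2.36, 6.78, 3.56)
)

# Copy into a data.table so df is not converted by reference
dt <- as.data.table(df)

# Grouped maximum: one summary row per Type
per_type_max <- dt[, .(value = max(value)), by = Type]

# Sort by descending value, then keep the first row seen for each Type
top_rows <- unique(dt[order(-value)], by = "Type")

print(per_type_max)
print(top_rows)
```

The first form returns only the grouping column and the summary; the second keeps whole rows, which matters once the data has more columns than `Type` and `value`.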

score 1 · Answer 1 · answered Jul 23 '15 at 12:03

Using dplyr, this can be done with top_n:

library(dplyr)
# assume your data is in the data frame df2
# with no wt argument, top_n() ranks by the last column (here, value)
df2 %>% group_by(Type) %>% top_n(1)

You get:

Selecting by value
Source: local data frame [3 x 2]
Groups: Type

   Type value
1 cellA  2.56
2 cellB  4.56
3 cellC  6.78
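
In dplyr 1.0.0 and later, `top_n()` is superseded by `slice_max()`, which names the ordering column explicitly. A minimal sketch of the same query, assuming `df2` holds the data from the question (it is rebuilt by hand here) and using the made-up name `top_per_type` for the result:

```r
library(dplyr)

# Hypothetical reconstruction of the data shown in the question
df2 <- data.frame(
  Type  = c("cellA", "cellA", "cellB", "cellB", "cellB",
            "cellC", "cellC", "cellC", "cellC"),
  value = c(2.02, 2.56, 1.24, 2.34, 4.56, 3.55, 2.36, 6.78, 3.56)
)

# Keep the row with the largest value within each Type
top_per_type <- df2 %>%
  group_by(Type) %>%
  slice_max(value, n = 1) %>%
  ungroup()

print(top_per_type)
```

Like `top_n()`, `slice_max()` keeps tied rows by default; passing `with_ties = FALSE` should limit the result to a single row per group.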

score 0 · Answer 2 · Andrew Taylor · 2015-07-23T12:07:30.797

I like to use dplyr for this:

library(dplyr)
dat %>% group_by(Type) %>% arrange(desc(value)) %>% filter(row_number() == 1)

That said, top_n is definitely neater; I hadn't come across it before.
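
One practical difference between the two approaches is how they treat ties. The sketch below uses a small made-up data frame (`tied`, not from the question) in which `cellA` has two rows sharing the maximum; `one_each` and `all_ties` are illustrative names only.

```r
library(dplyr)

# Made-up data with a tie for the largest value in cellA
tied <- data.frame(
  Type  = c("cellA", "cellA", "cellB"),
  value = c(2.56, 2.56, 1.24)
)

# Sorting and taking the first row returns exactly one row per group
one_each <- tied %>%
  group_by(Type) %>%
  arrange(desc(value)) %>%
  filter(row_number() == 1) %>%
  ungroup()

# Comparing against the group maximum returns every tied row
all_ties <- tied %>%
  group_by(Type) %>%
  filter(value == max(value)) %>%
  ungroup()

nrow(one_each)  # 2
nrow(all_ties)  # 3
```

Which behaviour is wanted depends on the data; for the example in the question there are no ties, so both give the same three rows.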

edited Jul 23 '15 at 12:07

answered Jul 23 '15 at 12:01

score 0 · Answer 3 · answered Jul 23 '15 at 12:07

Use the aggregate function.

Example:

x <- data.frame(Type = c('cellA','cellA','cellB','cellB','cellB','cellC','cellC','cellC','cellC'),
                value = c(2.02,2.56,1.24,2.34,4.56,3.55,2.36,6.78,3.56))
# maximum of value within each Type
aggregate(value ~ Type, data = x, FUN = max)
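
`aggregate` returns a summary with only the columns named in the formula. If the goal is to subset the original rows, for example because the data has further columns to keep, a base R sketch along the following lines may help. It rebuilds `x` as above; `rows_ave`, `sorted` and `rows_dup` are made-up names. The second variant is the closest base R analogue of the "unique" idea raised in the question.

```r
# Same example data as above
x <- data.frame(
  Type  = c("cellA", "cellA", "cellB", "cellB", "cellB",
            "cellC", "cellC", "cellC", "cellC"),
  value = c(2.02, 2.56, 1.24, 2.34, 4.56, 3.55, 2.36, 6.78, 3.56)
)

# Variant 1: ave() repeats each group's maximum along the rows,
# so the comparison flags the rows that hold their group's maximum
rows_ave <- x[x$value == ave(x$value, x$Type, FUN = max), ]

# Variant 2: sort by descending value, then drop every row whose
# Type has already been seen
sorted   <- x[order(-x$value), ]
rows_dup <- sorted[!duplicated(sorted$Type), ]

print(rows_ave)
print(rows_dup)
```

Variant 1 keeps the original row order and returns all tied rows; variant 2 returns exactly one row per `Type`, ordered by descending `value`.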

answered Jul 23 '15 at 12:07

Bruno Henrique
