How to get the maximum value from one column grouped by other columns in R data frame?

Question

This is my data frame:

DF <- data.frame(plot=c(1,1,1,2,2,3,3,3), 
        id=c("A","B","A","B","B","C","B","C"),
        share=c(0.2,0.6,0.2,0.45,0.55,0.3,0.4,0.3))

What I need is, for each plot, the id with the maximum share, where the shares of the same id within a plot are added together. The final data frame would look like this:

max <- data.frame(plot=c(1,2,3),
         id=c("B","B","C"))

I know there is a way to do this with data.table, and it usually works, but I want to avoid it. I need this inside a function, and for some reason the function does not run with the data.table package loaded, because several of its functions overlap with base R functions. If possible, I would like another approach.

plot 3 has a max share of 0.4, giving B not C as you have above — pluke, Apr 18 '23 at 08:42
@pluke, no because the two shares of C should be added together — starski, Apr 18 '23 at 08:44
why is plot 1 giving B, the sum of A is also 0.6: `DF %>% group_by(plot, id) %>% summarise(plot_sum = sum(share))` — pluke, Apr 18 '23 at 08:47
Have also a look at [Select the row with the maximum value in each group](https://stackoverflow.com/questions/24558328), [Extract row corresponding to minimum value of a variable by group](https://stackoverflow.com/questions/24070714) and [How to sum a variable by group](https://stackoverflow.com/questions/1660124) — GKi, Apr 18 '23 at 08:58
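The per-plot totals discussed in the comments above can be checked directly. The following is a minimal base R sketch that rebuilds the question's data and sums `share` for every `plot`/`id` pair; the name `totals` is only illustrative.

```r
# Data frame from the question
DF <- data.frame(plot  = c(1, 1, 1, 2, 2, 3, 3, 3),
                 id    = c("A", "B", "A", "B", "B", "C", "B", "C"),
                 share = c(0.2, 0.6, 0.2, 0.45, 0.55, 0.3, 0.4, 0.3))

# Total share for every plot/id combination (base R only)
totals <- aggregate(share ~ plot + id, data = DF, FUN = sum)

# Sort by plot, then id, so each plot's candidates sit next to each other
totals <- totals[order(totals$plot, totals$id), ]
print(totals)
```

With the data as posted, plot 1 should total 0.4 for A and 0.6 for B, and plot 3 should total 0.4 for B and 0.6 for C, which matches the expected result of B, B, C.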

pluke · Accepted Answer · 2023-04-18T08:49:16.323

1

In the tidyverse, you could do the following:

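library(tidyverse)
DF %>% 
  group_by(plot, id) %>% 
  # total share per plot and id; the result stays grouped by plot
  summarise(plot_sum = sum(share), .groups = "drop_last") %>%
  # keep the id(s) whose total is the largest within each plot
  filter(plot_sum == max(plot_sum)) %>%
  ungroup() %>%
  select(-plot_sum)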
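Because the question asks for something that runs inside a function, here is a minimal sketch of how the same steps might be wrapped without attaching any package, by calling dplyr through `::`. The function name `max_share_id` is hypothetical, and the sketch assumes dplyr (>= 1.0.0) is installed and R is at least 4.1 (for the native pipe `|>`).

```r
# Hypothetical helper: per plot, return the id(s) with the largest summed share.
# Assumes dplyr (>= 1.0.0) is installed; nothing is attached with library().
max_share_id <- function(df) {
  df |>
    dplyr::group_by(plot, id) |>
    # total share per plot/id; "drop_last" keeps the result grouped by plot
    dplyr::summarise(plot_sum = sum(share), .groups = "drop_last") |>
    # max() is evaluated within each plot because of the remaining grouping
    dplyr::filter(plot_sum == max(plot_sum)) |>
    dplyr::ungroup() |>
    dplyr::select(plot, id) |>
    as.data.frame()
}

# Data frame from the question
DF <- data.frame(plot  = c(1, 1, 1, 2, 2, 3, 3, 3),
                 id    = c("A", "B", "A", "B", "B", "C", "B", "C"),
                 share = c(0.2, 0.6, 0.2, 0.45, 0.55, 0.3, 0.4, 0.3))

res <- max_share_id(DF)
print(res)
```

Calling functions as `dplyr::filter()` rather than relying on `library()` may sidestep name clashes between attached packages. If two ids tie for the maximum within a plot, both rows are returned.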

edited Apr 18 '23 at 08:49

answered Apr 18 '23 at 08:41

As mentioned above, this approach does not sum the `share` for the same `id` per `plot`. – starski Apr 18 '23 at 08:47
Have fixed, see above. – pluke Apr 18 '23 at 08:49

GKi · Answer 2 · 2023-04-18T08:50:37.030

0

In base R, you can first use aggregate to sum the shares per plot and id, and then ave to get the max per group.

aggregate(share ~ ., DF, sum) |>
  (\(.) .[ave(.$share, .$plot, FUN=max) == .$share, c("plot", "id")])()
#  plot id
#1    1  B
#3    2  B
#5    3  C
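For use inside a function, as the question requests, the two steps above could be wrapped as in the following sketch. It uses only base R and avoids the `|>` pipe and `\(x)` shorthand, so it should also run on R versions older than 4.1. The function name `max_share_base` is hypothetical.

```r
# Hypothetical helper using base R only (no packages attached)
max_share_base <- function(df) {
  # Step 1: total share for every plot/id combination
  totals <- aggregate(share ~ plot + id, data = df, FUN = sum)
  # Step 2: ave() gives, for every row, the maximum total within that row's plot
  is_max <- totals$share == ave(totals$share, totals$plot, FUN = max)
  out <- totals[is_max, c("plot", "id")]
  # Sort by plot and reset the row names for a tidy result
  out <- out[order(out$plot), ]
  rownames(out) <- NULL
  out
}

# Data frame from the question
DF <- data.frame(plot  = c(1, 1, 1, 2, 2, 3, 3, 3),
                 id    = c("A", "B", "A", "B", "B", "C", "B", "C"),
                 share = c(0.2, 0.6, 0.2, 0.45, 0.55, 0.3, 0.4, 0.3))

res <- max_share_base(DF)
print(res)
```

Ties are kept: if two ids share the maximum total within a plot, both rows appear in the result.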

edited Apr 18 '23 at 08:50

answered Apr 18 '23 at 08:44
