This is my data frame:

DF <- data.frame(plot=c(1,1,1,2,2,3,3,3), 
        id=c("A","B","A","B","B","C","B","C"),
        share=c(0.2,0.6,0.2,0.45,0.55,0.3,0.4,0.3))

What I need is, for each plot, the id with the maximum total share, where shares of the same id within a plot are added together. The final data frame would look like this:

max <- data.frame(plot=c(1,2,3),
         id=c("B","B","C"))
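
For reference, the summed shares behind that expected result can be checked with a short base R sketch. `xtabs` is only one way to tabulate them, and the names `totals` and `winner` are introduced here purely for illustration:

```r
DF <- data.frame(plot  = c(1, 1, 1, 2, 2, 3, 3, 3),
                 id    = c("A", "B", "A", "B", "B", "C", "B", "C"),
                 share = c(0.2, 0.6, 0.2, 0.45, 0.55, 0.3, 0.4, 0.3))

# Sum share for every plot/id combination (rows = plot, columns = id);
# combinations that do not occur in DF are filled with 0
totals <- xtabs(share ~ plot + id, data = DF)

# For each plot (row), take the column name of the largest summed share
winner <- colnames(totals)[apply(totals, 1, which.max)]
winner
```

Note that `which.max` returns only the first maximum, so this check silently picks one id if two ids tie within a plot.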

I know this can be done with data.table, but I want to avoid it. I need this inside a function, and for some reason the function does not run with the data.table package loaded, apparently because several of its functions overlap with base R functions. If possible, I would like another approach.

starski
  • plot 3 has a max share of 0.4, giving B not C as you have above – pluke Apr 18 '23 at 08:42
  • @pluke, no because the two shares of C should be added together – starski Apr 18 '23 at 08:44
  • ah, ok, I'll amend – pluke Apr 18 '23 at 08:45
  • why is plot 1 giving B, the sum of A is also 0.6: `DF %>% group_by(plot, id) %>% summarise(plot_sum = sum(share))` – pluke Apr 18 '23 at 08:47
  • my bad, that was a typo, I'll edit – starski Apr 18 '23 at 08:48
  • Have also a look at [Select the row with the maximum value in each group](https://stackoverflow.com/questions/24558328), [Extract row corresponding to minimum value of a variable by group](https://stackoverflow.com/questions/24070714) and [How to sum a variable by group](https://stackoverflow.com/questions/1660124) – GKi Apr 18 '23 at 08:58

2 Answers

In the tidyverse you could do the following:

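library(tidyverse)
DF %>% 
  group_by(plot, id) %>% 
  # sum the shares per plot/id; the result stays grouped by plot
  summarise(plot_sum = sum(share), .groups = "drop_last") %>%
  # keep the id(s) with the largest summed share within each plot
  filter(plot_sum == max(plot_sum)) %>%
  ungroup() %>%
  select(-plot_sum)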
pluke

In base R you can first use aggregate to sum the shares per plot and id, and then ave to get the maximum per plot.

# sum share per plot/id, then keep the rows whose sum equals the maximum of their plot
aggregate(share ~ ., DF, sum) |>
  (\(.) .[ave(.$share, .$plot, FUN=max) == .$share, c("plot", "id")])()
#  plot id
#2    1  B
#3    2  B
#5    3  C
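
Since the question asks for something usable inside a function, the same two steps could be wrapped roughly as follows. This is a minimal sketch using base R only: `max_id_per_plot` is a hypothetical name, and it assumes the input has columns called `plot`, `id` and `share` as in the question.

```r
# Hypothetical helper (the name is made up for this sketch);
# assumes df has columns plot, id and share
max_id_per_plot <- function(df) {
  # Step 1: sum share for every plot/id combination
  sums <- aggregate(share ~ plot + id, data = df, FUN = sum)
  # Step 2: flag rows whose summed share equals the maximum within their plot
  keep <- sums$share == ave(sums$share, sums$plot, FUN = max)
  out <- sums[keep, c("plot", "id")]
  # Sort by plot and reset the row names for a clean result
  out <- out[order(out$plot), ]
  rownames(out) <- NULL
  out
}

DF <- data.frame(plot  = c(1, 1, 1, 2, 2, 3, 3, 3),
                 id    = c("A", "B", "A", "B", "B", "C", "B", "C"),
                 share = c(0.2, 0.6, 0.2, 0.45, 0.55, 0.3, 0.4, 0.3))

result <- max_id_per_plot(DF)
result
```

If two ids tie for the maximum within a plot, this version returns both rows for that plot rather than picking one.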
GKi