1

I'd like to get the max value in each column for each ID. See the table blow as an example:

ID <-c("A","A","A","A","A","B","B","C","C","C")
col1 <-c(1,2,3,4,5,1,2,1,2,3)
col2 <-c(2,4,9,8,10,2,4,2,4,6)
df <-data.frame(ID, col1, col2)

I'd like to consolidate the table as shown below - the maximum for each group.

ID      col1      col2
A       5         10       
B       2         4
C       3         6

Since I will be applying this to a list of dataframes, each with an indeterminate amount of columns with various column names, I cannot have specific references to the names of the columns, but rather applied to all columns except the ID.

I know the following code works to max one variable, but cannot be expanded to multiple:

require(data.table)
dt <- as.data.table(df)
dt[, .SD[which.max(col1)], by=ID]

Your help is greatly appreciated!

Henrik
  • 65,555
  • 14
  • 143
  • 159
  • [Summarizing multiple columns with data.table](https://stackoverflow.com/questions/16513827/summarizing-multiple-columns-with-data-table); `df[ , lapply(.SD, max), by = ID]`. – Henrik Mar 01 '21 at 23:08
  • ...and don't use `data.frame(cbind` - the `cbind` will coerce to matrix and all columns to one class (here `character`). – Henrik Mar 01 '21 at 23:17

1 Answers1

2

For summarisation, tidyverse is more flexible especially the across

library(dplyr)
df %>%
   group_by(ID) %>%
   summarise(across(everything(), max))

-output

# A tibble: 3 x 3
#  ID     col1  col2
#* <chr> <dbl> <dbl>
#1 A         5    10
#2 B         2     4
#3 C         3     6

data

 df <- data.frame(ID, col1, col2)
akrun
  • 874,273
  • 37
  • 540
  • 662