Select the maximum value over multiple columns in R dplyr

Asked Oct 23 '22 at 12:29

Active Oct 23 '22 at 12:38


My large data frame looks like the example below. I want to group by gene and, within each group, keep the maximum value of the rep1 column and the maximum value of the rep2 column.

library(tidyverse)

df <- tibble(gene=c("A","A","B","B","B","C","C"),
       transcript=c("t1","t2","t1","t2","t3","t1","t2"), 
       rep1=c(270,10,40,50,100,10,11), 
       rep2=c(270,272,40,100,60,10,11))

df
#> # A tibble: 7 × 4
#>   gene  transcript  rep1  rep2
#>   <chr> <chr>      <dbl> <dbl>
#> 1 A     t1           270   270
#> 2 A     t2            10   272
#> 3 B     t1            40    40
#> 4 B     t2            50   100
#> 5 B     t3           100    60
#> 6 C     t1            10    10
#> 7 C     t2            11    11

Created on 2022-10-23 with reprex v2.0.2

I want my data to look like this:

  gene   rep1  rep2
  <chr> <dbl> <dbl>
  A       270   272
  B       100   100
  C        11    11


LDT



`df %>% group_by(gene) %>% summarize(across(c(rep1, rep2), max))` – Gregor Thomas Oct 23 '22 at 12:34
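
The approach suggested in the comment can be written out as a small runnable sketch against the example data from the question. It assumes dplyr 1.0.0 or later (for `across()` and the `.groups` argument); the name `result` is only for illustration.

```r
library(dplyr)

# Example data copied from the question
df <- tibble(
  gene       = c("A", "A", "B", "B", "B", "C", "C"),
  transcript = c("t1", "t2", "t1", "t2", "t3", "t1", "t2"),
  rep1       = c(270, 10, 40, 50, 100, 10, 11),
  rep2       = c(270, 272, 40, 100, 60, 10, 11)
)

# One row per gene; max() is applied to rep1 and rep2 independently,
# so the two maxima may come from different transcripts.
result <- df %>%
  group_by(gene) %>%
  summarize(across(c(rep1, rep2), max), .groups = "drop")

result
```

Note that `transcript` is dropped, because the two maxima do not have to belong to the same row. If the real data contains missing values, `across(c(rep1, rep2), ~ max(.x, na.rm = TRUE))` should ignore them; whether that is appropriate depends on the data.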


0 Answers