
I have a large data frame that looks like the one below. I want to group by gene and, within each gene, keep the maximum value of the rep1 column and the maximum value of the rep2 column.

library(tidyverse)

df <- tibble(gene = c("A", "A", "B", "B", "B", "C", "C"),
             transcript = c("t1", "t2", "t1", "t2", "t3", "t1", "t2"),
             rep1 = c(270, 10, 40, 50, 100, 10, 11),
             rep2 = c(270, 272, 40, 100, 60, 10, 11))

df
#> # A tibble: 7 × 4
#>   gene  transcript  rep1  rep2
#>   <chr> <chr>      <dbl> <dbl>
#> 1 A     t1           270   270
#> 2 A     t2            10   272
#> 3 B     t1            40    40
#> 4 B     t2            50   100
#> 5 B     t3           100    60
#> 6 C     t1            10    10
#> 7 C     t2            11    11

Created on 2022-10-23 with reprex v2.0.2

I want my data to look like this:

  gene   rep1  rep2
  <chr> <dbl> <dbl>
  A       270   272
  B       100   100
  C        11    11
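
One possible way to get this output, as a minimal sketch: group by gene and summarise both replicate columns with max(). This assumes dplyr 1.0.0 or later (for across()) and that the maxima are taken per column independently, so the two values in a row may come from different transcripts, as in the desired output for gene A. The name `result` is only illustrative.

```r
library(dplyr)

df <- tibble(gene = c("A", "A", "B", "B", "B", "C", "C"),
             transcript = c("t1", "t2", "t1", "t2", "t3", "t1", "t2"),
             rep1 = c(270, 10, 40, 50, 100, 10, 11),
             rep2 = c(270, 272, 40, 100, 60, 10, 11))

# For each gene, take the maximum of rep1 and of rep2 separately.
# transcript is not summarised, so it is dropped from the result.
result <- df %>%
  group_by(gene) %>%
  summarise(across(c(rep1, rep2), max), .groups = "drop")

result
```

If the real data contain missing values, passing `na.rm = TRUE` through an anonymous function, e.g. `across(c(rep1, rep2), ~ max(.x, na.rm = TRUE))`, would ignore them; whether that is appropriate depends on the data.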

LDT
