I have the following toy data:
Gene | cell1 | cell2 |
---|---|---|
Gene1 | 1 | 12 |
Gene1 | 9 | 1 |
Gene2 | 0 | 0 |
Gene3 | 6 | 11 |
df <- data.frame(
  Gene  = c("Gene1", "Gene1", "Gene2", "Gene3"),
  cell1 = c(1, 9, 0, 6),
  cell2 = c(12, 1, 0, 11)
)
I want to group the rows by gene name and, where a gene appears more than once, sum the values in each of the other columns. The expected result is:
Gene | cell1 | cell2 |
---|---|---|
Gene1 | 10 | 13 |
Gene2 | 0 | 0 |
Gene3 | 6 | 11 |
I currently use the following code for this, but it is very slow on my actual data, which is quite large:
library(dplyr)

# Sum every non-grouping column within each gene
df <- df %>%
  group_by(Gene) %>%
  summarise(across(everything(), sum)) %>%
  ungroup()
Are there other, less computationally expensive ways to complete this task? Thank you.
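
For comparison, one candidate that may be worth benchmarking against the dplyr pipeline is base R's `rowsum()`, which sums the rows of a numeric matrix within each group. This is only a minimal sketch on the toy data, assuming every column other than `Gene` is numeric; the names `counts`, `summed`, and `result` are placeholders for illustration, and whether it is actually faster on the real data is untested here.

```r
# Toy data, with column names matching the tables above
df <- data.frame(
  Gene  = c("Gene1", "Gene1", "Gene2", "Gene3"),
  cell1 = c(1, 9, 0, 6),
  cell2 = c(12, 1, 0, 11)
)

# Assumption: all columns except Gene are numeric, so they fit in one matrix
counts <- as.matrix(df[, -1])

# rowsum() adds up the rows that share the same group label
summed <- rowsum(counts, group = df$Gene)

# The group labels come back as row names; restore them as a Gene column
result <- data.frame(Gene = rownames(summed), summed, row.names = NULL)
print(result)
```

Note that `rowsum()` returns the groups in sorted order by default, so the row order can differ from the order of first appearance in the input.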