Suppose I have a dataframe as such:

d <- data.frame(type = c("rna", "rna", "rna"), value = c(1, 2, 3))
d2 <- data.frame(type = c("dna", "dna"), value = c(20, 30))
df <- rbind(d, d2)

It looks like this:

  type value
1  rna     1
2  rna     2
3  rna     3
4  dna    20
5  dna    30

Now I need to summarize by type: for each type, sum the values AND divide by the number of occurrences. In the example above, that would be (1+2+3)/3 for rna and (20+30)/2 for dna. However, currently I can only get it to sum, like this:

library(dplyr)
df %>%
    group_by(type) %>%
    summarise_all(sum) %>%
    data.frame()

The above code produces:

  type value
1  rna     6
2  dna    50

whereas what I really want is this:

  type value
1  rna     2
2  dna    25

thanks.

Ahdee

1 Answer

We need to divide the sum by the number of rows in each group, which n() returns:

df %>% 
   group_by(type) %>%
   summarise(value = sum(value)/n())

This is the same as taking the mean of each group:

df %>%
  group_by(type) %>% 
  summarise(value = mean(value))
# A tibble: 2 x 2
#   type  value
#   <fct> <dbl>
#1 rna       2
#2 dna      25
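
As a cross-check that does not depend on dplyr, here is a minimal base R sketch of the same idea. It is not part of the original answer; the names by_hand and by_mean are made up for illustration, and the data is rebuilt from the question. It computes the per-group sum divided by the per-group count and compares that with the per-group mean.

```r
# Rebuild the example data from the question.
df <- data.frame(
  type  = c("rna", "rna", "rna", "dna", "dna"),
  value = c(1, 2, 3, 20, 30)
)

# Per-group sum divided by per-group count (hypothetical helper name).
by_hand <- tapply(df$value, df$type, sum) / tapply(df$value, df$type, length)

# Per-group mean, computed directly.
by_mean <- tapply(df$value, df$type, mean)

print(by_hand)
# Both approaches should agree for every group.
print(all.equal(by_hand, by_mean))
```

Note that tapply() orders the groups by factor level (alphabetically here), so dna is listed before rna, unlike the row order in the question's data.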
akrun