I have a similar-looking df for which I want to calculate yearly sums and the mean of those sums. Is there a quicker, cleverer way to get exactly the same output? Calculating the mean in particular feels clumsy. I tried working with mean() but didn't succeed.

a <- 1:10
b <- 5:14
c <- 36:45
d <- 22:31
year <- rep(2010:2014, 2)

df <- data.frame(a,b,c,d, year)

df[2,c(1,3,4)] <- NA
df[5,c(1,4)] <- NA
df[8,c(1,2)] <- NA


library(dplyr)

df_sums <- df %>% select(a, b, c, d, year) %>%
  group_by(year) %>%
  summarise(a_sum = sum(a, na.rm = TRUE),
            b_sum = sum(b, na.rm = TRUE),
            c_sum = sum(c, na.rm = TRUE),
            d_sum = sum(d, na.rm = TRUE),
            # mean of the four yearly sums
            mean = (a_sum +
                    b_sum +
                    c_sum +
                    d_sum) / 4)
EmKau

1 Answer

You can use across() to calculate the sum of several columns by group. To calculate the mean, you can apply rowMeans() to the selected columns.

library(dplyr)

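df %>%
  group_by(year) %>%
  # pass na.rm inside a lambda; extra arguments via across(...) are deprecated since dplyr 1.1.0
  summarise(across(a:d, ~ sum(.x, na.rm = TRUE))) %>%
  ungroup() %>%
  mutate(mean = rowMeans(select(., a:d), na.rm = TRUE))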

#   year     a     b     c     d  mean
#  <int> <int> <int> <int> <int> <dbl>
#1  2010     7    15    77    49  37  
#2  2011     7    17    42    28  23.5
#3  2012     3     7    81    53  36  
#4  2013    13    21    83    55  43  
#5  2014    10    23    85    31  37.2
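
If you also need the a_sum, ..., d_sum column names from the question, one possible variation is to set the .names argument of across(). This is only a minimal sketch, assuming dplyr 1.0.0 or later; the data setup is copied from the question, and selecting the summed columns with ends_with("_sum") is my own choice rather than something the question requires.

```r
library(dplyr)

# Example data, copied from the question
df <- data.frame(a = 1:10, b = 5:14, c = 36:45, d = 22:31,
                 year = rep(2010:2014, 2))
df[2, c(1, 3, 4)] <- NA
df[5, c(1, 4)] <- NA
df[8, c(1, 2)] <- NA

df_sums <- df %>%
  group_by(year) %>%
  # "{.col}_sum" turns a, b, c, d into a_sum, b_sum, c_sum, d_sum
  summarise(across(a:d, ~ sum(.x, na.rm = TRUE), .names = "{.col}_sum")) %>%
  # row-wise mean over the four sum columns
  mutate(mean = rowMeans(select(., ends_with("_sum"))))

df_sums
```

With the example data this should give the same sums and means as the output above (37, 23.5, 36, 43 and 37.25 for 2010 to 2014), only with the suffixed column names.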
Ronak Shah