
I am trying to calculate the means of several columns in R, grouped by a criterion.


Now, I want to calculate the mean for every year and for every company.

I have tried this code, but it is not working:

DF %>% group_by("Year") %>% colMeans(DF, na.rm=TRUE)

Does anybody have an idea?

Li4991
Don't post an image of your data; use `dput(df)` instead. That helps everyone who could address your problem, because they no longer have to recreate your data by retyping it from the picture. – PesKchan Jun 09 '23 at 10:22
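As a minimal sketch of what the comment asks for: `dput()` prints an R expression that rebuilds the object exactly, so it can be pasted into a question as text. The data frame below is a made-up stand-in (an assumption), not the asker's data.

```r
# Hypothetical two-row example; the real data was only posted as an image.
df <- data.frame(Year = c(2001, 2002), COMP1 = c(1.5, NA))

# Prints a structure(...) call that can be copied into the question.
dput(df)

# Anyone can rebuild the data by evaluating that printed expression.
rebuilt <- eval(parse(text = capture.output(dput(df))))
```

For a large table, `dput(head(df, 20))` keeps the pasted text short.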

2 Answers


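colMeans doesn't work with group_by: it is not a dplyr verb and ignores the grouping. Also, group_by("Year") groups by the constant string "Year" rather than by the Year column, so drop the quotes. Try this: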

library(dplyr)

# store the result under a name that does not mask base::mean
means = DF %>%
  group_by(Year) %>%
  summarize(
    COMP1_mean = mean(COMP1, na.rm = TRUE),
    COMP2_mean = mean(COMP2, na.rm = TRUE),
    COMP3_mean = mean(COMP3, na.rm = TRUE),
    COMP4_mean = mean(COMP4, na.rm = TRUE)
  )

Then just print the result.

Or, even better, in base R:

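# 2:4 selects columns by position; adjust it to cover all of your company columns
aggregate(DF[, 2:4], list(Year = DF$Year), mean, na.rm = TRUE)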

If you want to add a criterion:

criteria = DF$Year > 2003

Then use it in your aggregate call:

results = aggregate(DF[criteria, 2:4], list(Year = DF$Year[criteria]), mean, na.rm = TRUE)

Then print results.
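If there are many company columns, listing each one inside summarize gets tedious. A minimal sketch using `across()` (available in dplyr 1.0.0 and later) is shown below. The toy `DF` and the assumption that every company column starts with "COMP" are mine, since the real data was only posted as an image.

```r
library(dplyr)

# Hypothetical stand-in for the asker's data.
DF <- data.frame(
  Year  = c(2003, 2003, 2004, 2004),
  COMP1 = c(1, 3, 5, NA),
  COMP2 = c(2, 4, 6, 8)
)

# Mean of every column whose name starts with "COMP", per year, ignoring NAs.
means <- DF %>%
  group_by(Year) %>%
  summarise(across(starts_with("COMP"), ~ mean(.x, na.rm = TRUE)))

# Same idea with a criterion applied first: keep only years after 2003.
recent <- DF %>%
  filter(Year > 2003) %>%
  group_by(Year) %>%
  summarise(across(starts_with("COMP"), ~ mean(.x, na.rm = TRUE)))

print(means)
print(recent)
```

With this toy data, `means` has one row per year (COMP1 = 2 and 5, COMP2 = 3 and 7), and `recent` keeps only the 2004 row.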


This is very easy with the data.table package. See the data.table documentation for more information.

# CREATE DUMMY DATA
library(data.table)

set.seed(1)
df <- data.frame(Year = rep(2001:2004, each = 3),
                 A = sample(1:10, 12, replace = TRUE),
                 B = sample(1:10, 12, replace = TRUE),
                 C = sample(1:10, 12, replace = TRUE),
                 D = sample(1:10, 12, replace = TRUE))
df$A[1] <- NA


# COMPUTE AVERAGE BY YEAR
dt <- as.data.table(df)
dt[, lapply(.SD, mean, na.rm=TRUE), by = "Year"]


# RESULTS
   Year        A        B        C        D
1: 2001 5.500000 7.666667 4.666667 7.333333
2: 2002 3.333333 6.333333 6.333333 8.000000
3: 2003 2.000000 7.666667 6.666667 8.000000
4: 2004 6.666667 5.666667 7.666667 6.666667
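
The same data.table call can also apply a criterion and restrict which columns are averaged. The sketch below uses a made-up table (an assumption, not the asker's data) with an extra text column, and picks the numeric columns through `.SDcols`.

```r
library(data.table)

# Hypothetical toy data, not the asker's real table.
dt <- data.table(
  Year = c(2003, 2003, 2004, 2004),
  A    = c(1, 3, 5, NA),
  B    = c(2, 4, 6, 8),
  note = c("w", "x", "y", "z")  # non-numeric column that must be left out
)

# i:  keep rows with Year > 2003
# j:  mean of each column listed in .SDcols, ignoring NAs
# by: one result row per Year
res <- dt[Year > 2003,
          lapply(.SD, mean, na.rm = TRUE),
          by = Year,
          .SDcols = c("A", "B")]

print(res)
```

Here `res` has a single row for 2004, with A = 5 and B = 7.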
Gerald T