I am trying to automate averaging the rows of a data frame that share the same value in one column.

Here is the code for a simulated data frame:

col1 <- c(1,1,1,2,2,2,3,3,3)
col2 <- c(10,20,15,5,8,7,30,1,25)
col3 <- c(.5,.4,.2,.2,.2,.1,.4,.5,.9)
testdf <- data.frame(col1,col2,col3)

And the output from that data frame:

testdf
  col1 col2 col3
1    1   10  0.5
2    1   20  0.4
3    1   15  0.2
4    2    5  0.2
5    2    8  0.2
6    2    7  0.1
7    3   30  0.4
8    3    1  0.5
9    3   25  0.9

What I'm trying to do is get an output that gives the averages of columns 2 and 3 for all rows with the same value in column 1. For example, when column 1 is 1, the average of column 2 is 15 and the average of column 3 is about .367.

1 Answer

We can use aggregate from base R

aggregate(.~ col1, testdf, mean)
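
To unpack the formula: the `.` on the left-hand side stands for every column not named on the right, and `col1` on the right is the grouping variable. Below is a minimal sketch that rebuilds the data from the question and stores the result; the name `result` is just an illustrative choice, not something from the question.

```r
# Rebuild the example data from the question
testdf <- data.frame(
  col1 = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  col2 = c(10, 20, 15, 5, 8, 7, 30, 1, 25),
  col3 = c(.5, .4, .2, .2, .2, .1, .4, .5, .9)
)

# `. ~ col1`: take the mean of every other column within each value of col1
# (`result` is a hypothetical name used only for this sketch)
result <- aggregate(. ~ col1, data = testdf, FUN = mean)
print(result)
```

This should give one row per distinct value of col1, with the row for col1 == 1 matching the values in the question (15 for col2 and roughly .367 for col3).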

Or with dplyr

library(dplyr)
testdf %>%
  group_by(col1) %>%
  # across() supersedes summarise_all() as of dplyr 1.0.0
  summarise(across(everything(), mean))

Or with data.table

library(data.table)
setDT(testdf)[, lapply(.SD, mean), by = col1]
akrun