
I'd like to create an aggregation without knowing either the column names or their positions in advance, i.e. I retrieve the names dynamically.

Furthermore, I'm limited to data.frame or data.table, as I'm forced to use R version 3.1.1.

Is there an option like do.call(...), as explained in this answer for 'order'?

Trying a similar do.call with 'aggregate' leads to an error:

# generate a small dataset
set.seed(1234)
smalldat <- data.frame(group1 = rep(1:2, each = 5), 
                       group2 = rep(c('a','b'), times = 5), 
                       x = rnorm(10),
                       y = rnorm(10))

group_by <- c('group1','group2')

test <- do.call( aggregate.data.frame , c(by=group_by, x=smalldat, FUN=mean))
#output
#Error in is.data.frame(x) : Argument "x" missing (no default)

Or is there an option with data.table?

# generate a small dataset
set.seed(1234)
smalldat <- data.frame(group1 = rep(1:2, each = 5), 
                       group2 = rep(c('a','b'), times = 5), 
                       x = rnorm(10),
                       y = rnorm(10))


# convert data.frame to data.table
library(data.table)
smalldat <- data.table(smalldat)

# add the aggregated variable as a new column of the raw data

smalldat[, aggGroup1 := mean(x), by = group1]

Thanks for advice!

til
  • `test <- do.call(aggregate.data.frame , list(by=smalldat[group_by], x=smalldat[!colnames(smalldat) %in% group_by], FUN=mean))` – Roland Apr 10 '18 at 13:51
  • What's wrong with `smalldat[, aggGroup1 := mean(x), .(group1,group2)]` – YOLO Apr 10 '18 at 13:56
  • @YOLO, the way Gregor describes, I'm able to fill the group_by dynamically with 'group1', 'group2' or even a completely new value in case the underlying data changes. – til Apr 10 '18 at 14:33
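The do.call variant from Roland's comment can be sketched as follows. As far as I can tell, the original call fails because c() flattens both the character vector and the data frame into one long list, so no element ends up named x or by; wrapping the arguments in list() keeps them intact. The names bad_args, good_args and value_cols are only illustrative, and this is a minimal sketch rather than a tested solution for R 3.1.1.

```r
# same small dataset as in the question
set.seed(1234)
smalldat <- data.frame(group1 = rep(1:2, each = 5),
                       group2 = rep(c('a', 'b'), times = 5),
                       x = rnorm(10),
                       y = rnorm(10))

group_by <- c('group1', 'group2')

# c() splices its inputs together: the result has no element named 'x' or 'by'
bad_args <- c(by = group_by, x = smalldat, FUN = mean)
print(names(bad_args))

# list() keeps each argument whole; 'by' must be a list of grouping columns
# (a data.frame is one), and 'x' holds the remaining columns to aggregate
value_cols <- setdiff(names(smalldat), group_by)
good_args <- list(x = smalldat[value_cols],
                  by = smalldat[group_by],
                  FUN = mean)

test <- do.call(aggregate.data.frame, good_args)
print(test)
```

Because group_by is just a character vector, it can be filled at run time with whatever grouping columns the data happens to contain.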

1 Answer


aggregate can take a formula, and you can build a formula from a string.

form = as.formula(paste(". ~", paste(group_by, collapse = " + ")))
aggregate(form, data = smalldat, FUN = mean)
#   group1 group2          x           y
# 1      1      a  0.1021667 -0.09798418
# 2      2      a -0.5695960 -0.67409059
# 3      1      b -1.0341342 -0.46696381
# 4      2      b -0.3102046  0.46478476
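For the data.table part of the question, a possible sketch is below. It assumes a data.table version in which by accepts a character vector of column names held in a variable; I have not checked this against the release that pairs with R 3.1.1. The names agg and aggGroup are only illustrative.

```r
library(data.table)

# same small dataset as in the question, built directly as a data.table
set.seed(1234)
smalldat <- data.table(group1 = rep(1:2, each = 5),
                       group2 = rep(c('a', 'b'), times = 5),
                       x = rnorm(10),
                       y = rnorm(10))

group_by <- c('group1', 'group2')

# aggregated table: mean of every non-grouping column per group
# (.SD holds all columns except those named in 'by')
agg <- smalldat[, lapply(.SD, mean), by = group_by]
print(agg)

# or add the group mean of x to the raw data by reference
smalldat[, aggGroup := mean(x), by = group_by]
print(smalldat)
```

The first call returns one row per group, similar to aggregate; the second keeps all ten rows and only adds a column.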
Gregor Thomas
  • Thanks for the fast response! @YOLO, that way I'm able to fill the group_by dynamically with 'group1', 'group2' or even a completely new value in case the underlying data changes. – til Apr 10 '18 at 14:14
  • YOLO won't see comments on my answer since they have only commented on the question. – Gregor Thomas Apr 10 '18 at 14:31