Name columns in aggregate while using lapply

Question

There was already a similar question whose answer helped somewhat, but I couldn't translate it to my use case of calling aggregate within lapply. With setNames I can specify a string, but I'm having a hard time pulling out the name of the column lapply is currently working on so that I can pass it to setNames.

So, I have a df.

head(rms)
  file      date min fullband  band1 band2 band3  band4 band5 hr
1    0 2015/1/14   0   112.17 112.43 94.13 97.92 102.17 96.87  0
2    1 2015/1/14   5   111.73 110.71 94.01 96.78 102.20 96.90  0
3    2 2015/1/14  10   109.08 107.05 91.81 96.68 102.40 97.01  0
4    3 2015/1/14  15   110.74 109.24 93.14 96.65 102.02 96.87  0
5    4 2015/1/14  20   108.82 107.09 93.16 96.50 102.08 96.84  0

And I aggregate the columns fullband-band5 like this:

rms.byhr <- lapply(rms[-c(1:3,10)], function(x){
  aggregate(x, by=list(rms$hr), mean)
})

However, naturally, lapply will use the column name for the list elements and replace the names of the df it creates with something arbitrary (Group.1 and x).

I tried:

rms.byhr<-lapply(rms[-c(1:3,10)], function(x){
setNames(aggregate(x, by=list(rms$hr), mean), c("Hour", names(x)))
})

and

rms.byhr<-lapply(rms[-c(1:3,10)], function(x){
setNames(aggregate(x, by=list(rms$hr), mean), c("Hour", names(rms)[which(names(rms)==names(x))]))
})

But that doesn't seem to work and returns NA. So I guess my question really is: what does "x" look like inside lapply, and how do I index/pull out the name properly?

I need them named for subsequent functions.

No need for `lapply`: `aggregate(. ~ hr, data = d[-(1:3)], mean)`. — Henrik, Apr 08 '18 at 20:07
Oh, that works beautifully! Thank you! It also spares me one follow-up step (merging my dfs from the list). I didn't even think about that - sometimes one really thinks too complicated (but in my defense, I'm still learning). — Anke, Apr 08 '18 at 20:13

score 0 · Answer 1 · answered Apr 08 '18 at 20:12

It is not entirely clear (to me) exactly what you want as the output and in what format. If you want it as a list, you can wrap it in unname:

rms.byhr<-lapply(rms[-c(1:3,10)], function(x) {
  unname(aggregate(x, by=list(rms$hr), mean))
})
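As an aside on why the attempts in the question return NA: lapply hands each column to the function as a bare numeric vector, so names(x) is NULL and setNames only receives "Hour". A minimal sketch of one way around that, assuming a small made-up data frame `toy` standing in for rms, is to iterate over the column names and look the column up inside the function:

```r
# Toy stand-in for rms (values are made up for illustration)
toy <- data.frame(
  hr    = c(0, 0, 1, 1),
  band1 = c(10, 20, 30, 40),
  band2 = c(1, 3, 5, 7)
)

cols <- c("band1", "band2")

# Iterate over the column *names* so the name is available inside the function;
# setNames(cols, cols) makes the resulting list elements carry the same names
toy.byhr <- lapply(setNames(cols, cols), function(nm) {
  setNames(aggregate(toy[[nm]], by = list(toy$hr), FUN = mean), c("Hour", nm))
})

names(toy.byhr$band1)
```

Each list element is then a two-column data frame whose second column is named after the band it summarises.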

But aggregate can also aggregate more than one column, which renders lapply unneeded:

aggregate(. ~ hr, data = rms[-(1:3)], mean)
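To illustrate the formula interface, here is a sketch on made-up data (the `toy` data frame is hypothetical, not the asker's rms). The result is a single data frame that keeps the original column names, so no renaming or merging step should be needed:

```r
# Toy stand-in for rms (values are made up for illustration)
toy <- data.frame(
  file  = 0:3,
  hr    = c(0, 0, 1, 1),
  band1 = c(10, 20, 30, 40),
  band2 = c(1, 3, 5, 7)
)

# "." means "every column other than hr", so drop the columns
# that should not be averaged (here: file) before aggregating
res <- aggregate(. ~ hr, data = toy[-1], FUN = mean)
names(res)
```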

Edit: I see now @Henrik has answered your post in the comments. I'll leave this answer here for posterity.
