aggregate over multiple columns

Question

I have some data that looks like this:

ExpNum  Compound  Peak  Tau  SS
1       a         100   30   50
2       a         145   23   45
3       b         78    45   56
4       b         45    43   23
5       c         344   23   56

I'd like to find the mean based on Compound name. What I have:

Norm_Table$Norm_Peak = (aggregate(data[[3]],by=list(Compound),FUN=normalization))

This is fine, but I have this code repeated 3 times, changing only the data[[x]] column number. Would lapply work here, or a for loop?

Axeman · Accepted Answer · 2016-01-27T16:20:33.380

3

A dplyr solution:

library(dplyr)
data %>%
  group_by(Compound) %>%
  summarize_each(funs(mean), -ExpNum)
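
In current dplyr releases `summarize_each()` and `funs()` are deprecated. A minimal sketch of the same grouped mean using `across()` (available from dplyr 1.0.0) might look like the following. It assumes the data frame is named `data`; it is rebuilt here from the sample rows in the question so the snippet runs on its own, and `result` is just an illustrative name.

```r
library(dplyr)

# Sample data copied from the question
data <- data.frame(
  ExpNum   = 1:5,
  Compound = c("a", "a", "b", "b", "c"),
  Peak     = c(100, 145, 78, 45, 344),
  Tau      = c(30, 23, 45, 43, 23),
  SS       = c(50, 45, 56, 23, 56)
)

# Mean of every non-grouping column except ExpNum, computed per Compound
result <- data %>%
  group_by(Compound) %>%
  summarise(across(-ExpNum, mean))

print(result)
```

The grouping column is left out of `across()` automatically, so only `Peak`, `Tau` and `SS` are averaged.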

edited Jan 27 '16 at 16:20

answered Jan 27 '16 at 16:15



OP may want to exclude the `ExpNum` from the solution. I think you can change this to do `summarize_each(funs(mean), -ExpNum)`. – steveb Jan 27 '16 at 16:19
Most likely yes, I agree! – Axeman Jan 27 '16 at 16:20
Can I include ExpNum in the `group_by`, so that it groups by two factors? If I only want columns 3-6, could I use data[,3:6] as well? – Ted Mosby Jan 27 '16 at 16:21
Yes, then you get `data %>% group_by(Compound, ExpNum) %>% summarize_each(funs(mean))` – Axeman Jan 27 '16 at 16:22
Sorry for all of the questions here. Does the funs() argument not like user-defined functions? And a more general question: what exactly is the %>%? I've been seeing it a lot lately. – Ted Mosby Jan 27 '16 at 16:24

`summarize_each(funs(normalization))` should work if it outputs one value per vector you put in. `%>%` is a pipe operator, see `?'%>%'`. Basically `f(x, y)` is equal to `x %>% f(y)`, which can be useful when chaining many functions. – Axeman Jan 27 '16 at 16:29
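
To illustrate the two points in the comment above, here is a small sketch. The body of `normalization` is a hypothetical stand-in (the asker's actual function is not shown), chosen only because it returns one value per vector. The grouped call uses `across()`, the current dplyr replacement for `summarize_each(funs(...))`, and the data frame reuses a few columns from the question's sample.

```r
library(dplyr)  # also makes the %>% pipe from magrittr available

# The pipe passes its left-hand side as the first argument of the right-hand call,
# so the nested call and the piped chain below compute the same thing.
x <- c(100, 145, 78)
nested <- round(mean(x), 1)
piped  <- x %>% mean() %>% round(1)
same   <- identical(nested, piped)

# Hypothetical custom summary: must return a single value per input vector
normalization <- function(v) mean(v) / max(v)

data <- data.frame(
  Compound = c("a", "a", "b", "b", "c"),
  Peak     = c(100, 145, 78, 45, 344),
  Tau      = c(30, 23, 45, 43, 23)
)

# Apply the custom function to each selected column within each Compound
normalized <- data %>%
  group_by(Compound) %>%
  summarise(across(c(Peak, Tau), normalization))

print(normalized)
```

A function that returns more than one value per group would not fit a one-row-per-group summary, which is the condition the comment describes.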
