
I have a dataset with a range of IDs and activities, and many columns of observations for each combination of ID and activity. I'd like to take the average of each observation column within every ID/activity combination, but since there are hundreds and hundreds of observation columns, I'm unclear how to proceed.

Example data:

id,activity,obs1,obs2,obs3
1,1,325,6432,5432
1,2,321,214,2143
1,3,3652,123,123
2,1,5321,123,643
2,2,4312,4321,432
2,3,522,123,321
1,1,532,765,8976
1,2,142,865,5445
1,3,643,654,53
2,1,756,765,7865
2,2,876,654,976
2,3,6754,765,987

What I've tried so far:

library(dplyr)
example <- read.table("clipboard",sep=",",header=T)
group <- group_by(example,id,activity)
summarize(group, mobs1=mean(obs1), mobs2=mean(obs2), mobs3=mean(obs3))

This gets me the right shape, but how can I call summarize() without typing mobsN=mean(obsN) hundreds of times? I feel like an apply function belongs in here, but I'm not sure which one.

Jaap
AI52487963

1 Answer


This should give you the desired result:

library(dplyr)
means.wide <- example %>% 
  group_by(id,activity) %>% 
  summarise_each(funs(mean))
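
In more recent dplyr releases, summarise_each() and funs() are deprecated. Here is a sketch of the same idea with across(), assuming dplyr 1.0.0 or later. The inline read.csv(text = ...) copy of the sample data and the "m{.col}" naming pattern (which reproduces the mobsN names from the question) are illustrative choices, not part of the original answer:

```r
library(dplyr)

# Sample data from the question, inlined so the snippet is self-contained
example <- read.csv(text = "id,activity,obs1,obs2,obs3
1,1,325,6432,5432
1,2,321,214,2143
1,3,3652,123,123
2,1,5321,123,643
2,2,4312,4321,432
2,3,522,123,321
1,1,532,765,8976
1,2,142,865,5445
1,3,643,654,53
2,1,756,765,7865
2,2,876,654,976
2,3,6754,765,987")

# across(everything(), ...) covers every non-grouping column,
# so no obsN column has to be spelled out by hand
means.wide <- example %>%
  group_by(id, activity) %>%
  summarise(across(everything(), mean, .names = "m{.col}"), .groups = "drop")
```

Dropping the .names argument keeps the original column names (obs1, obs2, ...) instead of prefixing them.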

You could also convert example to long format and then calculate the means:

library(dplyr)
library(tidyr)

means.long <- example %>% 
  gather(obs, val, -c(id,activity)) %>% 
  group_by(id,activity,obs) %>% 
  summarise(mean_val=mean(val))
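
In current tidyr, gather() is superseded by pivot_longer(). Here is a sketch of the long-format approach with it, assuming tidyr 1.0.0 or later and dplyr 1.0.0 or later. The inlined sample data is only there to make the snippet runnable on its own:

```r
library(dplyr)
library(tidyr)

# Sample data from the question, inlined so the snippet is self-contained
example <- read.csv(text = "id,activity,obs1,obs2,obs3
1,1,325,6432,5432
1,2,321,214,2143
1,3,3652,123,123
2,1,5321,123,643
2,2,4312,4321,432
2,3,522,123,321
1,1,532,765,8976
1,2,142,865,5445
1,3,643,654,53
2,1,756,765,7865
2,2,876,654,976
2,3,6754,765,987")

# Stack all observation columns into (obs, val) pairs, then average per group
means.long <- example %>%
  pivot_longer(-c(id, activity), names_to = "obs", values_to = "val") %>%
  group_by(id, activity, obs) %>%
  summarise(mean_val = mean(val), .groups = "drop")
```

The result has one row per id/activity/obs combination, so 18 rows for the sample data.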

You could also do this with the data.table package:

# comparable to the wide dplyr version
library(data.table)
setDT(example)[, lapply(.SD, mean), by=list(id,activity)]

# comparable to the long dplyr version
library(data.table)
melt(setDT(example),id.vars=c("id","activity"))[, mean(value), by=list(id,activity,variable)]

And don't forget about good old base R:

aggregate(. ~ id + activity, example, FUN = mean)
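
For anyone who wants to try the base R version without the clipboard step, here is a reproducible sketch with the sample data inlined. The name means.base is just an illustrative choice:

```r
# Sample data from the question, inlined so the snippet is self-contained
example <- read.csv(text = "id,activity,obs1,obs2,obs3
1,1,325,6432,5432
1,2,321,214,2143
1,3,3652,123,123
2,1,5321,123,643
2,2,4312,4321,432
2,3,522,123,321
1,1,532,765,8976
1,2,142,865,5445
1,3,643,654,53
2,1,756,765,7865
2,2,876,654,976
2,3,6754,765,987")

# The dot on the left-hand side stands for every column
# that is not named on the right-hand side (id, activity)
means.base <- aggregate(. ~ id + activity, data = example, FUN = mean)
```

Because of the dot in the formula, this scales to hundreds of observation columns without listing any of them.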
Jaap
  • If you've used `gather`, you should just be left with three columns (id, obs and val) so can't you just use `summarise(mean_val = mean(val))`? Or you could use `summarise_each` without using `gather` first. – Nick Kennedy Jul 22 '15 at 19:33
  • @NickKennedy you're correct, I made a mistake; see the updated answer – Jaap Jul 22 '15 at 19:40