I have a dataset with three variables: A, B, and C.

a 45 345
a 45 345
a 34 234
a 35 456
b 45 123
b 65 345
b 34 456
c 23 455
c 54 567
c 34 345
c 87 567
c 67 345

I want to aggregate the dataset by A and B and get both the count and the mean of C for each group. The desired output is shown below. Is there a function that does both together?

A   B  numobs   c
a   34  1      234  
a   35  1      456  
a   45  2      345  
b   34  1      456  
b   45  1      123  
b   65  1      345  
c   23  1      455  
c   34  1      345  
c   54  1      567  
c   67  1      345  
c   87  1      567

Here numobs is the count of rows in each group and c is the mean value of C.

suresh
  • Possible duplicate of [R Grouping functions: sapply vs. lapply vs. apply. vs. tapply vs. by vs. aggregate](http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega) – Hack-R Jun 19 '16 at 13:43

1 Answer

We can use dplyr. Use summarise() rather than mutate(), so that each (A, B) group collapses to a single row:

library(dplyr)
df1 %>%
   group_by(A, B) %>%
   summarise(numobs = n(), C = mean(C))

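Or with data.table. Returning a list in j (instead of assigning with :=) gives one row per group:

library(data.table)
# keyby groups like by, and also sorts the result by A and B
setDT(df1)[, .(numobs = .N, C = mean(C)), keyby = .(A, B)]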
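For a reproducible check, here is a minimal sketch that rebuilds the question's data and runs the dplyr version. The column names A, B, and C and the object names df1 and res are assumptions, since the question does not show how the data frame was created. The .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Rebuild the 12 rows from the question (column names A, B, C are assumed)
df1 <- data.frame(
  A = c("a", "a", "a", "a", "b", "b", "b", "c", "c", "c", "c", "c"),
  B = c(45, 45, 34, 35, 45, 65, 34, 23, 54, 34, 87, 67),
  C = c(345, 345, 234, 456, 123, 345, 456, 455, 567, 345, 567, 345)
)

# One row per (A, B) group, with the group size and the mean of C
res <- df1 %>%
  group_by(A, B) %>%
  summarise(numobs = n(), C = mean(C), .groups = "drop")

print(res)
```

The only group with more than one row is A = "a", B = 45, which should come out with numobs 2 and a mean of 345; every other group has numobs 1.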
akrun