
I would like to summarize the data by group, reporting the min, max, and mean.

set.seed(55775)
x <- round(runif(150000, 1, 1000), 2)
g <- sample(LETTERS[1:4], 150000, replace = TRUE)

I know that tapply(x, g, summary) computes these statistics and gives the same values as the table below, but I don't know how to turn its output into a neat table like this one:

g   MIN    MAX     MEAN
A  1.06  999.94  500.5395
B  1.01  999.95  501.6863
C  1.01  999.99  503.8504
D  1.05  999.97  500.5327
user2884661
    Please [search SO and I am sure you will find nice answers you can adapt to your own data](http://stackoverflow.com/search?q=[r]+summary+statistics+per+group). [This is also a good starting point](http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega). Cheers. – Henrik Oct 30 '13 at 20:49
    [Here's another.](http://stackoverflow.com/questions/7449198/quick-elegant-way-to-construct-mean-variance-summary-table) – gung - Reinstate Monica Oct 30 '13 at 22:17

2 Answers


Since tapply returns a list in this case, you can just use do.call(rbind, ...) and extract the columns you are interested in:

do.call(rbind, tapply(x, g, summary))[, c("Min.", "Max.", "Mean")]
#   Min.   Max.  Mean
# A 1.06  999.9 500.5
# B 1.01 1000.0 501.7
# C 1.01 1000.0 503.9
# D 1.05 1000.0 500.5
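
The values shown above are rounded (999.9 rather than 999.94) because summary() rounds to a few significant digits, either when computing or when printing depending on the R version. If the unrounded figures from the question are wanted, one option is to skip summary() and compute the three statistics directly. The following is a minimal sketch of that idea, not part of the original answer; the helper name `stats` and the result name `res` are made up for illustration.

```r
set.seed(55775)
x <- round(runif(150000, 1, 1000), 2)
g <- sample(LETTERS[1:4], 150000, replace = TRUE)

# Hypothetical helper: the three statistics for one group's values
stats <- function(v) c(MIN = min(v), MAX = max(v), MEAN = mean(v))

# split() yields one numeric vector per group; sapply() returns a
# 3 x 4 matrix (statistics in rows), so t() puts the groups in rows
res <- t(sapply(split(x, g), stats))

# Print with enough digits to see the unrounded values
print(res, digits = 7)
```

Because `res` is an ordinary numeric matrix, individual entries can be pulled out by name, for example `res["A", "MEAN"]`.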
A5C1D2H2I1M1N2O1R2T1

You're almost there...

> t1 <- tapply(x, g, summary)
### sapply loops over the 4 items in list `t1` and extracts the wanted values;
### t() then transposes the result to match your example
> t2 <- t( sapply(seq_along(t1), function (i) t1[[i]][c("Min.", "Max.", "Mean")]) )
### rename rows and columns per your example:
> rownames(t2) <- names(t1)
> colnames(t2) <- c("MIN", "MAX", "MEAN")

giving:

> t2
   MIN    MAX  MEAN
A 1.06  999.9 500.5
B 1.01 1000.0 501.7
C 1.01 1000.0 503.9
D 1.05 1000.0 500.5

See ?format if you want to fine-tune the presentation any further.
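
As a rough illustration of that kind of fine-tuning, the sketch below uses formatC() (a close relative of format()) to fix the number of decimals per column. The matrix `m` is typed in by hand from the first two rows of the table in the question, and the names `m` and `fmt` are chosen here for illustration only.

```r
# Two rows copied from the question's table, as a small numeric matrix
m <- matrix(c(1.06, 999.94, 500.5395,
              1.01, 999.95, 501.6863),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("A", "B"), c("MIN", "MAX", "MEAN")))

# Format each column with a fixed number of decimals:
# two for MIN and MAX, four for MEAN
fmt <- data.frame(
  g    = rownames(m),
  MIN  = formatC(unname(m[, "MIN"]),  format = "f", digits = 2),
  MAX  = formatC(unname(m[, "MAX"]),  format = "f", digits = 2),
  MEAN = formatC(unname(m[, "MEAN"]), format = "f", digits = 4),
  stringsAsFactors = FALSE
)

# Suppress row names so that g appears as the first column
print(fmt, row.names = FALSE)
```

Note that the formatted columns are character strings, so this step is best left until the numbers are no longer needed for computation.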

dardisco