I want to modify the output returned by the summary function (base R):

summary(mtcars)

This shows the standard summary statistics:

   am              gear            carb      
 Min.   :0.0000   Min.   :3.000   Min.   :1.000  
 1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
 Median :0.0000   Median :4.000   Median :2.000  
 Mean   :0.4062   Mean   :3.688   Mean   :2.812  
 3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
 Max.   :1.0000   Max.   :5.000   Max.   :8.000

Can I modify this output by removing the median and adding the count and standard deviation? Thanks.

jay.sf

1 Answer

Make a DIY summary using sapply.

sapply(mtcars[c("am", "gear", "carb")], function(x)
  c(min=min(x), quantile(x, c(.25, .75)), max=max(x), count=length(x), sd=sd(x)))
#               am       gear    carb
# min    0.0000000  3.0000000  1.0000
# 25%    0.0000000  3.0000000  2.0000
# 75%    1.0000000  4.0000000  4.0000
# max    1.0000000  5.0000000  8.0000
# count 32.0000000 32.0000000 32.0000
# sd     0.4989909  0.7378041  1.6152
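
Because sapply returns a single numeric matrix here, every cell shares one type, so rounding can only be applied to the result as a whole. A minimal sketch of that, assuming three decimals are wanted (the names `vars`, `res` and `res_rounded` are illustrative):

```r
# Build the same summary matrix as above, then round every cell at once.
vars <- c("am", "gear", "carb")
res <- sapply(mtcars[vars], function(x)
  c(min = min(x), quantile(x, c(.25, .75)), max = max(x),
    count = length(x), sd = sd(x)))

# round() works element-wise on the whole matrix; counts stay numeric (double).
res_rounded <- round(res, 3)
res_rounded
```

Note that the counts are still stored as doubles in the matrix, which is why they print with decimals whenever another row in the same column needs them.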

Alternatively, you may use lapply to customize individual columns, e.g. by rounding.

do.call(rbind, lapply(mtcars[c("am", "gear", "carb")], function(x)
  data.frame(min=min(x), q1=quantile(x, .25), q3=quantile(x, .75), max=max(x), 
             count=length(x), sd=round(sd(x), 3))))
#      min q1 q3 max count    sd
# am     0  0  1   1    32 0.499
# gear   3  3  4   5    32 0.738
# carb   1  2  4   8    32 1.615
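
If the same summary is needed repeatedly, the lapply approach could be wrapped in a small function. This is only a sketch: `custom_summary` and its `digits` argument are hypothetical names, not part of base R.

```r
# Hypothetical helper: one row per column of `df`, sd rounded to `digits`.
custom_summary <- function(df, digits = 3) {
  rows <- lapply(df, function(x)
    data.frame(min = min(x), q1 = quantile(x, .25), q3 = quantile(x, .75),
               max = max(x), count = length(x), sd = round(sd(x), digits)))
  do.call(rbind, rows)
}

out <- custom_summary(mtcars[c("am", "gear", "carb")])
out

# Each data frame column keeps its own type: count is integer, the rest numeric.
sapply(out, class)
```

Since a data frame stores each column separately, the count stays an integer and prints without decimals, unlike in the matrix returned by sapply.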
jay.sf
  • How can I control the decimals for the counts individually? – user7256821 Jan 30 '21 at 08:46
  • @user7256821 Since the result is a matrix that's only possible globally by wrapping a `round(..., 4)` overall. May I ask for what purpose you need counts as integers? – jay.sf Jan 30 '21 at 08:49
  • I am using these summary statistics alongside ggplot2 output and will display the summary as a visual table with the gridExtra package. – user7256821 Jan 30 '21 at 09:39
  • @user7256821 OK, I suggest asking in a new question how to pass that to `ggplot`. Be sure to make a [specific example](https://stackoverflow.com/a/5963610/6574038). – jay.sf Jan 30 '21 at 10:30
  • Will do that, but is it not possible to modify the standard summary output? – user7256821 Jan 30 '21 at 10:53
  • @user7256821 The thing is that `summary` calls a [method](https://astrostatistics.psu.edu/su07/R/html/methods/html/Methods.html), in this case `base:::summary.data.frame`. I think it is overkill to rewrite a method just to feed `ggplot` with specific data. – jay.sf Jan 30 '21 at 11:00
  • @user7256821 Try `do.call(rbind, lapply(mtcars[c("am", "gear", "carb")], function(x) data.frame(min=min(x), q1=quantile(x, .25), q3=quantile(x, .75), max=max(x), count=length(x), sd=sd(x))))`, which gives you count as integer. – jay.sf Jan 30 '21 at 11:05
  • To clarify the need: I wanted to control decimals individually. It does not make sense to show the count with 5 decimal places, so I wanted it without decimals and, say, the stdev with up to 3 decimals. – user7256821 Jan 30 '21 at 11:14
  • @user7256821 Just customize the code from the last comment, e.g. `sd=round(sd(x), 3)` – jay.sf Jan 30 '21 at 11:17
  • You can copy your final code above into your main reply. It finally served my needs, so I do not even need to ask a new question. – user7256821 Jan 30 '21 at 11:56
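
For display purposes, such as the visual table mentioned in the comments, per-column decimals can also be fixed by converting each column to text. A rough sketch using base R's sprintf; the names `tab` and `tab_txt` and the chosen formats are assumptions for illustration.

```r
vars <- c("am", "gear", "carb")
tab <- do.call(rbind, lapply(mtcars[vars], function(x)
  data.frame(min = min(x), max = max(x), count = length(x), sd = sd(x))))

# Convert each column to text with its own fixed number of decimals.
tab_txt <- data.frame(
  min   = sprintf("%.0f", tab$min),
  max   = sprintf("%.0f", tab$max),
  count = sprintf("%d", tab$count),   # integer, no decimals
  sd    = sprintf("%.3f", tab$sd),    # three decimals
  row.names = rownames(tab),
  stringsAsFactors = FALSE
)
tab_txt
```

The trade-off is that `tab_txt` holds character strings, so it is suitable for printing or rendering but not for further arithmetic.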