
Suppose I have the following data:

glucose=
structure(list(GR = c(1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 
2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L), glucose.1 = c(5.5, 4.77, 
5.52, 4.97, 4.4, 5.54, 4.85, 5.5, 5.5, 5.5, 5.09, 5.51, 5.5, 
5.5, 5.58, 5.58, 4.65, 5.5, 4.46, 5.43), glucose.2 = c(5.56, 
5.58, 5.58, 5.51, 5.5, 5.58, 5.5, 5.5, 5.52, 5.5, 5.49, 5.51, 
5.51, 5.56, 5.56, 5.5, 5.5, 5.58, 5.51, 5.53), glucose.3 = c(5.56, 
5.58, 5.58, 5.54, 5.57, 5.54, 5.53, 5.56, 5.51, 5.57, 5.54, 5.54, 
5.2, 5.26, 5.54, 5.55, 5.57, 5.25, 5.56, 5.54), glucose.4 = c(5.51, 
5.51, 5.53, 5.54, 5.52, 5.5, 5.51, 5.54, 4.99, 5.53, 5.51, 5.52, 
5.57, 5.54, 5.51, 5.58, 5.28, 5.51, 5.54, 5.54), glucose.5 = c(5.3, 
5.2, 5.51, 5.51, 5.51, 5.51, 5.51, 5.51, 5.4, 5.3, 5.51, 5.5, 
5.51, 5.55, 5.51, 5.51, 5.52, 5.51, 5.1, 5.42), glucose.6 = c(5.1, 
5.5, 5.45, 5.52, 5.32, 5.51, 5.45, 5.32, 5.57, 5.41, 5.54, 4.86, 
5.12, 5.54, 5.58, 5.32, 5.52, 5.04, 5.1, 5.5)), class = "data.frame", row.names = c(NA, 
-20L))

To calculate descriptive statistics, I can do this:

library(psych)
# the data frame defined above is called glucose
describeBy(glucose.1 ~ GR, data = glucose)

Result:
 Descriptive statistics by group 
GR: 1
   vars  n mean   sd median trimmed  mad min  max range  skew
X1    1 10 5.32 0.42    5.5     5.4 0.07 4.4 5.58  1.18 -1.31
   kurtosis   se
X1    -0.14 0.13
------------------------------------------------ 
GR: 2
   vars  n mean   sd median trimmed  mad  min  max range  skew
X1    1 10 5.17 0.39    5.3    5.21 0.33 4.46 5.54  1.08 -0.43
   kurtosis   se
X1     -1.5 0.12

But this means I have to run the command once for each variable, and I need the statistics for all variables at once, because there can be a large number of variables. Second, describeBy reports many statistics I don't need, such as the trimmed mean, but it does not give the ones I do need, for example the coefficient of variation (the standard deviation divided by the mean, expressed as a percentage).
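To make the coefficient of variation concrete, here is a minimal base R sketch that computes it per group for every variable in one call. The helper name `cv_pct` is hypothetical, and the small data frame `d` is only the first six rows and the first two glucose variables of the data above, so that the example runs on its own.

```r
# Coefficient of variation in percent: 100 * sd / mean.
# `cv_pct` is a hypothetical helper name, not a function from any package.
cv_pct <- function(x) 100 * sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)

# First six rows and two variables of the data above (assumption: a subset
# is enough for illustration; use the full `glucose` data frame in practice).
d <- data.frame(GR = c(1L, 2L, 1L, 2L, 1L, 2L),
                glucose.1 = c(5.5, 4.77, 5.52, 4.97, 4.4, 5.54),
                glucose.2 = c(5.56, 5.58, 5.58, 5.51, 5.5, 5.58))

# The dot on the left-hand side stands for every column except GR,
# so all variables are handled in a single call.
cv_by_group <- aggregate(. ~ GR, data = d, FUN = cv_pct)
cv_by_group
```

The result has one row per group and one column per variable, so it scales to many variables without repeating the command.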

So this is the question I really need help with: how do I calculate the following statistics separately for each group, for all variables?

Count of observations
Mean
Median
Minimum
Maximum
25th percentile
75th percentile
Standard deviation
Coefficient of variation (%)

The output I need should look something like this (I made this example by hand, and not from the dput() data above, but that is not important; I only need this output structure):

GR        Statistic           glucose 1  glucose 2  glucose 3  glucose 4  glucose 5
gr=1      count of obs        33         33         31         31
(N = 33)  Mean                26,36      30,27      26,55      28,48
          Median              24         24         22         22
          Minimum             10         10         11         11
          Maximum             48         173        73         94
          25 percentile       32         35         33         30,5
          75 percentile       20         20         19         18,5
          Stdev               9,71       27,56      13,1       18,29
          coef variation( %)  36,82      91,03      49,36      64,2
gr=2      count of obs        33         33         32         32
(N = 33)  Mean                23,85      29,21      23,34      25,34
          Median              24         22         22         20,5
          Minimum             11         11         11         10
          Maximum             41         152        49         76
          25 percentile       31         32         28         31,25
          75 percentile       17         20         16,75      14,75
          Stdev               8,81       24,05      9,31       15,07
          coef variation( %)  36,95      82,34      39,88      59,45

How can I get output with this structure? Thank you for your help.
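One possible way to get this layout, sketched in base R under a few assumptions: the helper name `stats_one` is hypothetical, the data frame `d` is only the first six rows and first two glucose variables of the data above (so that the example runs on its own), and percentiles use the default method of `quantile()`.

```r
# The nine requested statistics for one numeric vector.
# `stats_one` is a hypothetical helper name, not a function from any package.
stats_one <- function(x) {
  x <- x[!is.na(x)]
  c("count of obs"       = length(x),
    "Mean"               = mean(x),
    "Median"             = median(x),
    "Minimum"            = min(x),
    "Maximum"            = max(x),
    "25 percentile"      = unname(quantile(x, 0.25)),
    "75 percentile"      = unname(quantile(x, 0.75)),
    "Stdev"              = sd(x),
    "coef variation (%)" = 100 * sd(x) / mean(x))
}

# Subset of the data above (assumption: use the full `glucose` data frame in practice).
d <- data.frame(GR = c(1L, 2L, 1L, 2L, 1L, 2L),
                glucose.1 = c(5.5, 4.77, 5.52, 4.97, 4.4, 5.54),
                glucose.2 = c(5.56, 5.58, 5.58, 5.51, 5.5, 5.58))

vars <- setdiff(names(d), "GR")

# One matrix per group: statistics in rows, variables in columns.
pieces <- lapply(split(d[vars], d$GR), function(g) sapply(g, stats_one))

# Stack the groups, with the group and the statistic name as leading columns.
out <- do.call(rbind, lapply(names(pieces), function(k) {
  data.frame(GR = k, statistic = rownames(pieces[[k]]), round(pieces[[k]], 2))
}))
rownames(out) <- NULL
out
```

Because `vars` is derived from the column names, the same code covers any number of variables. Rounding to two decimals is done only for display and can be dropped.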

psysky
  • Does this answer your question? [How to get summary statistics by group](https://stackoverflow.com/questions/9847054/how-to-get-summary-statistics-by-group) – thehand0 Jan 16 '22 at 10:58
  • @Djoe I get what you are trying to do. If you want the summary parameters in rows rather than columns, you can do something like this: mysummary<-function(df){ grouping=df[1,1] %>% pull(); mean=df %>% summarize(across(contains("glucose"),~mean(.x))) %>% mutate(parameter="mean"); median=df %>% summarize(across(contains("glucose"),~median(.x))) %>% mutate(parameter="median"); bind_rows(mean,median) %>% mutate(group=grouping) %>% relocate(group,parameter, .before=glucose.1) }; df_summary<-glucose %>% group_split(GR) %>% map(mysummary) %>% bind_rows() – Joe Erinjeri Jan 16 '22 at 15:25
  • @JoeErinjeri Yes, it helps, but I need the statistics that I mentioned (count of rows, min, max, 25th and 75th percentiles, SD, coefficient of variation). How can I do that? – psysky Jan 17 '22 at 09:15
  • @thehand0 No, because it doesn't provide the needed statistics and format. – psysky Jan 17 '22 at 10:48
  • Hi @D.Joe, I would be happy to help you but unfortunately the question has been closed and I do not have the privileges to reopen it... So, let's wait for someone to open it again. Cheers. – lovalery Jan 17 '22 at 12:31
