I have successfully created a table of means and standard deviations, but now I need to split the results for each variable into two groups, in this case by gender.

library(stargazer)

cols <- c('edad', 'experiencia', 'indigena', 'mestizo', 'blanco', 'años_educ')

# Summary statistics for the 2009 rows only
stargazer(base[which(base$año == "2009"), cols], type = "text",
   summary.stat = c("min", "p25", "median", "p75", "max", "mean", "sd"))

This code produces the well-formatted table below, which also contains other statistics. Now I want to split the results by group using more or less the same code. How can I achieve this?

============================================================
Statistic   Min Pctl(25) Median Pctl(75) Max  Mean  St. Dev.
------------------------------------------------------------ 
edad         0     13      25      47    99  30.701  21.997 
experiencia  0     2       8       20    80  12.924  14.222 
indigena     0     0       0       0      1  0.080   0.271  
mestizo      0     1       1       1      1  0.814   0.389  
blanco       0     0       0       0      1  0.053   0.224  
años_educ    0     5       7       12    21  8.423   4.563  
------------------------------------------------------------
    Welcome to Stack Overflow! Could you make your problem reproducible by sharing a sample of your data so others can help (please do not use `str()`, `head()` or screenshot)? You can use the [`reprex`](https://reprex.tidyverse.org/articles/articles/magic-reprex.html) and [`datapasta`](https://cran.r-project.org/web/packages/datapasta/vignettes/how-to-datapasta.html) packages to assist you with that. See also [Help me Help you](https://speakerdeck.com/jennybc/reprex-help-me-help-you?slide=5) & [How to make a great R reproducible example?](https://stackoverflow.com/q/5963269) – Tung Oct 10 '18 at 23:54

1 Answer

Probably the simplest way to do this kind of grouped operation is with dplyr.

# install.packages("dplyr")
library(dplyr)

data <- tibble(
  grp = rep(c("M", "F"), 5), # gender column
  value = runif(10, 5, 10)
)

data %>% 
    group_by(grp) %>% # our group
    summarise( # summarise operation by group
        mean = mean(value),
        std = sd(value)
    )
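
The same pattern may extend to several columns and statistics at once, which is closer to the table in the question. The following is only a sketch on toy data: `base_toy` is a made-up stand-in for the asker's `base`, and `sexo` is an assumed name for the gender column, which the question does not show. It relies on `across()`, available in dplyr 1.0.0 and later.

```r
library(dplyr)

set.seed(42)  # make the toy data reproducible

# Toy stand-in for the asker's data; `sexo` is an assumed column name.
base_toy <- tibble(
  sexo        = rep(c("M", "F"), each = 5),
  edad        = sample(18:65, 10, replace = TRUE),
  experiencia = sample(0:30, 10, replace = TRUE)
)

# Columns to summarise (a subset of the question's `cols`)
cols <- c("edad", "experiencia")

by_gender <- base_toy %>%
  group_by(sexo) %>%                                    # one row per gender
  summarise(
    across(all_of(cols), list(mean = mean, sd = sd)),   # each stat for each column
    .groups = "drop"
  )

print(by_gender)
```

The result has one row per group and one column per variable/statistic pair (for example `edad_mean` and `edad_sd`). More statistics such as `min`, `median` or `max` can be added to the list passed to `across()`.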
JohnCoene