
My dataset has a dummy variable that divides the data into two groups. I would like to display the descriptive statistics for both groups next to each other using stargazer. Is this possible?

For example, if I take the mtcars data set, where the variable $am divides the data into two groups, how can I display one group on the left side and the other group on the right side?

Thank you!

I was able to display the two statistics below each other (I had to make two separate datasets for each group), but never next to each other.

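library(stargazer)

# split mtcars into one data frame per level of am
treated <- mtcars[mtcars$am == 1, ]
control <- mtcars[mtcars$am == 0, ]

# passing both data frames prints two summary tables, one below the other
stargazer(treated, control, keep = c("mpg", "cyl", "disp", "hp"),
          header = FALSE, title = "Descriptive statistics", digits = 1, type = "text")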


  • Welcome to SO. Can you make your post [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by providing your data (or example data) to go with your code? – jrcalabrese Nov 28 '22 at 17:19
  • I have edited my post and added a reproducible example. – loveafter8 Nov 29 '22 at 08:29

2 Answers

Someone should point out if I'm mistaken, but I don't believe that stargazer will allow for the kind of nested tables you are looking for. However, there are other packages like modelsummary, gtsummary, and flextable that can produce tables similar to stargazer. I have included examples below using select mtcars variables summarized by am. Personally, I prefer gtsummary due to its flexibility.

library(tidyverse)
data(mtcars)

### modelsummary
# not great since it treats `cyl` as a continuous variable
# https://vincentarelbundock.github.io/modelsummary/articles/datasummary.html
library(modelsummary)
datasummary_balance(~am, data = mtcars, dinm = FALSE)

### gtsummary
# based on example 3 from here
# https://www.danieldsjoberg.com/gtsummary/reference/add_stat_label.html
library(gtsummary)
mtcars %>%
  select(am, mpg, cyl, disp, hp) %>%
  tbl_summary(
    by = am, 
    missing = "no",
    type = list(mpg ~ 'continuous2',
                cyl ~ 'categorical',
                disp ~ 'continuous2',
                hp ~ 'continuous2'),
    statistic = all_continuous2() ~ c("{mean} ({sd})", "{median}")
    ) %>%
  add_stat_label(label = c(mpg, disp, hp) ~ c("Mean (SD)", "Median")) %>%
  modify_footnote(everything() ~ NA)

### flextable
# this function only works on continuous vars, so I removed `cyl`
# https://davidgohel.github.io/flextable/reference/continuous_summary.html
library(flextable)
mtcars %>%
  select(am, mpg, disp, hp) %>%
  continuous_summary(
    by = "am",
    hide_grouplabel = FALSE,
    digits = 3
  )
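
If the table really has to come out of stargazer, one possible workaround is to compute the group statistics yourself and hand stargazer the finished data frame with `summary = FALSE`, so it prints the data frame as-is instead of summarizing it. This is only a minimal sketch: the names `vars`, `col_stat`, and `side_by_side` are my own, the choice of N, mean, and SD is an assumption about which statistics you want, and the column headers are flat (e.g. `Control.Mean`) rather than truly nested.

```r
# Variables to summarize (same selection as in the question)
vars <- c("mpg", "cyl", "disp", "hp")

# One data frame per level of am
treated <- mtcars[mtcars$am == 1, vars]
control <- mtcars[mtcars$am == 0, vars]

# Hypothetical helper: apply a statistic to every column, rounded to 1 digit
col_stat <- function(data, f) unname(round(vapply(data, f, numeric(1)), 1))

# One row per variable, one block of columns per group
side_by_side <- data.frame(
  Variable     = vars,
  Control.N    = nrow(control),
  Control.Mean = col_stat(control, mean),
  Control.SD   = col_stat(control, sd),
  Treated.N    = nrow(treated),
  Treated.Mean = col_stat(treated, mean),
  Treated.SD   = col_stat(treated, sd)
)

# summary = FALSE tells stargazer to print the data frame itself
if (requireNamespace("stargazer", quietly = TRUE)) {
  stargazer::stargazer(side_by_side, summary = FALSE, rownames = FALSE,
                       header = FALSE, digits = 1,
                       title = "Descriptive statistics by am", type = "text")
}
```

The trade-off is that you maintain the statistics by hand, which is why I would still reach for one of the packages above when a grouped header row matters.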
jrcalabrese

You can use the modelsummary package and its datasummary function, which offers a formula-based language to describe the specific table you need. (Disclaimer: I am the maintainer.)

In addition to the super flexible datasummary function, there are many other functions to summarize data in easier ways. See in particular the datasummary_balance() function here:

https://vincentarelbundock.github.io/modelsummary/articles/datasummary.html

library(modelsummary)
dat <- mtcars[, c("mpg", "cyl", "disp", "hp", "am")]
datasummary(
    All(dat) ~ Factor(am) * (N + Mean + SD + Min + Max),
    data = dat)


Vincent