I'm trying to create a Table 1 for NHANES survey data, stratified first by a binary obese vs. non-obese variable and then by a binary control/treatment group variable ("wlp_yn"). I want counts (%) for categorical characteristics and means (SE) for continuous baseline characteristics, along with p-values for both.

I've tried using tbl_svysummary(), svyby(), tbl_strata(), and CreateTableOne() without any success.

In the code below, I subset the full dataset into a smaller dataset of only control group data ("obese_adults") to divide up the table first. I am also starting out with age for the characteristics ("age_group" is categorical version of "RIDAGEYR" continuous variable). I couldn't figure it out, but I'm curious if there's another way to code this?

add_p_svysummary_ex1 <-
  obese_adults %>%
  tbl_svysummary(
    by = wlp_yn,
    percent = "row",
    include = c(age_group, RIDAGEYR),
    statistic = list(all_continuous() ~ "{mean} ({sd})")
  ) %>%
  add_p()
add_p_svysummary_ex1

svyby(~RIDAGEYR, ~age_group+wlp_yn, obese_adults, svymean) # avg age of each age group

Thanks in advance! Would really appreciate any help.

Edit: This is a simplified version of the code for reproducibility

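library(nhanesA)   # nhanes(), nhanesTranslate()
library(plyr)      # join_all(); loaded before dplyr so dplyr verbs are not masked
library(dplyr)
library(survey)    # svydesign(), svyby()
library(gtsummary) # tbl_svysummary(), add_p()

# DEMO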
demo <- nhanes('DEMO')
demo_vars <- names(demo)
demo2 <- nhanesTranslate('DEMO', demo_vars, data = demo)

# PRESCRIPTION MEDICATIONS
rxq_rx <- nhanes('RXQ_RX')
rxq_rx_vars <- names(rxq_rx)
rxq_rx2 <- nhanesTranslate('RXQ_RX', rxq_rx_vars, data = rxq_rx)
rxq_rx2 <- rxq_rx2 %>%
  select("SEQN", "RXD240B") %>%
  filter(!is.na(RXD240B)) %>%
  group_by(SEQN) %>%
  # collapse each participant's RXD240B values into one comma-separated string
  dplyr::summarise(across(everything(), ~toString(na.omit(.))))

nhanesAnalysis = join_all(list(demo2, rxq_rx2), by = "SEQN", type = "full")

# Reconstruct survey weights for the ten combined survey cycles (1999-2018, twenty years)
nhanesAnalysis$wtint20yr <- ifelse(
  nhanesAnalysis$SDDSRVYR %in% c(1, 2),
  2/10 * nhanesAnalysis$WTINT4YR,  # 1999-2002
  1/10 * nhanesAnalysis$WTINT2YR   # 2003-2018
)


# sample weights 
nhanesDesign <- svydesign(id      = ~SDMVPSU,
                          strata  = ~SDMVSTRA,
                          weights = ~wtint20yr,
                          nest    = TRUE,
                          data    = nhanesAnalysis)

# subset
obese_adults  <- subset(nhanesDesign, (obesity == 1 & !is.na(BMXBMI) & RIDAGEYR >= 60))
  • Can you make your post [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? If your data is from the [NHANES](https://cran.r-project.org/web/packages/NHANES/NHANES.pdf) package, can you include your code where you load the data and subset it into `obese_adults`? – jrcalabrese Feb 16 '23 at 18:30
  • @jrcalabrese I've edited to include a simplified version of the load and subset - thanks! – happytree12 Feb 19 '23 at 14:32

1 Answer

Is this what you are looking for? A double split on two dummy variables:

library(gtsummary)
library(tidyverse)

data(mtcars)

mtcars %>%
  select(am, cyl, hp, vs) %>%
  dplyr::mutate(
    vs = factor(vs, labels = c("Obese", "Non-Obese")),
    am = factor(am, labels = c("Control", "Treatment")),
    cyl = paste(cyl, "Cylinder")
  ) %>%
  tbl_strata(
    strata = vs,
    ~.x %>%
      tbl_summary(
        by = am,
        type = where(is.numeric) ~ "continuous"
      ) %>%
      modify_header(all_stat_cols() ~ "**{level}**")
  )

I'm not sure why you want to use tbl_svysummary() here; it's made for survey-weighted data.
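
If the data do carry survey weights, the same two-level layout can probably be kept by passing the survey design object to tbl_strata() and using tbl_svysummary() inside it. The following is only a rough sketch on synthetic data, not the NHANES design from the question: `toy`, `toy_design`, `psu`, `stratum`, `wt`, `obese` and `age` are made-up names, and only `wlp_yn` and `age_group` mirror the question. It assumes a gtsummary version that supports the `{mean.std.error}` statistic and the "svy.t.test" option in add_p(), which may need extra helper packages installed.

```r
library(survey)
library(dplyr)      # provides %>%
library(gtsummary)

# Synthetic stand-in for a weighted survey; all values are random and
# every column except wlp_yn and age_group is a hypothetical name.
set.seed(42)
n <- 400
toy <- data.frame(
  psu     = rep(1:40, each = 10),   # 40 clusters of 10 respondents
  stratum = rep(1:4, each = 100),   # 4 strata with 10 clusters each
  wt      = runif(n, 0.5, 2),       # made-up sampling weights
  obese   = factor(sample(c("Non-obese", "Obese"), n, replace = TRUE)),
  wlp_yn  = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  age     = round(runif(n, 60, 80))
)
toy$age_group <- cut(toy$age, breaks = c(59, 69, 80),
                     labels = c("60-69", "70-80"))

toy_design <- svydesign(id = ~psu, strata = ~stratum, weights = ~wt,
                        nest = TRUE, data = toy)

tbl <- toy_design %>%
  tbl_strata(
    strata = obese,                      # outer split: obese vs. non-obese
    .tbl_fun = ~ .x %>%
      tbl_svysummary(
        by = wlp_yn,                     # inner split: control vs. treatment
        include = c(age_group, age),
        statistic = list(
          all_continuous()  ~ "{mean} ({mean.std.error})",  # weighted mean (SE)
          all_categorical() ~ "{n} ({p}%)"                  # weighted n (%)
        )
      ) %>%
      # design-based t-test for means instead of the default rank test
      add_p(test = list(all_continuous() ~ "svy.t.test"))
  )
tbl
```

Requesting the mean explicitly in `statistic` is what replaces the default median (IQR) display. With the real data, `toy_design` would be swapped for the full design object and `obese` for the actual obesity indicator, rather than subsetting to one group first.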

Marco
  • I'm working with weighted survey data, which is why I was trying tbl_svysummary. Would this show the same? I'm also struggling with how to show the means of the characteristics instead of the medians, which I believe is what's shown here. Thanks in advance! – happytree12 Feb 19 '23 at 14:34
  • Did you try https://stackoverflow.com/questions/72079061/is-it-possible-to-create-a-stratified-table-tbl-strata-using-tbl-svysummary – Marco Feb 20 '23 at 13:06