I have a data frame with 2,946 observations and 600 variables.

I want to produce a table of univariate regression models for 599 variables from the dataset. To do this, I am using the tbl_uvregression() function from the 'gtsummary' package.

Here's my code:

library(gtsummary)

RAPOA_labelled[, -1] %>%   # remove the ID column
  tbl_uvregression(
    method = glm,
    y = GIR.2cat,          #dependent variable
    method.args = list(family = binomial),
    exponentiate = TRUE,
    pvalue_fun = ~style_pvalue(.x, digits = 3)
  ) %>%
  add_nevent() %>%         # add number of events of the outcome
  bold_p() %>%             # bold p-values under a given threshold (default 0.05)
  bold_labels()

Every time I run it, I get the following error:

Error: C stack usage 7971168 is too close to the limit.

My Cstack_info() is:

> Cstack_info()
   size    current   direction eval_depth 
7969177      12800           1          2 

EDIT

As the final output, I needed a table with the estimate, standard error, p-value, odds ratio, and confidence interval for each variable in the data frame. tbl_regression did not work for me, so in the end I did it with a loop.

I’ll leave the code here in case it serves anyone.

name <- colnames(datos_rapoa_gir[, -c(1:2)])  # drop the ID and outcome columns

# One data frame of results per predictor; collecting them in a list avoids
# repeatedly growing vectors with c() inside the loop.
results <- vector("list", length(name))

for (i in seq_along(name)) {
  mod_formula <- as.formula(sprintf("GIR.2cat ~ %s", name[i]))
  mod <- glm(formula = mod_formula, family = "binomial", data = datos_rapoa_gir, na.action = na.omit)

  coefs <- broom::tidy(mod)   # one row per term, intercept included (needs the broom package)
  ci <- exp(confint(mod))     # computed once; confidence limits on the odds-ratio scale

  results[[i]] <- data.frame(
    variable = coefs$term,
    B = coefs$estimate,
    SE = coefs$std.error,
    pvalue = coefs$p.value,
    OR = exp(coefs$estimate),
    LowIC = ci[, 1],
    HighIC = ci[, 2],
    row.names = NULL          # drop the term names that would otherwise become row names
  )
}

# Stack the per-variable results into a single table
univars <- do.call(rbind, results)
Vinícius Félix
mcamenc

1 Answer

A tbl_uvregression() object can become large (containing the full data, the model object, etc.), and it looks like your machine's memory can't handle it.

What do you want the output to look like? It sounds like you're going to end up with a table with 600 rows (likely more if you have categorical covariates).

Here are some steps you can take to reduce the size. tbl_uvregression() is a helper function that iterates over the columns in a data frame, builds a model for each column, calls tbl_regression() on each model, and combines all the tables with tbl_stack(). Rather than using the helper function, follow these steps yourself. After you call tbl_regression(), you can further reduce the size of the resulting object with tbl_butcher().
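
As a rough illustration of those steps, here is a minimal sketch. It assumes a recent version of gtsummary that provides tbl_butcher(), and it uses the built-in mtcars data with `am` as a stand-in binary outcome; the names `dat`, `outcome`, `predictors`, `tbls`, and `stacked` are hypothetical and not from the question. It is not guaranteed to avoid the C stack error with 600 columns, but it keeps each intermediate table as small as possible.

```r
library(gtsummary)

# Stand-in data (assumption): a few mtcars columns, with `am` as the binary outcome
dat <- mtcars[, c("am", "mpg", "hp", "wt")]
outcome <- "am"
predictors <- setdiff(names(dat), outcome)

# Fit one model per predictor, build its table, and shrink it straight away
tbls <- lapply(predictors, function(v) {
  mod <- glm(reformulate(v, response = outcome), family = binomial, data = dat)
  tbl <- tbl_regression(mod, exponentiate = TRUE)
  tbl_butcher(tbl)  # keep only the parts of the object needed for printing
})

# Combine the reduced tables into one
stacked <- tbl_stack(tbls)
```

Note that formatting helpers that rely on the stored model or data may stop working on a butchered table, so apply those to each tbl_regression() result before calling tbl_butcher().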

Happy Programming!

Daniel D. Sjoberg
  • Hello Daniel, I have a similar problem with `tbl_uvregression`: `Error: C stack usage 15923440 is too close to the limit`. So I ran two separate `tbl_uvregression` calls (with half of the dataset in each), which work without problems. When I try to stack (`tbl_stack(list(t1, t2))`) or merge (`tbl_merge(list(t1, t2))`) them, the C stack problem is back. Any clues? – B_slash_ Mar 03 '22 at 09:04
  • Did you use `tbl_butcher()` on the tbls before you tried stacking? Otherwise, I don't have a solution for you. – Daniel D. Sjoberg Mar 03 '22 at 12:49
  • Thanks for the suggestion: both tables are much smaller after butchering (124 to 45 Mb for 54 lines, and 105 to 39 Mb for 53 lines), but `tbl_stack` (or `tbl_merge`) still uses too much C stack... I don't know why. Can I expand the C stack? – B_slash_ Mar 03 '22 at 16:30
  • After applying tbl_butcher() to my tbl_summary() and tbl_uvregression() objects, they are well reduced. However, when I then attempt to tbl_merge() the two objects, I receive this error: `Error in UseMethod("slice") : no applicable method for 'slice' applied to an object of class "NULL"` – johnckane Jun 23 '22 at 04:24