I have a data frame with 2,946 observations and 600 variables.
I want to produce a table of univariate regression models for 599 variables from the dataset. To do this, I am using the tbl_uvregression() function from the 'gtsummary' package.
Here's my code:
RAPOA_labelled[, -1] %>% # remove ID column
  tbl_uvregression(
    method = glm,
    y = GIR.2cat, # dependent variable
    method.args = list(family = binomial),
    exponentiate = TRUE,
    pvalue_fun = ~ style_pvalue(.x, digits = 3)
  ) %>%
  add_nevent() %>% # add number of events of the outcome
  bold_p() %>% # bold p-values under a given threshold (default 0.05)
  bold_labels()
Every time I run it, I get the following error:
Error: C stack usage 7971168 is too close to the limit.
My Cstack_info() output is:
> Cstack_info()
size current direction eval_depth
7969177 12800 1 2
EDIT
As the final output, I needed a table with the estimate, standard error, p-value, odds ratio and confidence interval for each variable in the data frame. tbl_regression did not work for me, so in the end I did it with a loop.
I'll leave the code here in case it helps anyone.
name <- colnames(datos_rapoa_gir[,-c(1:2)]) # to remove ID and outcome columns
# start each result vector empty ({} also evaluates to NULL, but NULL is clearer)
term <- NULL
B <- NULL
SE <- NULL
pvalue <- NULL
OR <- NULL
lowIC <- NULL
highIC <- NULL
for (i in seq_along(name)) {
  mod_formula <- as.formula(sprintf("GIR.2cat ~ %s", name[i]))
  mod <- glm(formula = mod_formula, family = "binomial", data = datos_rapoa_gir, na.action = na.omit)
  res <- broom::tidy(mod) # tidy the model once and reuse it
  ci <- exp(confint(mod)) # confidence interval on the odds-ratio scale, computed once
  term <- c(term, res$term)
  B <- c(B, res$estimate)
  SE <- c(SE, res$std.error)
  pvalue <- c(pvalue, res$p.value)
  OR <- c(OR, exp(mod$coefficients))
  lowIC <- c(lowIC, ci[, 1])
  highIC <- c(highIC, ci[, 2])
}
univars <- data.frame(variable = term, B = B, SE = SE, pvalue = pvalue, OR = OR, LowIC = lowIC, HighIC = highIC) %>%
  tibble::remove_rownames() # remove_rownames() comes from the tibble package
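The same kind of table can probably be built more compactly by fitting each model inside a small function and stacking the results. This is only a minimal sketch, not a tested replacement for the loop above: the data frame `toy`, its columns `x1`, `x2`, `x3`, and the helper `fit_one()` are hypothetical names made up for illustration, and unlike the loop it drops the intercept rows. It relies on `broom::tidy()` with `conf.int = TRUE`, which returns the interval on the log-odds scale, so the odds ratio and its limits are exponentiated by hand.

```r
library(broom)
library(dplyr)

# Hypothetical toy data standing in for the real data frame
set.seed(1)
n <- 200
toy <- data.frame(
  id = seq_len(n),
  GIR.2cat = rbinom(n, 1, 0.4),
  x1 = rnorm(n),
  x2 = rnorm(n),
  x3 = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)

# Every column except the ID and the outcome is treated as a predictor
predictors <- setdiff(names(toy), c("id", "GIR.2cat"))

# Fit one univariate logistic model and return its coefficient table
fit_one <- function(predictor) {
  mod <- glm(as.formula(paste("GIR.2cat ~", predictor)),
             family = binomial, data = toy)
  tidy(mod, conf.int = TRUE) %>%        # estimates and CI on the log-odds scale
    filter(term != "(Intercept)") %>%   # keep only the predictor's own terms
    transmute(
      variable = predictor, term,
      B = estimate, SE = std.error, pvalue = p.value,
      OR = exp(estimate), LowIC = exp(conf.low), HighIC = exp(conf.high)
    )
}

# One row per model term, stacked across all predictors
univars <- bind_rows(lapply(predictors, fit_one))
print(univars)
```

With real data, `toy` would be replaced by the actual data frame, and column names that are not syntactically valid would need backticks in the formula.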