How can I evaluate a string stored in each row using dplyr::mutate?
I have been learning R for two months, and I am practicing using the tidyverse
to manipulate data and run statistics.
I am trying to run multiple linear regressions and get the p-value of each variable in each regression.
Here is a reproducible example:
library(tidyverse)
df <-
  tibble(serialNO = seq(1, 10, 1),
         lactate = c(1.3, 1.6, 2.6, 3.5, 1.2, 1.1, 3.6, 3, 1.9, 5.3),
         BMI = c(20, 27, 23, 25, 23, 23, 20, 24, 19, 23),
         Afib = c(0, 0, 1, 0, 0, 0, 1, 0, 0, 0),
         LVEF = c(65, 68, 61, 58, 57, 58, 25, 59, 66, 58))
# A tibble: 10 x 5
serialNO lactate BMI Afib LVEF
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1.3 20 0 65
2 2 1.6 27 0 68
3 3 2.6 23 1 61
4 4 3.5 25 0 58
5 5 1.2 23 0 57
6 6 1.1 23 0 58
7 7 3.6 20 1 25
8 8 3 24 0 59
9 9 1.9 19 0 66
10 10 5.3 23 0 58
The code for each multiple linear regression is stored as a string, one per row, like this:
reg_com <- c("lm(lactate~sex+BMI+Afib, data=df)",
             "lm(lactate~sex+BMI+LVEF, data=df)",
             "lm(lactate~sex+Afib+LVEF, data=df)",
             "lm(lactate~BMI+Afib+LVEF, data=df)")
# A tibble: 4 x 1
reg
<chr>
1 lm(lactate~sex+BMI+Afib, data=df)
2 lm(lactate~sex+BMI+LVEF, data=df)
3 lm(lactate~sex+Afib+LVEF, data=df)
4 lm(lactate~BMI+Afib+LVEF, data=df)
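For reference, here is a minimal base R sketch of what evaluating one such string involves. It uses only the fourth command, because that is the only one whose variables all exist in the posted df (there is no sex column in the sample data). The names cmd, fit and p_values are just illustrative.

```r
# Data copied from the question (only the columns the fourth model needs).
df <- data.frame(
  lactate = c(1.3, 1.6, 2.6, 3.5, 1.2, 1.1, 3.6, 3, 1.9, 5.3),
  BMI     = c(20, 27, 23, 25, 23, 23, 20, 24, 19, 23),
  Afib    = c(0, 0, 1, 0, 0, 0, 1, 0, 0, 0),
  LVEF    = c(65, 68, 61, 58, 57, 58, 25, 59, 66, 58)
)

cmd <- "lm(lactate~BMI+Afib+LVEF, data=df)"

# parse() turns the string into an unevaluated expression; eval() runs it.
fit <- eval(parse(text = cmd))

# The "Pr(>|t|)" column of the coefficient table holds the p-values.
p_values <- summary(fit)$coefficients[, "Pr(>|t|)"]
print(p_values)
```

The result is a named numeric vector with one p-value per term, including the intercept.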
The result I want looks like this:
# A tibble: 4 x 5
reg sex BMI Afib LVEF
<chr> <chr> <chr> <chr> <chr>
1 lm(lactate~sex+BMI+Afib, data=df) p p p NA
2 lm(lactate~sex+BMI+LVEF, data=df) p p NA p
3 lm(lactate~sex+Afib+LVEF, data=df) p NA p p
4 lm(lactate~BMI+Afib+LVEF, data=df) NA p p p
Each p in the tibble is the p-value of that variable in the corresponding linear regression.
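One possible way to build a table of this shape is sketched below. It is not a definitive solution: it assumes the dplyr, purrr, tidyr and broom packages are installed, and it adds a HYPOTHETICAL sex column with made-up values, because the posted df has no sex column and three of the four commands need one. The names reg_tbl and result are also illustrative.

```r
library(dplyr)
library(purrr)
library(tidyr)
library(broom)

df <- tibble::tibble(
  lactate = c(1.3, 1.6, 2.6, 3.5, 1.2, 1.1, 3.6, 3, 1.9, 5.3),
  BMI     = c(20, 27, 23, 25, 23, 23, 20, 24, 19, 23),
  Afib    = c(0, 0, 1, 0, 0, 0, 1, 0, 0, 0),
  LVEF    = c(65, 68, 61, 58, 57, 58, 25, 59, 66, 58),
  # HYPOTHETICAL values: the question's data has no sex column.
  sex     = c(0, 1, 0, 1, 1, 0, 1, 0, 1, 0)
)

reg_tbl <- tibble::tibble(
  reg = c("lm(lactate~sex+BMI+Afib, data=df)",
          "lm(lactate~sex+BMI+LVEF, data=df)",
          "lm(lactate~sex+Afib+LVEF, data=df)",
          "lm(lactate~BMI+Afib+LVEF, data=df)")
)

result <- reg_tbl %>%
  # Evaluate each string and tidy the fitted model into a coefficient table.
  mutate(coef = map(reg, function(cmd) tidy(eval(parse(text = cmd))))) %>%
  unnest(coef) %>%
  filter(term != "(Intercept)") %>%
  select(reg, term, p.value) %>%
  # One column per variable; variables absent from a model become NA.
  pivot_wider(names_from = term, values_from = p.value)

print(result)
```

Here the p-values stay numeric rather than character, and a variable that does not appear in a given model ends up as NA in that row.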
I have spent two full days on this. I tried using a for loop, but I get an error message:
reg_sum <- tibble(reg = as.character())
for(i in 1:length(reg_com)) {
  a <-
    df %>%
    print(eval(parse(text=paste0(",reg_com[i],")))) %>%
    tidy %>%
    select(term, p.value) %>%
    column_to_rownames(var = "term") %>% # prepare for transpose
    t %>%
    as_tibble %>%
    mutate(reg = reg_com[i])
  reg_sum <- full_join(reg_sum, a)
}
error: C stack usage 15923360 is too close to the limit
I am trying to do this because I need to perform more than 10k combinations of linear regressions.
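For that many combinations, the models could also be described as formula objects generated programmatically instead of hand-written strings. A small base R sketch, assuming the predictors are the four variables named above and that every model uses exactly three of them (predictors, combos, formulas and labels are illustrative names):

```r
predictors <- c("sex", "BMI", "Afib", "LVEF")

# All ways to choose 3 predictors out of 4, as a list of character vectors.
combos <- combn(predictors, 3, simplify = FALSE)

# reformulate() builds a formula such as lactate ~ sex + BMI + Afib.
formulas <- lapply(combos, reformulate, response = "lactate")

# Readable labels, useful as an identifier column for each model.
labels <- vapply(formulas, function(f) paste(deparse(f), collapse = ""), character(1))
print(labels)
```

Each element of formulas can be passed to lm() directly, so no parse() or eval() is needed with this approach.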
I want to do it using dplyr if possible. (It's so cool!)
Please help me!