I know there is a lot on this topic, but I still could not figure it out myself. I have a dataset with various dependent (here: flux) and independent/ explanatory variables (expl). My goal is to iterate through all dependent with each explanatory variable using different models. In the end, I thought to pipe it through purr::glance(), so I could filter for significant p-values and compare the BICs of models.
Here comes a part of my dataset:
library(tidyverse)
library(broom)
dat <- structure(list(management = structure(c(1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", "B", "c"
), class = "factor"), flux1 = c(61.51, 73.3, 66.74, 117.75, 74.69,
74.12, 67.28, 78.67, 63.11, 82.74, 101.46, 80.48, 81.47, 56.76,
74.34), flux2 = c(0.72, 8.93, -0.2, 3.9, 0.05, 1.39, -0.69, 2.19,
3.31, 1.43, 8.57, 2.42, 2.56, 1.91, 10.41), flux3 = c(-52.97,
-59, -52.75, -44.95, -52.96, -62.68, -73.13, -120.33, -82.07,
-53.11, -36.01, -75.93, -84.01, -80.23, -127.61), expl1 = c(32.98,
76.13, 70.19, 106.29, 118.66, 102.61, 70.77, 45.93, 74.63, 90.48,
43.34, 98.7, 92.5, 73.68, 83.62), expl2 = c(8.9, 9, 9.03, 10.03,
9.75, 9.85, 9.15, 9.28, 8.97, 9.25, 9.72, 9.32, 9.65, 8.82, 9.1
), expl3 = c(10.12, 11.62, 11.18, 12.9, 14.05, 13.93, 10.8, 10.85,
10.62, 11.97, 11.03, 7.38, 11.55, 12.1, 7.9)), row.names = c(NA,
-15L), class = "data.frame")
So here is what I could get to work so far, inspired by this post:
a <- dat %>%
gather(key = "expl", value = "expl_value", c(expl1:expl3)) %>%
group_split(expl) %>%
map_df(~lm(log(flux1) ~ expl_value, data = .x) %>%
glance() %>%
mutate(expl_group = unique(.x$expl)))
I could get it to iterate through the explanatory variables but not the dependent variables. And, I could only make it work for one model. I further made a list of models, like they did here and tried to map through that list but could not get it to work:
models <- function(.x) {
linear <- lm(flux1 ~ expl_value, data = .x)
exponential <- lm(log(flux1 ) ~ expl_value, data = .x)
logarithmic <- lm(flux1 ~ log(expl_value), data = .x)
power <- lm(log(flux1) ~ log(expl_value), data = .x)
lst(linear, exponential, logarithmic, power)
}
Actually, I also want to expand that list of models by adding + management
as a factor, but I can figure that out later. I would appreciate your help and am also open to other suggestions on how to try out various different models and compare their BICs.
Edit: As Stefano pointed out my fluxes can include also negative values (not only flux3). So I would need something like if flux <= 0, then add min(flux) before log-transformation.
Cheers, Anike