In the dummy model below, I use stepAIC in the forward direction to select my predictor variables or interactions.
Is there a way to make sure the procedure cannot select both a variable and one of its interactions, whichever of the two is picked first?
For example:
- If the process first selects the interaction carat.fact:cut, then I don't want the variables carat.fact and cut to be picked up in the model afterward.
- Or, if the first pick is the variable carat.fact, then I don't want any of its interactions to be selected afterward.
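To make the rule concrete, here is a minimal sketch of the filter I have in mind, written in base R. The helper name allowed_terms is hypothetical (it is not part of MASS), it only works on term labels as strings, and it does not plug into stepAIC by itself. I am also assuming that two interactions sharing a variable are allowed to coexist, since the rule above only concerns a variable versus its own interactions.

```r
# Hypothetical helper (not a MASS function): given the candidate term labels
# and the terms already selected, return the candidates that are still allowed.
# A main effect conflicts with any interaction containing it, and vice versa.
allowed_terms <- function(candidates, selected) {
  # Terms already in the model are no longer candidates
  candidates <- setdiff(candidates, selected)

  # Split "a:b" into c("a", "b"); a main effect yields a single element
  split_vars <- function(term) strsplit(term, ":", fixed = TRUE)[[1]]

  conflicts <- function(a, b) {
    va <- split_vars(a)
    vb <- split_vars(b)
    # Only a main effect vs. an interaction containing it counts as a conflict
    (length(va) == 1 && length(vb) > 1 && va %in% vb) ||
      (length(vb) == 1 && length(va) > 1 && vb %in% va)
  }

  keep <- vapply(candidates, function(cand) {
    !any(vapply(selected, function(s) conflicts(cand, s), logical(1)))
  }, logical(1))

  unname(candidates[keep])
}

candidates <- c("cut", "carat.fact", "depth.fact",
                "cut:carat.fact", "cut:depth.fact", "carat.fact:depth.fact")

# Interaction picked first: its two main effects are no longer allowed
after_interaction <- allowed_terms(candidates, selected = "cut:carat.fact")

# Main effect picked first: none of its interactions are allowed
after_main <- allowed_terms(candidates, selected = "carat.fact")
```

Because the labels are split on ":", the order of the variables inside an interaction (carat.fact:cut versus cut:carat.fact) does not matter for the check.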
My dummy glm:
library(ggplot2)  # provides the diamonds dataset
library(dplyr)    # provides %>% and select()
library(MASS)
# Var to predict
diamonds$price.8k <- diamonds$price > 8000
# Cut the carat & depth
diamonds$carat.fact <- cut(diamonds$carat, breaks = c(-Inf, quantile(diamonds$carat, probs = c(0.3,0.7)), Inf))
diamonds$depth.fact <- cut(diamonds$depth, breaks = c(-Inf, quantile(diamonds$depth, probs = c(0.25, 0.75)), Inf))
# Select only vars (dplyr:: prefix because MASS also exports a select())
diamonds <- diamonds %>% dplyr::select(price.8k, cut, carat.fact, depth.fact)
# Conduct a logistic regression on the new binary variable
mod <- glm(formula = price.8k ~ 1, data = diamonds, family = binomial())
mod <- MASS::stepAIC(mod, direction = "forward",
                     scope = list(upper = price.8k ~ (cut + carat.fact + depth.fact)^2,
                                  lower = ~1))