I am fitting my first regression in R and running into what seems to be a common error: "variable lengths differ".
After some testing, I found that this error occurred because I passed character vectors containing my variable names to lm(), rather than writing the variable names directly in the formula. So I would have something like:
control_vars <- c("age","gender")
lm(dependent_var ~ control_vars, data = fictive_data)
This throws an error. Changing it to the following solves the issue:
lm(dependent_var ~ age + gender, data = fictive_data)
So is there any way to use name vectors inside regression models in R while avoiding this error?
Below is a reproducible example of my issue.
Thanks!
tdf <- data.frame(
  a = c(1, 4, 3, 5, 3, 3),
  b = c(1, 2, 4, 6, 2, 2),
  c = c(1, 3, 6, 3, 2, 1)
)

# this works
test_model0 <- lm(a ~ b + c, data = tdf)

# this doesn't
iv_1 <- c("b", "c")
test_model1 <- lm(a ~ iv_1, data = tdf)

# neither does this
iv_2 <- "b + c"
test_model2 <- lm(a ~ iv_2, data = tdf)
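
A possible workaround, offered as a sketch rather than a confirmed best practice: the formula appears to treat `iv_1` as a single variable whose length (2) does not match the six rows of `a`, so the name vector may need to be converted into a formula before it reaches lm(). The sketch below assumes base R's `reformulate()` and `as.formula()` from the stats package; the object names `f1`, `f2`, `fit_reformulate` and `fit_paste` are made up for illustration.

```r
# Same toy data as in the reproducible example
tdf <- data.frame(
  a = c(1, 4, 3, 5, 3, 3),
  b = c(1, 2, 4, 6, 2, 2),
  c = c(1, 3, 6, 3, 2, 1)
)

iv_1 <- c("b", "c")

# Option 1: reformulate() builds the formula a ~ b + c from the term labels
f1 <- reformulate(iv_1, response = "a")
fit_reformulate <- lm(f1, data = tdf)

# Option 2: paste the names into one string and convert it to a formula
f2 <- as.formula(paste("a ~", paste(iv_1, collapse = " + ")))
fit_paste <- lm(f2, data = tdf)
```

Both fits should give the same coefficients as the hand-written `lm(a ~ b + c, data = tdf)`. The `reformulate()` version avoids manual string pasting, which may be easier to maintain when the list of control variables changes.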