I get the following error from lm(), depending on which variables I include in the formula and the order in which I specify them:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
I've done a little research on this, and it looks like the usual cause is that the variable in question is not a factor, or is a factor with fewer than two levels. In this case, the variable (is_women_owned) is a factor with two levels ("No", "Yes"):
> levels(customer_accounts$is_women_owned)
[1] "No" "Yes"
No error:
f1 <- lm(combined_sales ~ is_women_owned, data=customer_accounts)
No error:
f2 <- lm(combined_sales ~ total_assets + market_value + total_empl + empl_growth + sic + city + revenue_growth + revenue + net_income + income_growth, data=customer_accounts)
Error when regressing on the formula above plus the factor variable "is_women_owned":
f3 <- lm(combined_sales ~ total_assets + market_value + total_empl + empl_growth + sic + city + revenue_growth + revenue + net_income + income_growth + is_women_owned, data=customer_accounts)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
I get the same error when applying stepwise linear regression, as you would expect.
This seems like a bug. I would expect lm() to return a model in which "is_women_owned" perhaps offers no additional explanatory value because it is highly correlated with the other variables, not to error out like this.
I verified that there is no missing data for this variable, too:
> which(is.na(customer_accounts$is_women_owned))
integer(0)
Also, there are two values present in the factor variable:
> customer_accounts$is_women_owned[1:20]
[1] No No No No No No No No No No No No No No Yes No
[17] No No No No
Levels: No Yes
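One thing the checks above do not rule out: they look at is_women_owned on its own, whereas lm() (with its default na.action, na.omit) drops every row that has an NA in any variable in the formula and then drops unused factor levels. If all the "Yes" rows happen to have a missing value in some other predictor, the factor would be left with a single level inside the model frame. Here is a minimal sketch of that situation. The data frame `toy` and its values are made up for illustration and are not my real customer_accounts data; whether this is what actually happens with my data is an assumption I have not confirmed.

```r
# Hypothetical toy data (NOT the real customer_accounts):
# every "Yes" row has a missing value in another predictor.
toy <- data.frame(
  combined_sales = c(10, 12, 9, 15, 11, 14),
  revenue        = c(1.0, 2.0, 1.5, NA, 2.5, NA),
  is_women_owned = factor(c("No", "No", "No", "Yes", "No", "Yes"))
)

# Checked on its own, the factor looks fine: no NAs, two levels.
sum(is.na(toy$is_women_owned))
nlevels(toy$is_women_owned)

# Rows that survive listwise deletion for the full formula.
vars <- c("combined_sales", "revenue", "is_women_owned")
kept <- toy[complete.cases(toy[, vars]), ]

# Number of levels left once unused levels are dropped,
# which is what lm() effectively sees.
remaining <- nlevels(droplevels(kept$is_women_owned))
remaining

# Fitting the full formula; try() captures the error instead of stopping.
fit <- try(
  lm(combined_sales ~ revenue + is_women_owned, data = toy),
  silent = TRUE
)
inherits(fit, "try-error")
```

The same check could be run on the real data by setting `vars` to the column names used in f3 and replacing `toy` with customer_accounts; running `colSums(is.na(customer_accounts[, vars]))` would also show which predictors contain the missing values.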