I am trying out linear regression and get the error below even though every factor column has at least two levels.
I tracked the error down to a single column; here is the summary of that column:
> summary(df[,30])
    0     1  <NA>
31543    14     0
> unique(df[,30])
[1] 0 1
Levels: 0 1 <NA>
I have also removed every row that contains an NA value with the following:
df = na.omit(df)
Please note that the <NA> shown above is an additional factor level that I added using the addNA function.
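For anyone who wants to double-check the level counts, here is a minimal sketch of the kind of check I mean. It counts the levels actually observed in each factor column, which matters because lm drops unused levels when it builds the model frame. The data frame `toy` and its columns `y`, `a`, and `b` are made up for illustration and are not my real data.

```r
# Hypothetical toy data standing in for the real data frame
toy <- data.frame(
  y = c(1.2, 3.4, 2.2, 5.1),
  a = factor(c("0", "1", "0", "1")),
  # "z" is a declared level that never occurs in the data
  b = factor(c("x", "x", "x", "x"), levels = c("x", "z"))
)

# Pick out the factor columns
is_fac <- vapply(toy, is.factor, logical(1))

# Count the levels that remain once unused levels are dropped
observed_levels <- vapply(
  toy[is_fac],
  function(col) nlevels(droplevels(col)),
  integer(1)
)
print(observed_levels)

# Names of factor columns with fewer than two observed levels
single_level_cols <- names(observed_levels)[observed_levels < 2]
print(single_level_cols)
```

In this toy case `b` declares two levels but only one occurs, so it is the column that gets flagged.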
How do I resolve this?
EDIT: I have placed a reproducible example on my public share at http://aftabubuntu.cloudapp.net/ . Please download the reproduce.RDS file from there.
This is the code I'm using:
df = readRDS('reproduce.RDS')
model = lm(formula = COL_101~.,data=df)
predict.lm(model, df[1:5,])
This is my output:
> model = lm(formula = COL_101~.,data=df)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels