I am programming in R, and while performing logistic regression I am getting this error:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

This is the code I am using. I have checked all my factors, and none of them has fewer than two levels.

c1<-"campaign_type"
c2<-"campaign_status"
c3<-"advertiser_cost"
output.var<-"Success"

names(train)
 [1] "campaign_type"         "campaign_status"       "connection_type"       "cpa_price"             "impressions"          
 [6] "clicks"                "conversions"           "advertiser_cost"       "cpa_revenue"           "profit"               
[11] "revenue_ecpm"          "cost_ecpm"             "profit_ecpm"           "ctr"                   "conversion_rate"      
[16] "click_conversion_rate" "margin"                "manager"               "sales_manager"         "Success"              

 > glm(output.var~c1+c2+c3,family=binomial('logit'),data=train)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels



> class(train[,c1])
[1] "factor"
> unique(train[,c1])
[1] CPA Optimized_CPM CPM          
Levels: CPA CPM Optimized_CPM
> class(train[,c2])
[1] "factor"
> unique(train[,c2])
[1] Launched Paused  
Levels: Launched Paused
> class(train[,c3])
[1] "numeric"
> class(train[,output.var])
[1] "integer"
> unique(train[,output.var])
[1] 0 1  

As I said, all my factors have 2 or more levels.

Can anyone tell me why I am getting this error?

Here is a link to the data: https://drive.google.com/file/d/0B-s59D9jsTcnVVppSlNQVE5PMGM/view?usp=sharing

Thanks

Basel.D
  • Without reproducible data, my first guess is that your larger data frame has factors with both levels, but your training (subset) data frame does not. Try `droplevels(train)` and then check your factors. – lmo Jan 10 '17 at 12:56
  • It's because `model.matrix` is erroring out: the population from which you took the `train` data had more levels, but `train` no longer has that many. – joel.wilson Jan 10 '17 at 12:56
  • Basically, let's say you had a `factor`/`character` variable with 3 levels `a1`, `a2`, `a3`. You then subsetted it to form `train`, but `train`'s version of this variable has just `a1`. So `model.matrix` errors out inside `glm()`. – joel.wilson Jan 10 '17 at 12:59
  • You can either follow what @lmo suggested, or set the `levels` of this variable to contain all 3 (`a1`, `a2`, `a3`) inside `train`. – joel.wilson Jan 10 '17 at 13:00
  • Do you have any `NA` values? We really need a reproducible example ... – Ben Bolker Jan 10 '17 at 13:25
  • @lmo @joel.wilson Thanks for answering. I have tried `droplevels(train)` and it gives the same result: `> Train<-droplevels(train)` `> unique(Train[,c1])` `[1] CPA Optimized_CPM CPM Levels: CPA CPM Optimized_CPM` `> unique(Train[,c2])` `[1] Launched Paused Levels: Launched Paused` – Basel.D Jan 10 '17 at 13:27
  • No, I have no `NA` values in my data. – Basel.D Jan 10 '17 at 13:35
  • Here is a link to the data: https://drive.google.com/file/d/0B-s59D9jsTcnVVppSlNQVE5PMGM/view?usp=sharing – Basel.D Jan 10 '17 at 13:55
  • Thanks, guys, for the answers. This solves the problem: `loop.vars<-c(char1,char2,char3)` `myform<-as.formula(paste(as.symbol(output.var), " ~ ", paste(loop.vars, collapse= "+")))` `glm(myform,family=binomial('logit'),data=train)` – Basel.D Jan 10 '17 at 14:15
  • Please try to format your code better next time and reward others for their answers or post your own answer if you solved the question by yourself :) – pat-s Jan 12 '17 at 20:11
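
For later readers, a minimal sketch of why the fix in the comments seems to work. It assumes the cause is that `output.var`, `c1`, `c2` and `c3` are not columns of `train`, so `glm()` picks up the length-one character variables themselves; each then becomes a factor with a single level. The toy data frame below is made up for illustration (it is not the linked data), and `reformulate()` from base R's `stats` package is used here as one alternative to the `paste()`/`as.formula()` approach above.

```r
# Toy stand-in for `train` (hypothetical values, NOT the asker's data)
set.seed(1)
train <- data.frame(
  campaign_type   = factor(sample(c("CPA", "CPM", "Optimized_CPM"), 200, replace = TRUE)),
  campaign_status = factor(sample(c("Launched", "Paused"), 200, replace = TRUE)),
  advertiser_cost = runif(200, 0, 100),
  Success         = rbinom(200, 1, 0.5)
)

# Column names stored as strings, as in the question
c1 <- "campaign_type"
c2 <- "campaign_status"
c3 <- "advertiser_cost"
output.var <- "Success"

# Original call: the formula refers to the string variables c1, c2, c3,
# not to the columns they name. tryCatch() captures the error message.
err <- tryCatch(
  glm(output.var ~ c1 + c2 + c3, family = binomial("logit"), data = train),
  error = function(e) conditionMessage(e)
)

# Build the formula from the strings so it names the actual columns:
# Success ~ campaign_type + campaign_status + advertiser_cost
myform <- reformulate(c(c1, c2, c3), response = output.var)

fit <- glm(myform, family = binomial("logit"), data = train)
```

Printing `myform` before fitting is a quick way to confirm that the formula contains the column names rather than `c1`, `c2`, `c3`.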

0 Answers