
I am working on a dataset where my target variable Class has three categorical values: HIGH, LOW, and MEDIUM.

Now, when I apply ordinal logistic regression and run the polr command, it shows this error: "attempt to find suitable starting values failed". I think my target variable is not ordered. Can anybody tell me how to arrange its values as an ordered variable?

model <- polr(Class~., data= training, Hess = TRUE)

Error in polr(Class ~ ., data = training, Hess = TRUE) :
  attempt to find suitable starting values failed
In addition: Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred

Z.Lin
Unknown

2 Answers


Please provide a reproducible example. In any case, generating some data with an unordered dependent variable Class does not give me this error:

# load the package that provides polr()
library("MASS")

# a sample size of 30
n <- 30

# generating a factor of length n in which each level occurs n/3 times
Class <- factor(rep(c("HIGH", "LOW", "MEDIUM"), each= n/3))

# leaving it an unordered factor: the next line is commented out (not run)
# Class <- ordered(Class, levels= c("LOW", "MEDIUM", "HIGH")) 

# generating a data frame with two random variables
set.seed(1)
training <- data.frame(matrix(rnorm(2*n), ncol=2))

# adding the dependent variable Class to the data frame
training$Class <- Class

# running model
m <- polr(Class~., data= training, Hess = TRUE)

# look at coefficients and tests
library("AER") 
coeftest(m) 

This suggests that factor order is not the problem. Indeed, a Google search turned up similar errors in glm that are about convergence, not about factor order. This may make the question a duplicate. See, for example: Why am I getting "algorithm did not converge" and "fitted prob numerically 0 or 1" warnings with glm?
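To illustrate the kind of situation that produces these glm warnings, here is a minimal sketch with made-up data in which the predictor perfectly separates the two outcomes. The values of x and y are hypothetical and not taken from the question. As far as I can tell from the MASS source, polr gets its starting values from a binomial glm.fit on a two-way split of the response, which would explain why the warnings in the question mention glm.fit, but treat that as my reading rather than documented behaviour.

```r
# Toy data (hypothetical): every x below 4.5 has y = 0, every x above has y = 1,
# so the classes are perfectly separated by x
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(0, 0, 0, 0, 1, 1, 1, 1)

# Fit a logistic regression and collect the warning messages
# instead of letting them print to the console
msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# Inspect the collected warnings and the (very large) slope estimate
msgs
coef(fit)
```

If something similar happens in the real data, a single predictor (or a combination of predictors) may be determining Class almost exactly, and it would be worth checking the predictors one by one.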

  • Can you please explain these commands in detail? What exactly is happening? n <- 30; Class <- factor(rep(c("HIGH", "LOW", "MEDIUM"), each= n/3)); training <- data.frame(matrix(rnorm(2*n), ncol=2)); training$Class <- Class – Unknown Mar 31 '19 at 11:47
  • Because after these commands it only shows the summary of Class. I want to see the summary of all the variables, including their p-values, so that I can identify the non-significant ones. Please do reply as I really need help fast. – Unknown Mar 31 '19 at 11:49
  • I added some explanation to the code. The details of the code are not important, because the code is only supposed to show that the error is probably not due to factor order. –  Mar 31 '19 at 12:30
  • Your question was about the error which to my understanding is answered by the link provided and seemingly is not a problem of factor order. If you now have another question you should post it separately. –  Mar 31 '19 at 12:31
  • Thanks a lot for the help. One last question, if you can answer it: you took two random variables in training. What if I want to include all of my 18 variables? When I split my data into training and testing, the polr command does not run. – Unknown Mar 31 '19 at 18:05
  • In polr(Class~., data= training, Hess = TRUE) the "~." says that all variables apart from the dependent variable (here: Class) in the data (here: training) are used as predictors. Thus, if you want other variables as predictors, you can specify a data frame other than training. If this answers your question, consider marking or upvoting my answer. –  Apr 01 '19 at 06:42
  • No. Sorry, I didn't understand. – Unknown Apr 02 '19 at 07:05
  • If your dataframe (called df) contains only the dependent variable (called Class) and the independent variables (the names don't matter here), then the code would be: polr(Class~., data= df, Hess = TRUE). This code tries to fit Class to all other variables that are in dataframe df. If it is still unclear, please provide the name of your dataframe and the names of your variables by writing what you see if you type names(dataframe), where "dataframe" is the name of your dataframe. –  Apr 02 '19 at 07:10
  • Hi, these are all the variables in my dataframe: "Customer" "Customer.No" "Shop" "Invoice" "Quantity" "Sales" "Cash.Amt" "Credit.Card.Amt" "Net.Sales" "Mens.Wear" "Womens.Wear" "Kids.Wear" "Foot.Wear" "Fragrant" "Class" "Date" "Year" "Month", where Invoice, Quantity, Cash.Amt, Credit.Card.Amt, Womens.Wear, Kids.Wear, Foot.Wear, Fragrant, and Year are all int values. The remaining ones are factors. – Unknown Apr 02 '19 at 10:52
  • In this case polr(Class ~ Customer + Customer.No + Shop + Invoice + Quantity + Sales + Cash.Amt + Credit.Card.Amt + Net.Sales + Mens.Wear + Womens.Wear + Kids.Wear + Foot.Wear + Fragrant + Date + Year + Month, data= df, Hess= TRUE) is the same as polr(Class ~., data= df, Hess= TRUE), assuming the name of your dataframe is df. Note that Class itself does not appear on the right-hand side. –  Apr 02 '19 at 10:58
  • This command gives me an error. Can you please tell me what this error is and how to resolve it? polr(Class~., data= training, Hess = TRUE) -> reg Error: cannot allocate vector of size 181.9 Mb. I searched and it says I don't have enough RAM to run this command. Are there any methods to resolve this? – Unknown Apr 02 '19 at 11:09
  • I know they are both the same, but the problem is that when I try to run this command, polr(Class~. , data=df,Hess=TRUE), it keeps on running and never finishes. I waited for more than an hour and it was still running. Now if this command does not run, how am I going to proceed further? – Unknown Apr 02 '19 at 17:26
  • I'm sorry, I can't help here. All I can do is ask Google and so on, which you can do, too. I wish you good luck with the project. –  Apr 02 '19 at 20:15
  • It's okay. Thanks for the help. – Unknown Apr 03 '19 at 14:23
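For anyone following the comment thread above, here is a small sketch of what `Class ~ .` expands to and of how factor columns with many levels widen the design matrix. The data frame below is a hypothetical miniature that only borrows a few column names from the comments. Whether identifier-like factors such as Customer are really what exhausts the memory in the original data is an assumption on my part, but it is a cheap thing to check.

```r
# Hypothetical miniature of the data frame described in the comments
df <- data.frame(
  Customer = factor(paste0("C", 1:12)),               # identifier-like: one level per row
  Shop     = factor(rep(c("A", "B", "C"), times = 4)),
  Quantity = c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8),
  Class    = factor(rep(c("LOW", "MEDIUM", "HIGH"), each = 4),
                    levels = c("LOW", "MEDIUM", "HIGH"), ordered = TRUE)
)

# `Class ~ .` expands to every column except the response
predictors <- attr(terms(Class ~ ., data = df), "term.labels")
predictors

# Number of levels per predictor (0 for non-factors); with the default
# contrasts a factor adds (levels - 1) dummy columns to the design matrix
n_levels <- sapply(df[predictors], nlevels)
n_levels

# Width of the design matrix with and without the identifier-like column
ncol_all   <- ncol(model.matrix(Class ~ ., data = df))
ncol_small <- ncol(model.matrix(Class ~ . - Customer, data = df))
c(all = ncol_all, without_customer = ncol_small)
```

On the real data, running the `sapply(..., nlevels)` line first should show whether columns such as Customer, Customer.No, or Date have thousands of levels. If they do, dropping them from the formula with `- Customer` and so on is one option to try before fitting.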

I got the same error message and googled myself sore. I am rather a rookie in R. The solution was quite simple for me: I had made the mistake of giving my dependent variable labels (Likert scale labels), so it no longer contained any computable values. A look into the data set was enough to recognize this, but I did that very late. After I had read in the data without the labels, the model could be calculated. So if you get such an error message, you should perhaps first look at the data that was read in and that the model refers to, and make sure that it contains computable values.
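A minimal sketch of that kind of check, applied to a variable like Class from the question. The vector raw is made up, and I am assuming the labels arrive as plain text, which may differ from how they were stored in my case.

```r
# Hypothetical response values as they might look after reading in the data
raw <- c("HIGH", "LOW", "MEDIUM", "LOW", "HIGH", "MEDIUM")

# Look at what was actually read in before fitting anything
str(raw)
table(raw, useNA = "ifany")

# factor() without explicit levels sorts them alphabetically: HIGH, LOW, MEDIUM
default_levels <- levels(factor(raw))
default_levels

# Give the levels their substantive order and mark the factor as ordered
Class <- factor(raw, levels = c("LOW", "MEDIUM", "HIGH"), ordered = TRUE)

is.ordered(Class)   # should be TRUE
levels(Class)       # LOW < MEDIUM < HIGH
as.integer(Class)   # underlying integer codes used for the ordering
anyNA(Class)        # TRUE would mean some labels did not match a level
```

Labels that do not match any of the listed levels turn into NA, so `anyNA()` is a quick way to catch spelling or whitespace differences in the read-in data.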

Tollex