
I'm using betareg from the betareg package for one of my projects. What are the requirements for using betareg in R? My explanatory variables are a mix of integer, character and numeric data, and my response variable is a proportion. I keep getting this error message:

Error in optim(par = start, fn = loglikfun, gr = gradfun, method = method, : non-finite value supplied by optim

Can anyone help with this? Sample dataset:

d <- structure(list(Col1 = c(2, 2, 5, 7), 
                    Col2 = c("ABC", "CBD", "BBD", "IOD"), 
                    Col3 = c("AB2", "CK5", "CD3", "JO9"), 
                    Col4 = c(121, 122, 2, 1), 
                    Col5 = c(NA, NA, "H2Y", "KIJ_H3I"), 
                    Col6 = c(30, 34, NA, NA), 
                    Col7 = c("Yes_NO(12:20)", "OK_oh(23:20)", "Yes_OKOK", "KJ_34"), 
                    Col8 = c(0.4, 0.25, 0.3, 0.6)), 
                    class = "data.frame", row.names = c(NA, -4L)
               )

The response variable is a proportion between 0 and 1.
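Since betareg expects every response value to lie strictly inside the open interval (0, 1), a quick check of the response column can rule out one possible cause before fitting. The following is a minimal sketch in base R; `check_beta_response` is a hypothetical helper written for illustration and is not part of the betareg package.

```r
# Hypothetical helper (not part of betareg): summarises whether a response
# vector is usable for beta regression, i.e. numeric, free of NA, and
# strictly between 0 and 1.
check_beta_response <- function(y) {
  stopifnot(is.numeric(y))
  list(
    n_missing   = sum(is.na(y)),                       # NA values
    n_at_bounds = sum(y <= 0 | y >= 1, na.rm = TRUE),  # values outside (0, 1)
    ok          = !anyNA(y) && all(y > 0 & y < 1)
  )
}

# The Col8 values from the sample dataset
res <- check_beta_response(c(0.4, 0.25, 0.3, 0.6))
str(res)
```

If `ok` is `FALSE` on the full data, the rows counted in `n_missing` or `n_at_bounds` would need to be handled before the model can be fitted.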

Code that generates this error:

library(betareg)

betareg(responsevariable ~ ., data = data,
        link = "logit", link.phi = "log", model = TRUE, type = "ML", weights = col1)
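For comparison, here is a minimal sketch of a call with the same `link`, `link.phi`, `type` and `weights` arguments that would be expected to fit without trouble. It uses simulated data, not the data from the question: the names `sim`, `x`, `g`, `w` and `y` and all parameter values are assumptions made for illustration. The fit is only attempted if the betareg package is installed.

```r
set.seed(1)
n <- 200

# Simulated predictors: one numeric, one categorical (stored as a factor),
# plus a separate weights column that is NOT used as a predictor.
sim <- data.frame(
  x = rnorm(n),
  g = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  w = sample(1:3, n, replace = TRUE)
)

# Simulated response: beta-distributed with mean mu and precision phi,
# so every value should fall strictly inside (0, 1).
mu  <- plogis(-0.5 + 0.8 * sim$x)
phi <- 20
sim$y <- rbeta(n, shape1 = mu * phi, shape2 = (1 - mu) * phi)

if (requireNamespace("betareg", quietly = TRUE)) {
  # Predictors are listed explicitly; `y ~ .` would also pull in `w`.
  fit <- betareg::betareg(
    y ~ x + g, data = sim, weights = w,
    link = "logit", link.phi = "log", type = "ML"
  )
  print(coef(fit))
}
```

Listing the predictors explicitly, instead of using `.`, keeps the weights column out of the set of explanatory variables.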
  • Hi, welcome to StackOverflow. The best approach to asking questions here is to provide a small dataset and accompanying code that lets people reproduce the error that you get. This makes it much easier to understand what the problem is, and how to solve it. Your problem may arise due to a multitude of reasons. Some instructions to create a small, reproducible set can be found here: https://stackoverflow.com/a/5963610/5805670 – slamballais May 15 '21 at 10:15
  • Hello here is the sample dataset – 123 456 May 15 '21 at 10:37
  • Col1: int 2 2 5 7 Col2: chr ABC CBD BBD IOD Col3: chr AB2 CK5 CD3 JO9 Col4: int 121 122 2 1 Col5: chr (blank) (blank) H2Y KIJ_H3I Col6: int 30 34 (blank) (blank) Col7: chr Yes_NO(12:20) OK_oh(23:20) Yes_OKOK KJ_34 Col8: num 0.4 0.25 0.3 0.6 – 123 456 May 15 '21 at 10:37
  • Alright, and can you add the code that you use that produces the error? – slamballais May 15 '21 at 10:50
  • betareg(responsevariable ~., data= data, link = "logit",link.phi = "log", model = T,type = "ML",weights = col1) – 123 456 May 15 '21 at 10:56
  • We can't really answer this yet. I assume `Col8` is supposed to be the response variable (and that the weights were `Col1`, capitalized). In order to get farther I had to use `Col8 ~ . - Col5`, because `Col5` has a factor that ends up with no levels. With that done, however, the data you've shown us will end up with only two rows being fed to the internal machinery (because the rows with `NA` values in `Col6` will be discarded). Even using `Col8 ~ . - Col5 - Col6` we end up with the same error, but now that's (presumably) because we're trying to fit a model with far more predictor variables than observations. – Ben Bolker May 16 '21 at 23:33
  • **tl;dr**; we need a *realistically large* data set to work with. (If you are indeed trying to fit this model to these data [4 data points with 12 internal predictor variables once dummies are constructed for the factor variables], you need to tell us why it is reasonable to expect that to work ...) – Ben Bolker May 16 '21 at 23:36
  • At the very least, can you tell us how many observations you have in total, and how many numeric/categorical predictors [ideally, approximately how many unique values the categorical predictors have] ? – Ben Bolker May 16 '21 at 23:45
  • I have around 500 observations and around 35 variables. I converted all the variables except for the weights to numeric, so many of the categorical predictors are now excluded because their values became NA. The code would not run with any column that contains NAs, so I excluded all of those columns. – 123 456 May 17 '21 at 11:11
  • I'm still thinking about how to include the categorical variables and the variables that contain NAs in this model – 123 456 May 17 '21 at 11:12
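
The checks discussed in the comments can be sketched in base R. The sketch below uses the sample data from the question, assumes (as in the comments) that `Col8` is the response, and shows three things: why coercing text columns to numeric produces `NA`, how to keep them as factors instead, and how to compare the number of usable rows with the number of design-matrix columns. The object names (`as_num`, `d_no5`, `d_sub`, `X`) are chosen for illustration only.

```r
# Sample data from the question
d <- data.frame(
  Col1 = c(2, 2, 5, 7),
  Col2 = c("ABC", "CBD", "BBD", "IOD"),
  Col3 = c("AB2", "CK5", "CD3", "JO9"),
  Col4 = c(121, 122, 2, 1),
  Col5 = c(NA, NA, "H2Y", "KIJ_H3I"),
  Col6 = c(30, 34, NA, NA),
  Col7 = c("Yes_NO(12:20)", "OK_oh(23:20)", "Yes_OKOK", "KJ_34"),
  Col8 = c(0.4, 0.25, 0.3, 0.6),
  stringsAsFactors = FALSE
)

# Coercing text to numeric gives NA for every non-numeric string,
# which is how categorical predictors end up "all NA".
as_num <- suppressWarnings(as.numeric(d$Col2))

# Convert character columns to factors instead, so that they enter
# the model as dummy variables.
is_chr <- vapply(d, is.character, logical(1))
d[is_chr] <- lapply(d[is_chr], factor)

# Rows containing any NA are dropped under the usual na.omit behaviour,
# so count how many complete rows remain.
n_complete_all <- sum(complete.cases(d))
d_no5 <- d[, setdiff(names(d), "Col5")]
n_complete_no5 <- sum(complete.cases(d_no5))

# Drop both NA-containing columns, then compare the number of rows with
# the number of design-matrix columns (intercept, numeric, and dummies).
d_sub <- d[, setdiff(names(d), c("Col5", "Col6"))]
X <- model.matrix(Col8 ~ ., data = d_sub)
c(complete_all = n_complete_all, complete_no5 = n_complete_no5,
  rows = nrow(X), columns = ncol(X))
```

On the sample data this should report zero complete rows with all columns, two once `Col5` is dropped, and a design matrix with 4 rows and 12 columns, consistent with the counts given in the comments. Running the same comparison on the full dataset of around 500 observations would show whether enough complete rows remain relative to the number of columns.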

0 Answers