
I am trying to run a negative binomial regression using the glmnet 4.0 package. I have implemented the regression using code from the section entitled 'Fitting Other GLMs' of this webpage. However, I keep getting the following error:

Error in seq.default(log(lambda_max), log(lambda_max * lambda.min.ratio), : 'from' must be a finite number

I haven’t been able to find examples of other people encountering this error. Perhaps it is specific to this new version of the package?

Below is an example that should reproduce the error. This is not the data from my analysis; it is for illustration only.

library(eventdataR)
library(glmnet)
library(MASS)

df <- subset(traffic_fines, activity == "Create Fine" | activity == "Add penalty" )
df <- df[,c(4,6,7,9,13,14,18)]
df$resource <- as.numeric(df$resource)
dfm <- as.matrix(df[,-3])

newfit <- glmnet(dfm, df$amount, family = negative.binomial(theta = 5))

Does anyone know why this error might be occurring and what I can do to stop it?

StupidWolf
Misc584
  • perhaps edit your second link above to get us to a document with pg 5. Otherwise not finding `negative.binomial` in the `glmnet` `glm` family tree. – Chris Jun 16 '20 at 14:05
  • Thanks for the pointer Chris. I have edited it to include an appropriate link. – Misc584 Jun 16 '20 at 15:10
  • so if you `debugonce(glmnet)` and step through with `n`, you'll hit your error. At each `n` you can `ls()` and check values. What I don't see is how/where `lambda_max` is set. `lambda.min.ratio` has a default set, but `seq.default` can't start with `NULL`, even as log. – Chris Jun 16 '20 at 15:43
  • plenty of stuff here [glmnet computes lambda_max](https://stackoverflow.com/questions/25257780/how-does-glmnet-compute-the-maximal-lambda-value), and normalized values. – Chris Jun 16 '20 at 15:53
  • OK, thanks a lot Chris! I'll have a look into those solutions. – Misc584 Jun 17 '20 at 12:37
  • I see the same issue with all R defined `family` objects - even `gaussian()`. If anyone has found an easy solution to this, I would love to understand what it is. – wdkrnls Dec 16 '20 at 16:16
  • The `gaussian()` family object does work for their example data. – wdkrnls Dec 16 '20 at 16:30
  • I've come across this issue when defining the family. I think the issue is that if any combination of independent variables are a linear combination of the intercept (i.e., 1) for all rows, then the error is encountered. I'm guessing lambda = 0 might be in the search space in those cases, which would make sense as to why the model failed (infinite solutions). I'm not 100% confident about this, though. – Max Candocia Mar 05 '21 at 00:29

1 Answer


In the example you provided, every row contains at least one NA, so there are no complete cases:

table(complete.cases(df))

FALSE 
14635
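A plausible reading, not confirmed in this thread, is that the missing values propagate into the computation of `lambda_max`, which would explain why `seq.default` complains that `'from'` is not finite. The base R sketch below shows how to locate the missing values before fitting. The `toy` data frame is made up for illustration and is not the `traffic_fines` data.

```r
# Toy data (hypothetical), standing in for the data frame from the question
toy <- data.frame(
  points   = c(0, 2, NA, 5),
  amount   = c(35, 36, 40, NA),
  resource = c(1, NA, 3, 4)
)

# How many rows are complete (TRUE) versus contain at least one NA (FALSE)?
print(table(complete.cases(toy)))

# Which columns are responsible for the missing values?
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Keep only the complete rows before building the model matrix
toy_complete <- toy[complete.cases(toy), ]
print(nrow(toy_complete))
```

Checking `colSums(is.na(...))` first helps decide whether to drop rows, as here, or to drop a column that is mostly missing.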

If we choose other columns and keep only the complete rows:

df <- subset(traffic_fines, activity == "Create Fine" | activity == "Add penalty" )
df <- df[,c("points","article","amount","resource")]
df = df[complete.cases(df),]
df$resource <- as.numeric(df$resource)
dfm <- as.matrix(df[,-3])

It runs:

newfit <- glmnet(dfm, df$amount, family = negative.binomial(theta = 5))

newfit

Call:  glmnet(x = dfm, y = df$amount, family = negative.binomial(theta = 5)) 

   Df  %Dev  Lambda
1   0  0.00 0.46180
2   1  8.23 0.42070
3   1 14.92 0.38340
4   1 20.42 0.34930
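To get a clearer message than the `seq.default` error, one option is to validate the inputs before calling `glmnet`. The helper below is a minimal sketch in base R; `check_design` is a hypothetical name and not part of glmnet, and the small matrices are made-up examples.

```r
# Hypothetical helper (not part of glmnet): stop early if x or y
# contains NA, NaN or Inf, and report which columns of x are affected.
check_design <- function(x, y) {
  stopifnot(is.matrix(x), is.numeric(x), is.numeric(y), nrow(x) == length(y))
  bad_cols <- which(colSums(!is.finite(x)) > 0)
  if (length(bad_cols) > 0) {
    stop("non-finite values in x column(s): ", paste(bad_cols, collapse = ", "))
  }
  if (any(!is.finite(y))) {
    stop("non-finite values in y")
  }
  invisible(TRUE)
}

# Made-up inputs: one clean matrix and one with a missing value
x_ok  <- cbind(a = c(1, 2, 3), b = c(4, 5, 6))
x_bad <- cbind(a = c(1, NA, 3), b = c(4, 5, 6))
y     <- c(10, 20, 30)

ok  <- check_design(x_ok, y)   # passes silently
msg <- tryCatch(check_design(x_bad, y), error = function(e) conditionMessage(e))
print(msg)
```

In the question's setting this would be called as `check_design(dfm, df$amount)` just before `glmnet(...)`. The `is.numeric(x)` check also catches the case where `as.matrix` produces a character matrix because a non-numeric column was left in the data frame.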
StupidWolf