
For a better understanding of the inner workings and a simple path to further experimentation, I'm trying to implement (regularized) logistic regression in R.

One well-known way of implementing it is by solving the following problem:

$$ \arg \min_{\beta} \sum_i \left( -y_i (x_i' \beta) + \log\left(1 + e^{x_i' \beta}\right) \right) + \lambda \|\beta\| $$
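
For concreteness, a direct R transcription of this objective (a hypothetical sketch of what ``logistic_optimizer`` might look like; ``x``, ``y`` and ``lambda`` are assumed to exist, with $\|\beta\|$ taken as the L1 norm) would be:

# Hypothetical sketch; x (model matrix), y (0/1 response) and lambda
# are assumed to be defined elsewhere.
logistic_optimizer <- function(beta) {
  eta <- drop(x %*% beta)
  sum(-y * eta + log(1 + exp(eta))) + lambda * sum(abs(beta))
}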

However, for even moderately large values of $x_i' \beta$ the exponential overflows, returning an Inf value. One workaround I have found is to use Rmpfr::mpfr with added precision, i.e.,

x <- Rmpfr::mpfr(x, precBits = 106)  # 106-bit multiple-precision floats
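
For illustration, double precision overflows where multiple-precision arithmetic stays finite (assuming the Rmpfr package is installed):

log(1 + exp(710))                         # Inf: exp(710) overflows a double
library(Rmpfr)
log(1 + exp(mpfr(710, precBits = 106)))   # finite, approximately 710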

But this adds considerable performance overhead. On the other hand, the base implementations of glm or glmnet obtain the solution rather quickly. How do they manage it? What are the possible ways of avoiding the calculation of exp(x %*% beta)? Or is this achieved only by dropping down to another language, i.e., C or Fortran?

Note: I'm trying to do this with the Rsolnp::solnp Augmented Lagrangian optimizer, with certain constraints on the parameters. For simple regression problems it seems fine performance-wise, thanks to the ability to supply a gradient, but for logistic regression it might not be as good?

Rsolnp::solnp(pars = beta0,
              fun = logistic_optimizer,
              eqfun = constraint,
              ineqfun = positive_constraint,
              ineqLB = rep(0, length(beta0)),
              ineqUB = rep(Inf, length(beta0)),
              LB = rep(0, length(beta0)),
              UB = rep(Inf, length(beta0)))
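
For reference, hypothetical stand-ins for the constraint functions used above (purely illustrative; the real ``constraint`` and ``positive_constraint`` depend on the problem at hand):

# Purely illustrative placeholders, not the actual constraints:
constraint <- function(beta) sum(beta) - 1   # e.g. force coefficients to sum to 1
positive_constraint <- function(beta) beta   # componentwise, bounded by ineqLB/ineqUB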

It would be nice to know if there are more efficient ways to solve this manually, without falling back on known libraries such as glm or glmnet, since I want full control over the logistic_optimizer expression.

  • Just a comment: if you aren't doing that yet, you may want to try standardizing your data first. – Julius Vainora Oct 29 '18 at 12:31
  • This helps in most cases, I just thought that, unless ``glm`` is doing that somewhere inside the code, there might be a solution without standardization. ``glmnet`` with ``standardize = FALSE`` also doesn't break. – runr Oct 29 '18 at 12:42
  • Anyway, in both cases I'm getting a ``Problem Inverting Hessian``, even though ``det(cov(x))`` is not bad and ``glm`` handles it fine. I'll probably have to ask a separate question about the Hessian. Thanks anyway! – runr Oct 29 '18 at 12:44

1 Answer


I suggest checking out the function optim {stats}. See this post for a LASSO implementation:

How can I implement lasso in R using optim function

and for a GLM implementation:

Applying Cost Functions in R

In addition, optim is written partly in Mortran (an extension of Fortran), so it will not be easy to modify the optim internals for custom functions unless you are a Fortran wizard.
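
That said, for a smooth penalty you can stay entirely in R. Below is a minimal sketch (not the glm/glmnet algorithm) of a ridge-penalized logistic regression fitted with optim, using the numerically stable form of log(1 + e^eta) and an analytic gradient; ``x``, ``y`` and ``lambda`` are assumed to exist, and the L1 case would additionally need the split-variable trick from the linked LASSO post:

# Minimal sketch, assuming x (model matrix), y (0/1 response), lambda >= 0.
nll <- function(beta) {
  eta <- drop(x %*% beta)
  # Stable log(1 + exp(eta)): for eta > 0, rewrite as eta + log1p(exp(-eta))
  lse <- ifelse(eta > 0, eta + log1p(exp(-eta)), log1p(exp(eta)))
  sum(-y * eta + lse) + lambda * sum(beta^2)   # ridge penalty keeps it smooth
}
gr <- function(beta) {
  p <- plogis(drop(x %*% beta))                # numerically stable sigmoid
  drop(crossprod(x, p - y)) + 2 * lambda * beta
}
fit <- optim(par = rep(0, ncol(x)), fn = nll, gr = gr, method = "BFGS")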

  • Thanks, this is useful. Still, for the LASSO part, can we actually do it with ``optim``? As far as I know, due to the non-differentiability of the LASSO penalty, we need to split ``beta`` into ``beta_pos = (beta + abs(beta)) / 2`` and ``beta_neg = (beta - abs(beta)) / 2`` to ensure differentiability, which requires additional constraints. I didn't find such constraints in ``optim``. – runr Oct 29 '18 at 13:02