I'm trying to fit a logistic regression using L-BFGS in R. Here is my dataset (390 observations of 14 variables; Y is the target variable):
GEST DILATE EFFACE CONSIS CONTR MEMBRAN AGE STRAT GRAVID PARIT DIAB TRANSF GEMEL Y
31 3 100 3 1 2 26 3 1 0 2 2 1 1
28 8 0 3 1 2 25 3 1 0 2 1 2 1
31 3 100 3 2 2 28 3 2 0 2 1 1 1
...
The dataset is available here: http://tutoriels-data-mining.blogspot.fr/2008/04/rgression-logistique-binaire.html under "Données : prematures.xls". Y is a column I created in Excel from the "PREMATURE" column: Y=IF(PREMATURE="positif";1;0)
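For reference, the same recoding could also be done in R rather than Excel. This is only a minimal sketch: `premature_demo` is a hypothetical stand-in for the imported sheet, and the "negatif" label for the other class is an assumption, not something taken from the file.

```r
# Hypothetical stand-in for the imported sheet (NOT the real data);
# the "negatif" label is assumed for illustration.
premature_demo <- data.frame(PREMATURE = c("positif", "negatif", "positif"))

# Same rule as the Excel formula: 1 when PREMATURE is "positif", 0 otherwise
premature_demo$Y <- ifelse(premature_demo$PREMATURE == "positif", 1, 0)

print(premature_demo$Y)
```

Doing the recoding in R keeps Y numeric from the start, which matters later because the likelihood code multiplies by vY.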
I tried to use the optimx package as described here: https://stats.stackexchange.com/questions/17436/logistic-regression-with-lbfgs-solver. Here is the code:
install.packages("optimx")
library(optimx)
vY = as.matrix(premature['Y'])
mX = as.matrix(premature[c('GEST','DILATE','EFFACE','CONSIS','CONTR','MEMBRAN','AGE','STRAT','GRAVID','PARIT','DIAB','TRANSF','GEMEL')])
#add an intercept to the predictor variables
mX = cbind(rep(1, nrow(mX)), mX)
#the number of variables and observations
iK = ncol(mX)
iN = nrow(mX)
#define the logistic transformation
logit = function(mX, vBeta) {
  return(exp(mX %*% vBeta) / (1 + exp(mX %*% vBeta)))
}
# stable parametrisation of the log-likelihood function
logLikelihoodLogitStable = function(vBeta, mX, vY) {
  return(-sum(
    vY * (mX %*% vBeta - log(1 + exp(mX %*% vBeta)))
    + (1 - vY) * (-log(1 + exp(mX %*% vBeta)))
  ))
}
# score function
likelihoodScore = function(vBeta, mX, vY) {
  return(t(mX) %*% (logit(mX, vBeta) - vY))
}
# initial set of parameters (arbitrary starting parameters)
vBeta0 = c(10, -0.1, -0.3, 0.001, 0.01, 0.01, 0.001, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01)
optimLogitLBFGS = optimx(vBeta0, logLikelihoodLogitStable, method = 'L-BFGS-B', gr = likelihoodScore, mX = mX, vY = vY, hessian = TRUE)
Here is the error:
Error in optimx.check(par, optcfg$ufn, optcfg$ugr, optcfg$uhess, lower, : Cannot evaluate function at initial parameters
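Since the message says the function cannot be evaluated at the initial parameters, one way to narrow this down is to call the objective by hand at the starting values before involving optimx. The sketch below is only an assumption about how such a check could look: `toy` is made-up data with two of the predictors (not the real 390 rows), and the list of checks is a guess at likely causes (non-numeric columns, missing values, a length mismatch), not a confirmed diagnosis.

```r
# Made-up stand-in for `premature` (NOT the real dataset), two predictors only
toy <- data.frame(GEST = c(31, 28, 31, 33),
                  DILATE = c(3, 8, 3, 2),
                  Y = c(1, 1, 1, 0))

mX <- cbind(1, as.matrix(toy[c("GEST", "DILATE")]))  # intercept + predictors
vY <- as.matrix(toy["Y"])
vBeta0 <- c(10, -0.1, -0.3)                          # one start value per column of mX

# Same objective as in the question
logLikelihoodLogitStable <- function(vBeta, mX, vY) {
  -sum(vY * (mX %*% vBeta - log(1 + exp(mX %*% vBeta)))
       + (1 - vY) * (-log(1 + exp(mX %*% vBeta))))
}

# 1. Are the inputs numeric, complete, and dimensionally consistent?
checks <- c(
  x_numeric  = is.numeric(mX),
  y_numeric  = is.numeric(vY),
  no_missing = !anyNA(mX) && !anyNA(vY),
  dims_match = ncol(mX) == length(vBeta0)
)
print(checks)

# 2. Does the objective return one finite number at the start values?
f0 <- tryCatch(
  logLikelihoodLogitStable(vBeta0, mX, vY),
  error = function(e) {
    message("objective failed: ", conditionMessage(e))
    NA_real_
  }
)
print(f0)
print(is.finite(f0))
```

Running the same two steps on the real mX, vY and vBeta0 should show whether the objective errors out, returns NA, or returns an infinite value, which would point to where the problem lies.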