I have a stored logistic regression model. I want to remove a variable from its formula without refitting the model, so that the remaining coefficients stay as they are.

My model:

> class(lr)
[1] "glm" "lm" 

And the formula is:

> lr$formula
target ~ grupoAntig + nu_seguros_1TRUNC + cd_sexo + grupoEdad + 
    vl_limite_aeQU + vl_ltd_6QU + Revolv3 + nu_servicios_1TRUNC + 
    fl_cliente_hit + nu_resumen_6 + fl_rv

I want to remove fl_cliente_hit.

I did:

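# Term labels of the stored formula (one entry per predictor)
a <- attr(terms(lr$formula), "term.labels")
# Drop the unwanted variable by name instead of by position
a <- a[a != "fl_cliente_hit"]
a0 <- paste(a, collapse = " + ")

# Overwrite only the stored formula; the coefficients are not touched
lr$formula <- as.formula(paste0("target ~ ", a0))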

And I get the desired formula:

> lr$formula
target ~ grupoAntig + nu_seguros_1TRUNC + cd_sexo + grupoEdad + 
    vl_limite_aeQU + vl_ltd_6QU + Revolv3 + nu_servicios_1TRUNC + 
    nu_resumen_6 + fl_rv

But I'm not sure whether the predict function will use the new formula or the original one when I run it. In this case:

predict(lr, train, type=c("response"))

If it keeps using the original model, there must be a way to exclude one variable while keeping the other coefficients unchanged. How can I do this?
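
The question of which formula `predict` uses can be checked directly. The sketch below is only an illustration on the built-in `mtcars` data, not on the model from the question: `am`, `wt` and `hp` are stand-ins, with `hp` playing the role of `fl_cliente_hit`. It overwrites `$formula` on a copy of a fitted `glm` and compares the predictions before and after.

```r
# Illustration on mtcars (an assumption; not the asker's data or model)
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)
p_before <- predict(fit, mtcars, type = "response")

# Overwrite only the stored formula, as in the question
edited <- fit
edited$formula <- am ~ wt
p_after <- predict(edited, mtcars, type = "response")

# Compare the two sets of predicted probabilities
same_predictions <- isTRUE(all.equal(p_before, p_after))
print(same_predictions)

# The coefficient for hp is still stored in the edited object
print(names(coef(edited)))
```

As far as I can tell, `predict` works from the stored `terms` and `coefficients` components of the fitted object rather than from `$formula`, so editing the formula alone should leave the predictions unchanged.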

GabyLP
  • Your objective doesn't make sense. If you remove a parameter, the other parameters of the model must change. Normally, you would use the `update` function to remove variables, but of course that refits the model, which you claim to not want. – Roland Nov 11 '15 at 18:33
  • I'd also have thought of `update`, e.g. `f <- as.formula(mpg~disp+drat); f <- update(f,.~. - drat); fit <- lm(f, mtcars)`. – lukeA Nov 11 '15 at 18:35
  • @Roland is right. Changing the formula does not by itself change the already built model. You have to rebuild the model using your new (updated) formula. It's another story if you want (for some reason) to remove a variable from a model's output while keeping the rest of the variables/coefficients the same. – AntoniosK Nov 11 '15 at 18:41
  • @Roland, I understand what you are saying and I thought so too. The thing is that I have a model trained on April to July, and this variable splits 10%; when I test the model in October this variable only splits 1%. So I thought that deleting it would keep my KS in October (although I would lose KS in training). In fact I lose only 0.5 in training but 1.5 in October out of sample. So I thought that if I keep the original coefficients for the rest of the variables I might lose less KS. I have to test it. The original coefficients might be better for out-of-sample data. – GabyLP Nov 11 '15 at 18:51
  • You could predict by manually specifying the coefficients ([an example here](http://stackoverflow.com/questions/25695565/predict-with-arbitrary-coefficients-in-r)), but I'd make sure that it is statistically sensible. – user20650 Nov 11 '15 at 19:07

1 Answer

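Late to the party, but I too was hoping there would be an answer to this. What I plan on doing if I can't find another solution is overwriting the variable in the prediction data with a constant 0, so that its term adds nothing to the linear predictor. That way it at least negates the variable. This assumes fl_cliente_hit is a numeric 0/1 flag; for a factor, every row would need to be set to the reference level instead:

train$fl_cliente_hit <- 0
predict(lr, train, type=c("response"))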
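
One way to check whether a constant written into the prediction data really cancels a term is to compare linear predictors. The sketch below again uses the built-in `mtcars` data as a stand-in (an assumption; `hp` plays the role of `fl_cliente_hit` and is numeric). It shows that a constant 0 removes exactly the contribution of that term, leaving the other coefficients in effect as fitted.

```r
# Illustration on mtcars (an assumption; hp stands in for fl_cliente_hit)
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Linear predictor on the original data
eta_full <- predict(fit, mtcars, type = "link")

# Same data with the unwanted variable held at 0
neutral <- mtcars
neutral$hp <- 0
eta_zero <- predict(fit, neutral, type = "link")

# The gap between the two is the hp term: coefficient times original value
hp_term <- coef(fit)[["hp"]] * mtcars$hp
term_removed <- isTRUE(all.equal(unname(eta_full - eta_zero), hp_term))
print(term_removed)

# Probabilities with the hp term neutralised
p_neutral <- predict(fit, neutral, type = "response")
head(p_neutral)
```

Any other constant, including the coefficient's own value, would instead shift every linear predictor by that constant times the coefficient.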