
I have built a linear regression in R. Now I want to store the model and use it to score a new data set once a week.

Can someone help me with how to do this?

How do I save the model, and how do I load it again and use it on a new data set?

    Possible duplicate of [Reusing a Model Built in R](http://stackoverflow.com/questions/5118074/reusing-a-model-built-in-r) – LyzandeR Nov 12 '15 at 17:27

2 Answers


You can save the model in a file and load it when you need it.

For example, you might have a line like this to train your model:

the_model <- glm(my_formula, family=binomial(link='logit'),data=training_set)

This model can be saved with:

save(file="modelfile",the_model) #the name of the file is of course arbitrary

Later, assuming that the file is in the working directory, you can reuse that model by first loading it with

load(file="modelfile")

The model can then be applied to a (new) data set test_set, for example:

test_set$pred <- predict(the_model, newdata=test_set, type='response')
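The steps above can be put together into one self-contained sketch. The simulated data, the temporary file location, and the names `new_data` and `model_file` are made up for illustration, and a plain `lm()` fit stands in for the `glm()` call because the question is about linear regression.

```r
# Simulated training data (hypothetical; replace with your own data frame)
set.seed(123)
training_set <- data.frame(x = rnorm(50))
training_set$y <- 1 + 2 * training_set$x + rnorm(50, sd = 0.3)

# Train once and remember the coefficients for a later comparison
the_model <- lm(y ~ x, data = training_set)
coef_before <- coef(the_model)

# Save the model object; the path is arbitrary (a temp directory is used here)
model_file <- file.path(tempdir(), "modelfile")
save(the_model, file = model_file)

# Simulate a fresh session a week later
rm(the_model)

# load() restores the object under its original name and
# returns only the names of the objects it restored
loaded_names <- load(model_file)
print(loaded_names)

# Score new observations with the restored model
new_data <- data.frame(x = c(-1, 0, 1))
new_data$pred <- predict(the_model, newdata = new_data)
print(new_data)
```

Comparing `coef(the_model)` with `coef_before` after loading is a quick way to confirm that the stored model is the one that was trained.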

Note that the result of load() should not be assigned to a variable (don't use something like the_model <- load("modelfile")). load() returns only the names of the objects it restored; the model itself becomes available under its original name, in this case the_model. Also, the model remains the same as it was before. The new observations do not change the coefficients or anything else in the model: the "old" model is applied to make predictions on new data.
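If you would rather choose the variable name at load time, base R's `saveRDS()` and `readRDS()` are an alternative to `save()` and `load()`. This is not what the answer above uses; it is a minimal sketch with simulated data, and the names `fit`, `scoring_model`, `weekly_data`, and the file path are hypothetical.

```r
# Simulated training data (hypothetical; replace with your own)
set.seed(42)
training_set <- data.frame(x = rnorm(50))
training_set$y <- 2 + 3 * training_set$x + rnorm(50, sd = 0.5)

fit <- lm(y ~ x, data = training_set)

# saveRDS() stores a single object without its name
model_path <- file.path(tempdir(), "lm_model.rds")
saveRDS(fit, model_path)

# readRDS() returns the object, so assigning it to any name is fine
scoring_model <- readRDS(model_path)

# Weekly scoring: apply the stored model to the new rows
weekly_data <- data.frame(x = c(-1, 0, 1))
weekly_data$pred <- predict(scoring_model, newdata = weekly_data)

# Scoring does not refit the model; compare the coefficients to check
all.equal(coef(fit), coef(scoring_model))
```

The trade-off is that `saveRDS()` handles exactly one object per file, whereas `save()` can bundle several objects together with their names.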

If, however, you have an additional labeled set and you want to train / improve the model on the basis of these new observations, you can follow the suggestions in the answer by @David.

Hope this helps.

RHertel

You can use the update function:

set.seed(1)
dat <- data.frame(x = rnorm(100),
                  y = rnorm(100, 0.01))

lmobj <- lm(y~x, dat)

coef(lmobj)
# (Intercept)            x 
# -0.027692614 -0.001060386

dat2 <- data.frame(x = rnorm(10),
                   y = rnorm(10, 0.01))

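# pass the new data by name: the second positional argument of update()
# is the formula, so update(lmobj, dat2) would change the formula instead
lmobj2 <- update(lmobj, data = dat2)
coef(lmobj2)
# (Intercept)           x 
# -0.02386837  0.06973995 

#--------------------------------
# to make things a bit more clear:
# update() re-runs the original call on the new data, so lmobj2
# has the same coefficients as a new model fitted on dat2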
lmobj3 <- lm(y~x, dat2)
coef(lmobj3)
#(Intercept)           x 
#-0.02386837  0.06973995 
David
    I think this is helpful to change / adapt the model according to a new data set. However, the way I understood the OP the goal is to use the *same*, trained model on a different data set. – RHertel Nov 12 '15 at 17:46