When we run a linear regression in R, under what variable does R save the actual regression equation? By this I mean, does R actually save the equation in the form:

y=B0 +B1x1 + B2x2 + B3x3 etc

I am asking because I would like to call upon that equation later. Or would I need to create a new variable, set it equal to the above equation, and fill in my beta values myself, such as (for example) in R:

z=0.1 + 0.2x1 +0.3x2 +0.4x3 etc.

I understand one can use the predict function, but I am not sure that is exactly what I am looking for.

John
  • `fm <- lm(demand ~ Time, BOD); formula(fm)` – G. Grothendieck Nov 10 '16 at 16:02
  • Other than using `predict`, the equation is not saved in the form you're describing, afaik. The closest may be `coef` from which you can construct the rest. – Pierre L Nov 10 '16 at 16:08
  • If your goal is to apply the estimated coefficients to a new dataset, you want to use `predict`. If you are not sure how to use this exactly have a look at the help file (if you are using `lm`, that would be `?predict.lm`) and other similar questions on SO, e.g. http://stackoverflow.com/questions/9028662/predict-maybe-im-not-understanding-it. If you want to use the formula for whatever other purpose, extract it and use it like G. Grothendieck described in his comment. – konvas Nov 10 '16 at 16:32
  • @ZheyuanLi Will I be able to specify probability distributions for my predictors so that they will be used when I call predict? – John Nov 10 '16 at 17:24
  • @ZheyuanLi Yes. So I want to have, for example, an rnorm distribution for each of them (predictor variables only), and I want to use the values from these distributions in my regression equation – John Nov 10 '16 at 17:55
  • @ZheyuanLi Sorry, that wasn't explained correctly. I want to specify a distribution for my predictor, which would be x in your example; in the original post it would be a distribution for each predictor variable such as x1, x2, x3 etc. After specifying a distribution, I want to sample values from each of these distributions and plug them into my regression equation – John Nov 10 '16 at 18:01
  • @ZheyuanLi I believe this is what I am looking for, but I receive an error saying that my variables (x1, x2 etc) are not factors. I checked and the class is numeric. Does that mean I need to convert first? – John Nov 10 '16 at 18:16
  • @ZheyuanLi If I were to apply this, I wouldn't be sampling based on a probability distribution, which is what I am trying to do – John Nov 10 '16 at 19:53
  • @ZheyuanLi I didn't mean for it to come across as ungrateful. This is all new to me, so I'm just trying to understand. Thanks for the input – John Nov 10 '16 at 20:18

1 Answer

If you want the coefficients, call summary() on your lm object (or coef() if you only need the estimates).

To see just the model terms and their estimates, SEs, etc.:

my_lm <- lm(Sepal.Length~Sepal.Width+Petal.Width+Petal.Length,iris)
coeffients <- summary(my_lm)$coefficients
coeffients
               Estimate Std. Error   t value     Pr(>|t|)
(Intercept)   1.8559975 0.25077711  7.400984 9.853855e-12
Sepal.Width   0.6508372 0.06664739  9.765380 1.199846e-17
Petal.Width  -0.5564827 0.12754795 -4.362929 2.412876e-05
Petal.Length  0.7091320 0.05671929 12.502483 7.656980e-25

You can then use these however you like. Lastly, formula() returns the formula you passed to lm():

formula(my_lm)
Sepal.Length ~ Sepal.Width + Petal.Width + Petal.Length
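
R does not store the fitted equation as text, but the formula and the coefficients can be combined into one if a readable version is wanted. The following is only a sketch of one way to do that; the names `b`, `terms_txt`, and `equation` are made up for illustration, and the rounding to four decimals is an arbitrary choice.

```r
# Refit the same model so this snippet runs on its own
my_lm <- lm(Sepal.Length ~ Sepal.Width + Petal.Width + Petal.Length, data = iris)

b <- coef(my_lm)                      # named numeric vector, intercept first

# One "+0.1234*name" piece per predictor (sign kept by the %+ format)
terms_txt <- paste0(sprintf("%+.4f", b[-1]), "*", names(b)[-1])

# Response name, intercept, then the predictor pieces
equation <- paste(
  all.vars(formula(my_lm))[1], "=",
  sprintf("%.4f", b[1]),
  paste(terms_txt, collapse = " ")
)
equation
```

The result is just a character string for display; for computing fitted values, predict() or the coefficient vector itself is the safer route.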

If you don't want to use predict(), you can use this named vector of estimates instead:

my_coef<-(coeffients[,1])
my_coef
 (Intercept)  Sepal.Width  Petal.Width Petal.Length 
   1.8559975    0.6508372   -0.5564827    0.7091320 
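
To tie this back to the comments about sampling predictor values from a distribution, here is a rough sketch that draws each predictor from rnorm() and evaluates the equation both by hand with the coefficient vector and with predict(). The sample size, the seed, and the choice of independent normal distributions using the means and standard deviations observed in iris are all assumptions made for illustration; `new_x`, `X`, `manual`, and `via_predict` are hypothetical names.

```r
# Refit the same model so this snippet runs on its own
my_lm <- lm(Sepal.Length ~ Sepal.Width + Petal.Width + Petal.Length, data = iris)
my_coef <- coef(my_lm)

set.seed(1)  # arbitrary seed, only for reproducibility
n <- 5       # arbitrary number of simulated observations

# Assumption: each predictor is drawn independently from a normal distribution
new_x <- data.frame(
  Sepal.Width  = rnorm(n, mean(iris$Sepal.Width),  sd(iris$Sepal.Width)),
  Petal.Width  = rnorm(n, mean(iris$Petal.Width),  sd(iris$Petal.Width)),
  Petal.Length = rnorm(n, mean(iris$Petal.Length), sd(iris$Petal.Length))
)

# By hand: a column of 1s for the intercept, then the predictors
# in the same order as the coefficients
X <- cbind(1, as.matrix(new_x[, names(my_coef)[-1]]))
manual <- drop(X %*% my_coef)

# Same calculation through predict()
via_predict <- predict(my_lm, newdata = new_x)

all.equal(unname(manual), unname(via_predict))
```

Both routes should agree, so predict() with a `newdata` data frame of sampled values does the same job as writing the equation out manually. Note that independent draws ignore any correlation between the predictors.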
akaDrHouse