I am trying to build a linear regression model with an interaction term to predict Sepal.Length from both Petal.Length and Species in the iris dataset. For some reason, the output of my model only shows two of the three Species levels.

Code:

mod1 <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)
summary(mod1)$coef

Output:

                               Estimate Std. Error   t value     Pr(>|t|)
(Intercept)                     4.2131682  0.4074209 10.341071 4.331619e-19
Petal.Length                    0.5422926  0.2767667  1.959385 5.199902e-02
Speciesversicolor              -1.8056451  0.5984284 -3.017312 3.016413e-03
Speciesvirginica               -3.1535091  0.6340741 -4.973408 1.846894e-06
Petal.Length:Speciesversicolor  0.2859884  0.2950624  0.969247 3.340471e-01
Petal.Length:Speciesvirginica   0.4534460  0.2901455  1.562823 1.202893e-01

I have no clue why the other Species isn't showing up.

8i7ty8
  • Yes, that is because you are using treatment contrasts, i.e. one level is the base (control group), its coefficients are set to zero, and the other levels are compared to that base. If you do not want that, you can change the contrasts or simply remove the intercept, i.e. `lm(Sepal.Length~0+Petal.Length*Species,data=iris)` – Onyambu Dec 04 '21 at 01:25
  • The setosa coefficients are all zero, but if you really want to see them anyway, use `dummy.coef(mod1)`. – G. Grothendieck Dec 04 '21 at 15:35
  • See also https://stackoverflow.com/questions/41032858/lm-summary-not-display-all-factor-levels – StupidWolf Dec 05 '21 at 04:03
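
To make the comments above concrete, here is a minimal sketch in base R. It assumes the default treatment contrasts are in effect, so the first factor level (setosa) is the baseline. The names `per_species` and `mod2` are illustrative only, and the `0 + Species + Species:Petal.Length` formula is an alternative parameterization that is not taken from the question or the comments.

```r
# Refit the model from the question (iris ships with base R).
mod1 <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)

# The baseline is the first factor level; its row in the coding matrix is all zeros.
levels(iris$Species)
contrasts(iris$Species)

# Full coefficient listing, including the zero entries for the baseline level.
dummy.coef(mod1)

# Rebuild each species' own intercept and slope from the treatment-coded
# coefficients: baseline value plus the species-specific offset.
b <- coef(mod1)
per_species <- data.frame(
  Species   = levels(iris$Species),
  intercept = b[["(Intercept)"]] +
    c(0, b[["Speciesversicolor"]], b[["Speciesvirginica"]]),
  slope     = b[["Petal.Length"]] +
    c(0, b[["Petal.Length:Speciesversicolor"]], b[["Petal.Length:Speciesvirginica"]])
)
per_species

# Alternative parameterization (an assumption, not from the thread): drop the
# intercept and nest the slope within Species, so every level gets its own
# intercept and slope directly in the coefficient table.
mod2 <- lm(Sepal.Length ~ 0 + Species + Species:Petal.Length, data = iris)
coef(mod2)
```

Both fits describe the same three regression lines. The only difference is whether the table reports versicolor and virginica as offsets from setosa (`mod1`) or reports each species' line on its own (`mod2`).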
