
I am using a standard lm() model in R with numeric variables and factors. For each factor, R gives a coefficient for every level but one, which serves as the baseline and is implicitly 0.

Is it possible to choose this level?

For example, here is the output of my model:

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)           9.847e+00  1.499e-02 656.984   <2e-16 ***
base$km              -3.343e-06  5.669e-08 -58.974   <2e-16 ***
log(base$nbJour + 1)  2.395e-02  1.743e-03  13.738   <2e-16 ***
id_boite2            -5.980e-02  4.741e-03 -12.615   <2e-16 ***
cylindre2.0           1.125e-01  8.174e-03  13.762   <2e-16 ***
cylindre2.7           2.291e-01  1.056e-02  21.692   <2e-16 ***
cylindre3.0           3.393e-01  1.061e-02  31.970   <2e-16 ***

The variable id_boite can take 2 values, 1 or 2. By default R has used id_boite1 as the baseline (coefficient 0) and estimated id_boite2 at -5.980e-02. I want to know whether it is possible to force R to set the other level to 0 or, more generally, to set the level with the most negative effect to 0, so that all my coefficients are positive.

Konrad Rudolph
  • `contr.treatment` allows the base level to be specified via the `base=` argument. See the example in `?contr.treatment`. – G. Grothendieck Dec 12 '13 at 20:49
  • R did NOT "set" `id_boite2` to anything other than what it was. It estimated a difference in means between cases with `id_boite==2` and those with `id_boite==1`. – IRTFM Dec 12 '13 at 22:47

2 Answers


I think you're looking for the relevel() function. Before fitting your linear model (assuming a data frame named df in which id_boite is a factor), you would run:

df$id_boite = relevel(df$id_boite, ref=2)
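
To see the effect, here is a minimal sketch on simulated data. The data frame, the response `y`, and all numeric values are made up for illustration; only the column names `km` and `id_boite` echo the question. With a two-level factor, changing the baseline should simply flip the sign of the coefficient.

```r
# Hypothetical toy data: values are simulated, not taken from the question.
set.seed(1)
df <- data.frame(
  km       = runif(200, 0, 2e5),
  id_boite = factor(sample(1:2, 200, replace = TRUE))
)
# Simulated response in which level "2" has the lower mean.
df$y <- 10 - 3e-06 * df$km - 0.06 * (df$id_boite == "2") +
  rnorm(200, sd = 0.05)

# Default: the first level ("1") is the baseline, so id_boite2 is reported.
fit1 <- lm(y ~ km + id_boite, data = df)
coef(fit1)["id_boite2"]

# Make "2" the baseline; the model now reports id_boite1 instead.
df$id_boite <- relevel(df$id_boite, ref = "2")
fit2 <- lm(y ~ km + id_boite, data = df)
coef(fit2)["id_boite1"]   # same magnitude as before, opposite sign
```

The fitted values are unchanged by this; only the parameterization of the factor differs.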
josliber

You could use df <- transform(df, id_boite = reorder(id_boite, response_var)) (assuming your data are in a data frame df and that response_var is the response variable), which would put the factor levels in order of increasing (marginal) mean response, so that the level with the lowest mean becomes the baseline. This wouldn't guarantee positive coefficients in a complex regression model, where the conditional means associated with each level could differ from their marginal means, but it might work reasonably well in general.
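
As a rough illustration, here is a sketch on simulated data with a four-level factor. The level names echo `cylindre` from the question, but the level "1.6", the group means in `mu`, and the response `y` are all assumptions made up for this example. In this simple one-factor model the level with the lowest mean ends up as the baseline, so the remaining coefficients come out positive.

```r
# Hypothetical toy data: level means are invented for illustration.
set.seed(2)
df <- data.frame(
  cylindre = factor(sample(c("1.6", "2.0", "2.7", "3.0"), 300, replace = TRUE))
)
mu <- c("1.6" = 10.2, "2.0" = 10.1, "2.7" = 9.8, "3.0" = 10.4)
df$y <- unname(mu[as.character(df$cylindre)]) + rnorm(300, sd = 0.05)

# reorder() sorts the levels by the mean of y within each level (FUN = mean
# is the default), so the level with the lowest mean becomes the baseline.
df <- transform(df, cylindre = reorder(cylindre, y))
levels(df$cylindre)

fit <- lm(y ~ cylindre, data = df)
coef(fit)   # non-baseline coefficients are differences from the lowest mean
```

With additional covariates in the model, the ordering by marginal means may no longer match the ordering of the adjusted effects, which is the caveat noted above.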

Ben Bolker