I'm fitting a model to investigate the main effects and interactions in total number of sharks ~ Month + SST + Sex. When I fit the model, the output only shows SexMale and not SexFemale.

I understand that one level is absorbed into the intercept, but SexFemale still does not appear when it is modelled alongside SST. Is there something I am missing?

  • Hi, welcome to *Stack Overflow*, in order that we can help you, please provide example data and the steps you've tried so far. Consider [*How to make a great reproducible example*](https://stackoverflow.com/a/5963610/6574038), thank you. – jay.sf Feb 23 '18 at 14:02
  • If you hope to receive an answer, you are invited to post your code, or at least a minimal, reproducible example. For the data, you can use the result of `dput(yourdata)` and add it to your question. That way, others only have to copy and paste, and they will be happy to help you. – MrSmithGoesToWashington Feb 23 '18 at 14:03
  • Thank you for your reply. I have two factors - Season and Sex, and temperature as a continuous variable. When I run the model for both effects and interactions, the output misses out both MonthAutumn and SexFemale plus the interactions between them. If they are being used as a baseline, how do I interpret this? As my hypothesis is to test these variables against the total number of individuals. I'm a relatively new R user *help* – Hannah Rose Milankovic Feb 24 '18 at 19:12

1 Answer

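The output is correct. For a factor variable with n levels, glm estimates n-1 coefficients (dummy variables) when the model has an intercept; the remaining level is the baseline and is absorbed into the intercept. In your case SexFemale is the baseline, and the SexMale coefficient is the difference between males and females, so it only applies when Sex = Male.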
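As a minimal sketch of that n-1 coding, the snippet below builds the design matrix for a two-level factor. The `sex` vector here is a hypothetical toy example, not the asker's data.

```r
# Hypothetical toy factor (an assumption for illustration, not the asker's data)
sex <- factor(c("Female", "Male", "Female", "Male"))

# The first level is the baseline
levels(sex)

# Default treatment contrasts: one dummy column (Male) for two levels
contrasts(sex)

# Design matrix that lm/glm would build: an intercept plus n - 1 dummy columns
X <- model.matrix(~ sex)
colnames(X)
```

The column names are `(Intercept)` and `sexMale`; there is no `sexFemale` column because females are represented by the intercept alone.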

EDIT, based on the OP's comment:

I created a very small reproducible example.

data <- data.frame(sharks = c(2,4,6,8,1,3,5,7),
                   season = c("spring", "spring", "summer", "summer", "autumn","autumn", "winter", "winter"),
                   sst = c(23,24,26,26,24,22,20,20),
                   sex = c("F", "F", "M", "M", "F", "M", "F", "F"))

# basic glm model
glm_mod <- glm(sharks ~ . , data = data)

Coefficients:
  (Intercept)  seasonspring  seasonsummer  seasonwinter  sst  sexM  
    -47             3            -4            13         2     6 

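Interpretation: the baseline for the model is the autumn season and female sex. In other words, for females in autumn the predicted number of sharks is -47 + 2 * the temperature. Every other coefficient is an offset from that baseline: for example, sexM = 6 means the prediction for males is 6 higher than for females in the same season and at the same temperature.

Baseline: autumn + female, because they are the first levels of their factors (R orders the levels alphabetically by default).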
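If a different baseline is easier to interpret, one option is `relevel()`. The sketch below reuses the example data from this answer; the choice of winter and male as reference levels is arbitrary and only for illustration.

```r
# Same example data as in the answer
data <- data.frame(sharks = c(2, 4, 6, 8, 1, 3, 5, 7),
                   season = c("spring", "spring", "summer", "summer",
                              "autumn", "autumn", "winter", "winter"),
                   sst = c(23, 24, 26, 26, 24, 22, 20, 20),
                   sex = c("F", "F", "M", "M", "F", "M", "F", "F"))

# Original fit, with autumn and F as the default baselines
glm_mod <- glm(sharks ~ ., data = data)

# Arbitrary alternative baselines, chosen only for illustration
data_relevel <- data
data_relevel$season <- relevel(factor(data_relevel$season), ref = "winter")
data_relevel$sex <- relevel(factor(data_relevel$sex), ref = "M")

glm_relevel <- glm(sharks ~ ., data = data_relevel)

# Now seasonwinter and sexM are missing from the names instead
names(coef(glm_relevel))

# The fitted values do not change, only the parameterisation
all.equal(fitted(glm_mod), fitted(glm_relevel))
```

Changing the reference level changes which coefficients are printed and what each one is compared against, but not the model's predictions.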

glm formula:
-47 + 3 * spring + -4 * summer + 13 * winter + 2 * sst + 6 * M
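To check this reading of the formula, the sketch below computes two predictions by hand from the coefficients and compares them with `predict()`. The two new observations in `new_obs` are made up for illustration.

```r
# Same example data and model as in the answer
data <- data.frame(sharks = c(2, 4, 6, 8, 1, 3, 5, 7),
                   season = c("spring", "spring", "summer", "summer",
                              "autumn", "autumn", "winter", "winter"),
                   sst = c(23, 24, 26, 26, 24, 22, 20, 20),
                   sex = c("F", "F", "M", "M", "F", "M", "F", "F"))
glm_mod <- glm(sharks ~ ., data = data)
b <- coef(glm_mod)

# Female in autumn at 24 degrees: baseline levels, so only intercept and sst apply
manual_autumn_f <- unname(b["(Intercept)"] + b["sst"] * 24)

# Male in summer at 26 degrees: add the summer and male offsets to the baseline
manual_summer_m <- unname(b["(Intercept)"] + b["seasonsummer"] +
                          b["sst"] * 26 + b["sexM"])

# Made-up observations for illustration
new_obs <- data.frame(season = c("autumn", "summer"),
                      sst = c(24, 26),
                      sex = c("F", "M"))
pred <- unname(predict(glm_mod, newdata = new_obs))

# The hand-computed values and predict() should agree
all.equal(c(manual_autumn_f, manual_summer_m), pred)
```

For the baseline case the dummy variables are all zero, which is why the female and autumn terms never need their own coefficients.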

glm model with interactions between season and sex:

# glm model with interactions
glm_mod_interact <- glm(sharks ~ sst + season:sex , data = data)

Coefficients:
      (Intercept)                sst  seasonautumn:sexF  seasonspring:sexF  seasonsummer:sexF  
              -45                  2                 -2                  1                 NA  
seasonwinter:sexF  seasonautumn:sexM  seasonspring:sexM  seasonsummer:sexM  seasonwinter:sexM  
               11                  4                 NA                 NA                 NA  

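The NAs mark coefficients that cannot be estimated. For seasonsummer:sexF, seasonspring:sexM and seasonwinter:sexM there is no data in the example data.frame for that combination. seasonsummer:sexM does have data, but once the intercept and the other cells are in the model it is linearly dependent on them, so it acts as the reference cell. Apart from that, you have all the interactions between sex and season here. Whether they are significant is something you will have to check.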
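One way to see which season and sex combinations actually occur in the data is to cross-tabulate the two factors and compare the result with the coefficients that come back as NA. This is a sketch on the example data from this answer only.

```r
# Same example data as in the answer
data <- data.frame(sharks = c(2, 4, 6, 8, 1, 3, 5, 7),
                   season = c("spring", "spring", "summer", "summer",
                              "autumn", "autumn", "winter", "winter"),
                   sst = c(23, 24, 26, 26, 24, 22, 20, 20),
                   sex = c("F", "F", "M", "M", "F", "M", "F", "F"))

# Number of observations in each season-by-sex cell; zeros are empty cells
cell_counts <- table(data$season, data$sex)
cell_counts

glm_mod_interact <- glm(sharks ~ sst + season:sex, data = data)

# Names of the coefficients that could not be estimated
na_terms <- names(which(is.na(coef(glm_mod_interact))))
na_terms
```

With real data it is worth running the same cross-tabulation before fitting interactions, since sparse or empty cells make the corresponding terms unestimable or very imprecise.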

glm_mod_interact formula:
-45 + 2 * sst + -2 * seasonautumn:sexF + 1 * seasonspring:sexF + etc..

My advice is to read OpenIntro Statistics, chapter 7 onwards, or better yet, Data Analysis Using Regression and Multilevel/Hierarchical Models by Andrew Gelman and Jennifer Hill.

phiver
  • Thank you for your reply. If the output is correct how would I interpret this? Also one of the seasons is missing from the output as well.. but I understand this must be because it is another factor variable and it is using Autumn as the baseline. – Hannah Rose Milankovic Feb 24 '18 at 19:06
  • @HannahRoseMilankovicm, I extended my answer. – phiver Feb 25 '18 at 12:00