I tried to use glm() to estimate the strengths of soccer teams.

# data is a data frame (its structure is shown at the bottom)
model <- glm(Goals ~ Home + Team + Opponent, family=poisson(link=log), data=data)

but I get this error:

Error in if (any(y < 0)) stop("negative values not allowed for the 'Poisson' family") : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
In Ops.factor(y, 0) : ‘<’ not meaningful for factors

data:

> data
                      Team                 Opponent Goals Home
1 5a51f2589d39c31899cce9d9 5a51f2579d39c31899cce9ce     3    1
2 5a51f2579d39c31899cce9ce 5a51f2589d39c31899cce9d9     0    0
3 5a51f2589d39c31899cce9da 5a51f2579d39c31899cce9cd     3    1
4 5a51f2579d39c31899cce9cd 5a51f2589d39c31899cce9da     0    0

> is.factor(data$Goals)
[1] TRUE
Edgaras Karka
    Hi Edgaras, with the current information it is difficult to help you solve your issue beyond repeating the error: you seem to have factors where the function expects numerical values. Please consider including a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), which would allow others to help you better. – Florian Apr 30 '18 at 11:15
    You might want to check that your `Goals` variable isn't a factor, which can happen if your data was imported incorrectly. – Hong Ooi Apr 30 '18 at 11:38

1 Answer

From the "Details" section of the documentation for the glm() function:

A typical predictor has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response.

So you want to make sure your Goals column is numeric:

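# stringsAsFactors = TRUE reproduces the factor columns shown by str() below
# (character columns are no longer converted to factors by default since R 4.0)
df <- data.frame(Team = c("5a51f2589d39c31899cce9d9", "5a51f2579d39c31899cce9ce", "5a51f2589d39c31899cce9da", "5a51f2579d39c31899cce9cd"),
                 Opponent = c("5a51f2579d39c31899cce9ce", "5a51f2589d39c31899cce9d9", "5a51f2579d39c31899cce9cd", "5a51f2589d39c31899cce9da"),
                 Goals = c(3, 0, 3, 0),
                 Home = c(1, 0, 1, 0),
                 stringsAsFactors = TRUE)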

str(df)
#'data.frame':  4 obs. of  4 variables:
# $ Team    : Factor w/ 4 levels "5a51f2579d39c31899cce9cd",..: 3 2 4 1
# $ Opponent: Factor w/ 4 levels "5a51f2579d39c31899cce9cd",..: 2 3 1 4
# $ Goals   : num  3 0 3 0
# $ Home    : num  1 0 1 0


model <- glm(Goals ~ Home + Team + Opponent, family=poisson(link=log), data=df)

Then here is the output:

> model


Call:  glm(formula = Goals ~ Home + Team + Opponent, family = poisson(link = log), 
    data = df)

Coefficients:
                      (Intercept)                               Home       Team5a51f2579d39c31899cce9ce  
                       -2.330e+01                          2.440e+01                         -3.089e-14  
     Team5a51f2589d39c31899cce9d9       Team5a51f2589d39c31899cce9da   Opponent5a51f2579d39c31899cce9ce  
                       -6.725e-15                                 NA                                 NA  
 Opponent5a51f2589d39c31899cce9d9  Opponent5a51f2589d39c31899cce9da   
                               NA                                 NA  

Degrees of Freedom: 3 Total (i.e. Null);  0 Residual
Null Deviance:      8.318 
Residual Deviance: 3.033e-10    AIC: 13.98
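
If Goals has already been read in as a factor, as in the question, it can be converted in place rather than rebuilding the data frame. The following is a minimal sketch, not a definitive fix: the data frame below is a hypothetical stand-in for the asker's data (short team labels instead of the real IDs), and it assumes every level of Goals is a plain number stored as text.

```r
# Hypothetical stand-in for the asker's data: Goals is a factor here
data <- data.frame(
  Team     = c("A", "B", "C", "D"),
  Opponent = c("B", "A", "D", "C"),
  Goals    = factor(c("3", "0", "3", "0")),
  Home     = c(1, 0, 1, 0)
)

# Pitfall: as.numeric() applied directly to a factor returns the
# internal level codes, not the values the labels represent
codes <- as.numeric(data$Goals)

# Converting through as.character() recovers the actual goal counts
data$Goals <- as.numeric(as.character(data$Goals))

str(data$Goals)
```

After this conversion is.factor(data$Goals) is FALSE, and the glm() call from the question no longer hits the factor comparison that triggered the error.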
Katia