6

I am comparing GLM output from R and SAS of a model with a Gamma distribution. The point estimations are identical, but they have different estimation of the standard error and therefore different p-values.

Does anyone know why? I am wondering if R and SAS uses different methods to estimate the standard errors? Maybe MLE vs. method of moments?

R sample code

set.seed(2)
test = data.table(y = rnorm(100, 1000, 100), x1 = rnorm(100, 50, 20), x2 = rgamma(100, 0.01))
model = summary(glm(formula = y ~ x1+x2 , family = Gamma(link = "log"), data = test))

Using the same data generated here I used the following code to run a model in SAS:

proc genmod data= test_data;
                model y =  x1 x2 /link= log dist= gamma;
    run;

The output from R is as following:

Call:
glm(formula = y ~ x1 + x2, family = Gamma(link = "log"), data = test)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-0.26213  -0.08456  -0.01033   0.08364   0.20878  

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  6.9210757  0.0324674 213.170   <2e-16 ***
x1          -0.0003371  0.0005985  -0.563    0.575    
x2           0.0234097  0.0627251   0.373    0.710    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.01376099)

    Null deviance: 1.3498  on 99  degrees of freedom
Residual deviance: 1.3436  on 97  degrees of freedom
AIC: 1240.6

Number of Fisher Scoring iterations: 4

The output from SAS:enter image description here

YDao
  • 325
  • 2
  • 15
  • [The SCALE parameter used in PROC GENMOD is the inverse of the gamma dispersion parameter](https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_genmod_sect055.htm) , so from `1/model$dispersion` , there is a difference there, hence in se's. In r it is calculated by `sum((object$weights * object$residuals^2)[object$weights > 0])/ object$df.residual` (look at `summary.glm`) Maybe try and find how SAS calculates it. – user20650 Jun 16 '17 at 00:02
  • SAS manual : The GENMOD Procedure - SAS Support - suggets there may be an option on how the scale is used. Try using SCALE=PEARSON in the MODEL statement (untested) – user20650 Jun 16 '17 at 00:30
  • I'm not seeing a material difference in "standard errors" or the deviance estimates. The p-values only differ in the range of ~0.001 which seems trivial. The code is available in R. From SAS, well you know how that goes, right? SAS is infamous for using "type III" calculations while R uses sequential estimates. – IRTFM Jun 16 '17 at 00:39
  • I understand that the difference in SEs and p-values are very small, but I am just trying to understand why there is a difference at all if they are both using MLE to estimate the results. I have looked into the different ways that they calculate dispersion parameters, and it seems to be the main issue. – YDao Jun 16 '17 at 14:54
  • Looks like SAS is using a normal approximation to calculate the Chi-square values. What does R do? – david25272 Jun 20 '17 at 02:39

1 Answers1

2

By default R computes the scale=1/dispersion parameter in the same way as the sas/genmod/model option scale=pearson. The choice of scale parameter affects the SEs. See the documentation here: https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_genmod_sect022.htm

By default SAS/genmod gives MLEs of the shape parameter. Suppose the fitted gamma model is stored in the list "fit". To get this in R load the MASS library then type

gamma.shape(fit)

This gives the MLEs of shape parameter alpha. If you then type

summary(fit, dispersion=1/gamma.shape(fit)$alpha))

the summary function will use the MLE of alpha when computing the SEs, and they will match SAS/genmod exactly.

I will make a separate post on this. While summary.glm gives the correct SEs (using the specified value of dispersion), it does not print the correct value of AIC (It does not use the specified dispersion, instead it uses the value computed with the Pearson residuals). The difference is small, but I would call it a bug.