My response variable is a proportion with values strictly between 0 and 1 (0 and 1 themselves never occur). I would like to fit a Bayesian logit regression. I am using the arm package in R, following the example in "Bayesian Generalized Linear Models in R" by Jon Starkweather, PhD. My confusion is this: with the frequentist approach I could fit a beta regression and specify a logit link, but with the Bayesian GLM I am unsure how to specify the link function for proportion data, in particular with the bayesglm function from arm as used in the paper cited above. The adapted code I am using is below:

#install.packages("arm")
library(arm)

Model <- bayesglm(y ~ x1 + I(x1^2) + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9,
                  family = gaussian, data = mydata,
                  prior.mean = 0, prior.scale = Inf, prior.df = Inf)
summary(Model)

Call:
bayesglm(formula = y ~x1 + I(x1^2) + x2 + x3 + x4 + x5 + x6 
              + x7 + x8 + x9, family = gaussian, data = panel1_neg, prior.mean = 0, 
             prior.scale = Inf, prior.df = Inf)

Deviance Residuals: 
      Min         1Q     Median         3Q        Max  
-0.024267  -0.006407  -0.001379   0.006257   0.042012  

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.046806   0.011057   4.233 5.16e-05 ***
x1           0.327205   0.084408   3.876 0.000191 ***
I(x1^2)     -1.351503   0.395559  -3.417 0.000921 ***
x2          -0.333285   0.056133  -5.937 4.30e-08 ***
x3           0.074882   0.029916   2.503 0.013949 *  
x4           0.012951   0.003231   4.009 0.000119 ***
x5          -0.053934   0.059021  -0.914 0.363042    
x6          -0.082908   0.051511  -1.610 0.110690    
x7          -0.019248   0.068604  -0.281 0.779623    
x8          -0.012700   0.002549  -4.981 2.68e-06 ***
x9           0.006289   0.002575   2.442 0.016382 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 0.0001288981)

    Null deviance: 0.032699  on 109  degrees of freedom
Residual deviance: 0.012761  on  99  degrees of freedom
AIC: -660.64

Number of Fisher Scoring iterations: 7

So my question is: how do I specify a logit link in the bayesglm function when the response is a proportion? If the response variable were binary, I could specify family=binomial(link=logit).

Any assistance is highly appreciated.

StupidWolf
Lomnewton
  • Not sure about `bayesglm`, but [`rstanarm` has a Beta regression implementation](https://cran.r-project.org/web/packages/rstanarm/vignettes/betareg.html). – merv Feb 08 '21 at 21:32

1 Answer

The frequentist/Bayesian distinction is a bit of a distraction here. The underlying question is how to run a binomial regression on proportions with either glm (from stats) or bayesglm (from arm).

Suppose the dataset looks like this: success counts out of n = 10 trials, recorded at two levels of x:

set.seed(111)
df = data.frame(success = c(rbinom(20,10,0.6),rbinom(20,10,0.6)),
x = rep(0:1,each=20))

df$n = 10

We calculate the proportion:

df$p = df$success / df$n

Then regress the proportion on x, passing the number of trials as weights:

glm(p ~ x,weights=n,family=binomial(link=logit),data=df)

Call:  glm(formula = p ~ x, family = binomial, data = df, weights = n)

Coefficients:
(Intercept)            x  
     0.6411      -0.2356  

Degrees of Freedom: 39 Total (i.e. Null);  38 Residual
Null Deviance:      28.33 
Residual Deviance: 27.04    AIC: 137.5
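
As a quick sanity check on the weighted form, here is a minimal sketch (base R only) comparing it with the two-column `cbind(successes, failures)` response that `glm` also accepts for binomial models. The object names `fit_prop` and `fit_count` are only illustrative; the two fits should give matching coefficients.

```r
# Recreate the example data: success counts out of n = 10 trials
set.seed(111)
df <- data.frame(success = c(rbinom(20, 10, 0.6), rbinom(20, 10, 0.6)),
                 x = rep(0:1, each = 20))
df$n <- 10
df$p <- df$success / df$n

# Proportion response, with the number of trials passed as weights
fit_prop <- glm(p ~ x, weights = n,
                family = binomial(link = "logit"), data = df)

# Same model written with a two-column (successes, failures) response
fit_count <- glm(cbind(success, n - success) ~ x,
                 family = binomial(link = "logit"), data = df)

# Compare the coefficient estimates of the two parameterisations
all.equal(coef(fit_prop), coef(fit_count))
```

Both forms need the number of trials behind each proportion, which is why the weights matter.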

The same works for bayesglm:

bayesglm(p ~ x,weights=n,family=binomial(link=logit),data=df)

Call:  bayesglm(formula = p ~ x, family = binomial(link = logit), data = df, 
    weights = n)

Coefficients:
(Intercept)            x  
     0.6394      -0.2325  

Degrees of Freedom: 39 Total (i.e. Null);  38 Residual
Null Deviance:      28.33 
Residual Deviance: 27.04    AIC: 71.04
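
If only the proportion is known and the number of trials behind it is not, there is nothing to pass as `weights`. One possible workaround, sketched below under stated assumptions, is to logit-transform the response with `qlogis()` and fit a Gaussian `bayesglm` on that scale. This is not a beta regression; it is a different model that merely keeps back-transformed predictions inside (0, 1). The simulated data, the variable names and the use of the default priors are all assumptions made for illustration.

```r
library(arm)

# Simulated proportions strictly inside (0, 1); purely illustrative data
set.seed(1)
n_obs <- 200
x <- runif(n_obs)
mu <- plogis(-1 + 2 * x)            # mean proportion on the (0, 1) scale
phi <- 30                           # precision of the simulated beta draws
d <- data.frame(x = x,
                y = rbeta(n_obs, mu * phi, (1 - mu) * phi))

# Move the response to the logit scale, where a Gaussian model is plausible
d$logit_y <- qlogis(d$y)

# Gaussian bayesglm on the transformed response (default priors)
fit_logit <- bayesglm(logit_y ~ x, family = gaussian, data = d)

# Back-transform linear predictions to the proportion scale
new_x <- c(0.2, 0.8)
pred <- plogis(cbind(1, new_x) %*% coef(fit_logit))
pred
```

Note that `plogis()` of a predicted mean logit is not the predicted mean proportion, so the back-transformed values are best read as typical values rather than expectations.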

Also check out the accepted answer for this post
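
For a Bayesian beta regression with a logit link, the `rstanarm` package mentioned in the comment on the question provides `stan_betareg()`. The following is a minimal sketch on simulated data, not a tuned analysis: the data, the variable names, and the small `chains` and `iter` values are assumptions chosen to keep the example quick.

```r
library(rstanarm)

# Simulated proportions strictly inside (0, 1); purely illustrative data
set.seed(1)
n_obs <- 200
x <- runif(n_obs)
mu <- plogis(-1 + 2 * x)
phi <- 30
d <- data.frame(x = x,
                y = rbeta(n_obs, mu * phi, (1 - mu) * phi))

# Beta regression with a logit link for the mean, default priors
fit_beta <- stan_betareg(y ~ x, data = d, link = "logit",
                         chains = 2, iter = 1000, seed = 1, refresh = 0)

print(fit_beta)

# 90% posterior intervals for the model parameters
posterior_interval(fit_beta, prob = 0.9)
```

The coefficients are on the logit scale, so `plogis()` maps a linear predictor back to a mean proportion.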

StupidWolf
  • Hi @StupidWolf, thanks for your suggestion, but I am not sure it resolves my problem. I am not running a Bayesian logit model for a binomially distributed dependent variable. My dependent variable is a `rates` variable, calculated for example as `a/b`, where the data on `a` and `b` are not available. So I am not sure what would constitute `weights` in this case or how I would use them. My understanding is that such rate variables are beta distributed. In short, I am looking for a way to specify the family as beta and the `link` as `logit`. – Lomnewton Feb 05 '21 at 23:38