
I would like to estimate the alpha and beta parameters from a beta distribution when all I have is a set of probabilities. I cannot seem to find an example of this online. Below I try three approaches: the betareg package, Ben Bolker's bbmle package and optim in R.

The betareg package returns phi rather than alpha and beta. It also returns a warning. I think the bbmle package does what I want, although I seem to have to jerry-rig a data.frame to obtain the estimates. My optim function is not returning good estimates, and I do not know how to correct it.

I am hoping someone can verify that what I am doing with the bbmle package is okay and maybe show me how to correct my optim function. Below I create a data set of probabilities and show my R code for all three approaches.

Create a data set of probabilities:

set.seed(1234)
# generate data
n.samples <- 1000
alpha <- 2
beta  <- 2
my.probs <- rbeta(n.samples, alpha, beta)
mean(my.probs)
#[1] 0.4925867

Here is my R code using the betareg package. I am aware that µ = p/(p + q) and φ = p + q, where I believe p = alpha and q = beta. I am not sure how to obtain estimates of alpha and beta from estimates of µ and φ, but since I have two equations in two unknowns I can probably define p = φ - q and substitute that into the equation for µ.

# install.packages('betareg')
library(betareg)

my.model <- betareg(formula = my.probs ~ 1)
summary(my.model)
# Call:
# betareg(formula = my.probs ~ 1)
# 
# Standardized weighted residuals 2:
#     Min      1Q  Median      3Q     Max 
# -3.8992 -0.6445 -0.0096  0.6284  3.1924 
# 
# Coefficients (mean model with logit link):
#             Estimate Std. Error z value Pr(>|z|)
# (Intercept) -0.02684    0.02740   -0.98    0.327
# 
# Phi coefficients (precision model with identity link):
#       Estimate Std. Error z value Pr(>|z|)    
# (phi)   4.1774     0.1686   24.77   <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
# 
# Type of estimator: ML (maximum likelihood)
# Log-likelihood: 138.9 on 2 Df
# Number of iterations: 10 (BFGS) + 2 (Fisher scoring) 
# Warning message:
# In deparse(x$call, width.cutoff = floor(getOption("width") * 0.85)) :
#   invalid 'cutoff' value for 'deparse', using default

exp(-0.02684) / (1 + exp(-0.02684))
#[1] 0.4932904
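
If my algebra above is right, then alpha = µφ and beta = (1 - µ)φ, so I could back them out of the betareg output like this (I have not confirmed this is the intended parameterization):

my.mu  <- exp(-0.02684) / (1 + exp(-0.02684))   # inverse logit of the mean coefficient
my.phi <- 4.1774
my.mu * my.phi          # candidate estimate of alpha, roughly 2.06
(1 - my.mu) * my.phi    # candidate estimate of beta, roughly 2.12

These values appear to agree closely with the bbmle estimates below.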

Here is my R code for the bbmle package. I define a placeholder data.frame of x and y where each is always 1. This seems to work, but I am not sure it is okay to do. I get warnings.


# install.packages("bbmle")
library("bbmle")

x <- rep(1, length(my.probs))
y <- rep(1, length(my.probs))
my.model <- mle2(minuslogl = my.probs ~ dbeta(shape1, shape2),
                 start = list(shape1 = 2, shape2 = 30),
                 data = data.frame(x, y))
# There were 12 warnings (use warnings() to see them)
summary(my.model)

# Maximum likelihood estimation
# 
# Call:
# mle2(minuslogl = my.probs ~ dbeta(shape1, shape2), start = list(shape1 = 2, 
#     shape2 = 30), data = data.frame(x, y))
# 
# Coefficients:
#        Estimate Std. Error z value     Pr(z)    
# shape1 2.060660   0.087702  23.496 < 2.2e-16 ***
# shape2 2.116721   0.090365  23.424 < 2.2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# -2 log L: -277.8204
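
As an aside, I suspect the placeholder x and y columns are not actually needed; if I understand the mle2 formula interface correctly, the response can be supplied in the data argument directly, something like the call below, although I have not verified that this is equivalent.

my.model <- mle2(my.probs ~ dbeta(shape1, shape2),
                 start = list(shape1 = 2, shape2 = 30),
                 data = data.frame(my.probs = my.probs))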

Here is my optim function. I admit I am kind of guessing here as I have no experience with beta estimation. The estimates do not appear to be good and I get warnings.

beta.reg = function(betas, my.probs){
     b0    = betas[1]
     alpha = betas[2]
     beta  = betas[3]
     n <- length(my.probs)
     llh <- rep(0, n)

     for(i in 1:n){
          r   = my.probs[i] - b0
          llh[i] = llh[i] + r^2
     }

      -sum(dbeta(llh, alpha, beta, log = TRUE))
}

alpha.int <- alpha+1
beta.int  <- beta-1

my.model <- optim(c(mean(my.probs), alpha.int, beta.int), 
                 beta.reg, my.probs = my.probs, method = "BFGS", hessian=TRUE)
# There were 41 warnings (use warnings() to see them)

my.model$par
#[1]  0.4901505  0.5561770 11.0689391

mean(my.probs)

my.model$value
#[1] -2170.605

covmat <- solve(my.model$hessian)

mySE <- diag(covmat)^0.5
mySE
#[1] 0.0007311783 0.0207619704 0.6066361917

Thank you for any help with this.

Mark Miller

1 Answer


I think I have managed to obtain virtually identical estimates of alpha and beta with optim, the R package bbmle and the R package betareg. However, with betareg I probably need to use the delta method to get the SEs of alpha and beta.

First I generate some data, this time using different values of alpha and beta than used in my original post.

set.seed(4321)
n.samples <- 1000
alpha <- 3
beta  <- 5
my.probs <- rbeta(n.samples, alpha, beta)
mean(my.probs)
#[1] 0.3808036

Here is my R code with optim. I no longer attempt to estimate b0. I do still get warnings.

beta.reg = function(betas, my.probs){
     alpha = betas[1]
     beta  = betas[2]
     n <- length(my.probs)
     llh <- rep(0, n)

     for(i in 1:n){
          llh[i] = llh[i] + dbeta(my.probs[i], alpha, beta, log = TRUE)
     }

     -sum(llh)
}
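
As an aside, the loop can be replaced with a vectorized one-liner that returns the same negative log-likelihood:

beta.reg.vec = function(betas, my.probs){
     -sum(dbeta(my.probs, shape1 = betas[1], shape2 = betas[2], log = TRUE))
}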

alpha.int <- alpha+1
beta.int  <- beta-1

my.model <- optim(c(alpha.int, beta.int), 
                 beta.reg, my.probs = my.probs, method = "BFGS", hessian=TRUE)
#There were 50 or more warnings (use warnings() to see the first 50)
my.model$par
#[1] 3.058483 4.979886

mean(my.probs)

my.model$value
#[1] -428.0194

2 * my.model$value
#[1] -856.0388

covmat <- solve(my.model$hessian)

mySE <- diag(covmat)^0.5
mySE
#[1] 0.1308876 0.2199231
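
I believe the warnings come from BFGS occasionally trying negative values of alpha or beta, for which dbeta returns NaN. A possible workaround (a sketch I have not explored in depth) is to optimize over log(alpha) and log(beta) so the shape parameters stay positive; note the hessian would then be on the log scale, so the SEs would need a delta-method back-transformation.

# optimize over log(alpha) and log(beta) so the shapes are always positive
beta.reg.log = function(log.betas, my.probs){
     alpha <- exp(log.betas[1])
     beta  <- exp(log.betas[2])
     -sum(dbeta(my.probs, alpha, beta, log = TRUE))
}

my.model.log <- optim(log(c(alpha.int, beta.int)), beta.reg.log,
                      my.probs = my.probs, method = "BFGS", hessian = TRUE)

exp(my.model.log$par)   # back-transform to the alpha/beta scale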

Here is the R code for the bbmle package. I still get warnings and I still must use a placeholder data.frame. However, the point estimates, SEs and log-likelihood are virtually identical to those estimated above by optim.

# install.packages("bbmle")
library("bbmle")

x <- rep(1, length(my.probs))
y <- rep(1, length(my.probs))
my.model <- mle2(my.probs ~ dbeta(shape1, shape2),
                 start = list(shape1 = 4, shape2 = 4),
                 data = data.frame(x, y))
# Warning messages:
# 1: In dbeta(x = c(0.295375313127293, 0.332344598431757, 0.525281773592328,  :
#   NaNs produced
# 2: In dbeta(x = c(0.295375313127293, 0.332344598431757, 0.525281773592328,  :
#   NaNs produced
# 3: In dbeta(x = c(0.295375313127293, 0.332344598431757, 0.525281773592328,  :
#   NaNs produced
# 4: In dbeta(x = c(0.295375313127293, 0.332344598431757, 0.525281773592328,  :
#   NaNs produced

summary(my.model)

# Maximum likelihood estimation
# 
# Call:
# mle2(minuslogl = my.probs ~ dbeta(shape1, shape2), start = list(shape1 = 4, 
# shape2 = 4), data = data.frame(x, y)) 
# 
# Coefficients:
#        Estimate Std. Error z value     Pr(z)    
# shape1  3.05848    0.13089  23.367 < 2.2e-16 ***
# shape2  4.97989    0.21992  22.644 < 2.2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# -2 log L: -856.0388

Here is the R code for betareg. I still get warnings with this approach. This time I actually convert φ and µ into alpha and beta, but I do not attempt to obtain their SEs.

# install.packages('betareg')
library(betareg)

my.model <- betareg(formula = my.probs ~ 1)
summary(my.model)
# Call:
# betareg(formula = my.probs ~ 1)
# 
# Standardized weighted residuals 2:
#     Min      1Q  Median      3Q     Max 
# -3.8388 -0.6244  0.0350  0.6631  2.9274 
# 
# Coefficients (mean model with logit link):
#             Estimate Std. Error z value Pr(>|z|)    
# (Intercept) -0.48749    0.02157   -22.6   <2e-16 ***
# 
# Phi coefficients (precision model with identity link):
#       Estimate Std. Error z value Pr(>|z|)    
# (phi)   8.0384     0.3406    23.6   <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
# 
# Type of estimator: ML (maximum likelihood)
# Log-likelihood:   428 on 2 Df
# Number of iterations: 10 (BFGS) + 1 (Fisher scoring) 
# Warning message:
# In deparse(x$call, width.cutoff = floor(getOption("width") * 0.85)) :
#   invalid 'cutoff' value for 'deparse', using default


# Convert phi and µ to alpha and beta
# µ = p/(p + q) and φ = p + q
#
# p = φ - q

my.mean <- exp(-0.48749) / (1 + exp(-0.48749))
my.mean
#[1] 0.380485

my.phi <- 8.0384

my.est.q <- -((my.mean * my.phi) - my.phi)
my.est.q
#[1] 4.979909

my.est.p <- my.phi - my.est.q
my.est.p
#[1] 3.058491
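
Here is a rough sketch of the delta method mentioned above for getting approximate SEs of alpha and beta out of the betareg fit. I am treating the fitted parameters as (η, φ) with µ = exp(η)/(1 + exp(η)), alpha = µφ and beta = (1 - µ)φ; I have not double-checked this carefully.

est <- coef(my.model)        # c(logit(mu), phi) from the betareg fit
V   <- vcov(my.model)        # 2 x 2 covariance matrix of those two parameters

eta <- est[1]
phi <- est[2]
mu  <- exp(eta) / (1 + exp(eta))

# gradients of alpha = mu * phi and beta = (1 - mu) * phi with respect to (eta, phi)
g.alpha <- c(phi * mu * (1 - mu),  mu)
g.beta  <- c(-phi * mu * (1 - mu), 1 - mu)

sqrt(t(g.alpha) %*% V %*% g.alpha)   # approximate SE of alpha
sqrt(t(g.beta)  %*% V %*% g.beta)    # approximate SE of beta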

EDIT - August 27, 2023

I just discovered the fitdistr function in the MASS package. This function returns virtually the same estimates of alpha and beta as the other approaches above. It also returns warnings but I think I can ignore those.

# generate data
set.seed(4321)
n.samples <- 1000
alpha <- 3
beta  <- 5
my.probs <- rbeta(n.samples, alpha, beta)
mean(my.probs)
#[1] 0.3808036
library(MASS)
beta_par <- fitdistr(my.probs, dbeta, 
                     start = list(shape1 = 1, shape2 = 1))
# Warning messages:
# 1: In densfun(x, parm[1], parm[2], ...) : NaNs produced
# 2: In densfun(x, parm[1], parm[2], ...) : NaNs produced
# 3: In densfun(x, parm[1], parm[2], ...) : NaNs produced
# 4: In densfun(x, parm[1], parm[2], ...) : NaNs produced
beta_par
#    shape1      shape2  
#  3.0584817   4.9798850 
# (0.1308876) (0.2199230)
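
If the NaN warnings are bothersome, I believe fitdistr also accepts lower (and upper) bounds, which are passed on to optim and switch it to L-BFGS-B; constraining the shapes to be positive should avoid the NaNs. Something like the following, although I have not verified that this is the recommended usage:

# hedged alternative: constrain the shape parameters to be positive
beta_par2 <- fitdistr(my.probs, dbeta,
                      start = list(shape1 = 1, shape2 = 1),
                      lower = c(0.001, 0.001))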

EDIT - August 29, 2023

Here is a nice write-up on Cross Validated about a package called fitdistrplus. The package returns estimates very close to those obtained with the other approaches above and does not produce any warnings.

https://stats.stackexchange.com/questions/376634/how-to-pick-starting-parameters-for-massfitdist-with-the-beta-distribution/376638#376638

# generate data
set.seed(4321)
n.samples <- 1000
alpha <- 3
beta  <- 5
my.probs <- rbeta(n.samples, alpha, beta)
mean(my.probs)
#[1] 0.3808036

# install.packages('fitdistrplus')
library(fitdistrplus)
fitdist(my.probs, "beta")

fitdist(my.probs, "beta")
# Fitting of the distribution ' beta ' by maximum likelihood 
# Parameters:
#        estimate Std. Error
# shape1 3.058574  0.1308916
# shape2 4.980256  0.2199399
Mark Miller
  • you don't need to indicate edits in your answer - just make sure the final (edited) version is coherent. – Ben Bolker Aug 29 '23 at 20:06
  • PS doing beta regression on the (alpha, beta) scale seems a little weird (i.e. `alpha = X_alpha %*% beta_alpha`, `beta = X_beta %*% beta_beta`), if that's what you had in mind? There's a reason that beta regression is usually done on a mean/logit-link + dispersion/log-link scale ... – Ben Bolker Aug 29 '23 at 20:16
  • @BenBolker Thanks, Ben. I have never used beta regression. My next step is to figure it out. So, I removed that last statement from my answer. – Mark Miller Aug 30 '23 at 00:47