This is similar to the question "How to write a double for loop in r with choosing maximal element in one loop?".

The same setup:

I want to first sample theta[j] for j = 1, 2, ..., 71, and then draw replicates (say 1000 of them) yrep[k] from Bin(n[j], theta[j]), where n[j] is known.

For theta[1], we have yrep[1,1], yrep[1,2], ..., yrep[1,1000]. Doing this for every theta[i] gives a matrix yrep[i,j], i = 1, ..., 71, j = 1, ..., 1000. Then I want to compute the mean, max, or min of each column yrep[1,j], yrep[2,j], ..., yrep[71,j], which gives 1000 means, maxima, or minima.

How do I write this for loop?

I first tried to write a loop to sample theta[j] and yrep. I do not know how to add code that computes the max, mean, and min inside this loop, and I am not sure whether this code is right:

theta<-NULL
yrep<-NULL
test<-NULL
k=1
for(i in 1:1000){
  for(j in 1:71){
    theta[j] <- rbeta(1,samp_A+y[j], samp_B+n[j]-y[j])
    yrep[k]<-rbinom(1, n[j], theta[j])
    k=k+1
  }
  t<-c(test, max(yrep))
}

The data is given in "How to write a double for loop in r with choosing maximal element in one loop?":

library(dplyr)  # provides %>%
library(tidyr)  # provides gather()

# Data
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,
       2,1,5,2,5,3,2,7,7,3,3,2,9,10,4,4,4,4,4,4,4,10,4,4,4,5,11,12,
       5,5,6,5,6,6,6,6,16,15,15,9,4)
n <- c(20,20,20,20,20,20,20,19,19,19,19,18,18,17,20,20,20,20,19,19,18,18,25,24,
       23,20,20,20,20,20,20,10,49,19,46,27,17,49,47,20,20,13,48,50,20,20,20,20,
       20,20,20,48,19,19,19,22,46,49,20,20,23,19,22,20,20,20,52,46,47,24,14)

# Evaluate densities on a grid
x <- seq(0.0001, 0.9999, length.out = 1000)

# Compute the marginal posterior of alpha and beta in the hierarchical model, using a grid
A <- seq(0.5, 15, length.out = 100)
B <- seq(0.3, 45, length.out = 100)

# Make vectors that contain all pairwise combinations of A and B
cA <- rep(A, each = length(B))
cB <- rep(B, length(A))

# Use logarithms for numerical accuracy
lpfun <- function(a, b, y, n) log(a+b)*(-5/2) +
  sum(lgamma(a+b)-lgamma(a)-lgamma(b)+lgamma(a+y)+lgamma(b+n-y)-
      lgamma(a+b+n))
lp <- mapply(lpfun, cA, cB, MoreArgs = list(y, n))

# Subtract the maximum value to avoid over/underflow in exponentiation
df_marg <- data.frame(x = cA, y = cB, p = exp(lp - max(lp)))

# Sample from the grid (with replacement)
nsamp <- 100
samp_indices <- sample(length(df_marg$p), size = nsamp,
                       replace = TRUE, prob = df_marg$p/sum(df_marg$p))
samp_A <- cA[samp_indices[1:nsamp]]
samp_B <- cB[samp_indices[1:nsamp]]
df_psamp <- mapply(function(a, b, x) dbeta(x, a, b),
                   samp_A, samp_B, MoreArgs = list(x = x)) %>%
  as.data.frame() %>% cbind(x) %>% gather(ind, p, -x)
Hermi

1 Answer

This is not thoroughly tested.
There is no need for loops to sample from the distributions included in base R, because those functions are vectorized over their arguments. Code along the following lines should do what the question asks for.

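Ni <- 1000  # replications
Nj <- 71    # groups, i.e. length(y)

# samp_A and samp_B hold 100 draws; as with rbeta(1, ...) in the question's
# loop, only the first draw is used here
a <- samp_A[1]
b <- samp_B[1]

theta <- rbeta(Ni*Nj, rep(a + y, each = Ni), rep(b + n - y, each = Ni))
yrep <- rbinom(Ni*Nj, rep(n, each = Ni), theta)
test1 <- matrix(yrep, nrow = Ni)      # row = replication, column = group
mins1 <- matrixStats::rowMins(test1)  # one minimum per replication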
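
As a self-contained illustration of the same idea, here is a minimal sketch that computes the max, min, and mean per replication with base R only. The short `y` and `n` vectors and the values of `a` and `b` are made-up stand-ins so that the snippet runs on its own (assumptions, not the question's data or posterior draws), and `Nrep`, `J`, `rep_max`, `rep_min`, and `rep_mean` are illustrative names.

```r
# Hypothetical stand-ins so the sketch runs on its own; in the question these
# come from the data and from the grid sampling step.
set.seed(1)
y <- c(0, 1, 2, 5, 9)        # illustrative success counts
n <- c(20, 20, 25, 49, 48)   # illustrative trial counts
a <- 2.4                     # assumed single draw of alpha
b <- 14.3                    # assumed single draw of beta

Nrep <- 1000                 # number of replications
J <- length(y)               # number of groups (71 in the question)

# Row i holds replication i, column j holds group j.
# matrix() fills column by column, so each group's parameters are
# repeated Nrep times in a row (each = Nrep).
theta <- matrix(
  rbeta(Nrep * J, rep(a + y, each = Nrep), rep(b + n - y, each = Nrep)),
  nrow = Nrep
)
yrep <- matrix(
  rbinom(Nrep * J, rep(n, each = Nrep), as.vector(theta)),
  nrow = Nrep
)

# One summary per replication (each of length Nrep).
rep_max  <- apply(yrep, 1, max)
rep_min  <- apply(yrep, 1, min)
rep_mean <- rowMeans(yrep)
```

With the question's data, `y` and `n` would be the full 71-element vectors, and `a` and `b` would be whichever draw of `samp_A` and `samp_B` is intended. `matrixStats::rowMaxs()` and `matrixStats::rowMins()` can replace the `apply()` calls if speed matters.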
Rui Barradas