How to write a for loop to compute max of each column for a dataset in R?

Question

This is a follow-up to a similar question, "How to write a double for loop in r with choosing maximal element in one loop?".

The same setup:

I want to first sample theta[j] for j = 1, 2, ..., 71, and then draw replicated data (say, 1000 times) yrep[k] from Bin(n[j], theta[j]), where n[j] is known.

For theta[1], we have yrep[1,1], yrep[1,2], ..., yrep[1,1000]. Doing this for all theta[j] gives a matrix yrep[j,k], j = 1, ..., 71, k = 1, ..., 1000. I then want to compute the mean, max, or min of each column yrep[1,k], yrep[2,k], ..., yrep[71,k], which gives 1000 means, maxima, or minima.

How do I write this for loop?

I first tried to write a loop that samples theta[j] and yrep. I do not know how to add code to this loop that computes the max, mean, and min, and I am not sure whether this code is right:

theta<-NULL
yrep<-NULL
test<-NULL
k=1
for(i in 1:1000){
  for(j in 1:71){
    theta[j] <- rbeta(1,samp_A+y[j], samp_B+n[j]-y[j])
    yrep[k]<-rbinom(1, n[j], theta[j])
    k=k+1
  }
  t<-c(test, max(yrep))
}

The data are given in "How to write a double for loop in r with choosing maximal element in one loop?":

library(dplyr)  # provides %>%
library(tidyr)  # provides gather()

# Data
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,
       2,1,5,2,5,3,2,7,7,3,3,2,9,10,4,4,4,4,4,4,4,10,4,4,4,5,11,12,
       5,5,6,5,6,6,6,6,16,15,15,9,4)
n <- c(20,20,20,20,20,20,20,19,19,19,19,18,18,17,20,20,20,20,19,19,18,18,25,24,
       23,20,20,20,20,20,20,10,49,19,46,27,17,49,47,20,20,13,48,50,20,20,20,20,
       20,20,20,48,19,19,19,22,46,49,20,20,23,19,22,20,20,20,52,46,47,24,14)

# Evaluate densities on a grid
x <- seq(0.0001, 0.9999, length.out = 1000)

# Compute the marginal posterior of alpha and beta in the hierarchical model on a grid
A <- seq(0.5, 15, length.out = 100)
B <- seq(0.3, 45, length.out = 100)

# Make vectors that contain all pairwise combinations of A and B
cA <- rep(A, each = length(B))
cB <- rep(B, length(A))

# Use logarithms for numerical accuracy
lpfun <- function(a, b, y, n) log(a+b)*(-5/2) +
  sum(lgamma(a+b)-lgamma(a)-lgamma(b)+lgamma(a+y)+lgamma(b+n-y)-
      lgamma(a+b+n))
lp <- mapply(lpfun, cA, cB, MoreArgs = list(y = y, n = n))

# Subtract the maximum value to avoid over/underflow in exponentiation
df_marg <- data.frame(x = cA, y = cB, p = exp(lp - max(lp)))

# Sample from the grid (with replacement)
nsamp <- 100
samp_indices <- sample(length(df_marg$p), size = nsamp,
                       replace = TRUE, prob = df_marg$p/sum(df_marg$p))
samp_A <- cA[samp_indices[1:nsamp]]
samp_B <- cB[samp_indices[1:nsamp]]
df_psamp <- mapply(function(a, b, x) dbeta(x, a, b),
                   samp_A, samp_B, MoreArgs = list(x = x)) %>%
  as.data.frame() %>% cbind(x) %>% gather(ind, p, -x)

See [this post](https://stackoverflow.com/questions/13676878/fastest-way-to-get-min-from-every-column-in-a-matrix) on finding min (and max) of each column. — Rui Barradas, Oct 17 '20 at 06:00
The code is not reproducible, `samp_A`, `samp_B`, `n` and `y` are missing. — Rui Barradas, Oct 17 '20 at 06:02
@RuiBarradas But I do not have the matrix. I just want to know how to write a loop to get the mean, max, and minimal. — Hermi, Oct 17 '20 at 06:07
To get the means by column, base R has `colMeans`. For mins and maxs, package `matrixStats` functions `colMins` and `colMaxs`. — Rui Barradas, Oct 17 '20 at 06:36
@RuiBarradas Can you tell me how to sample a matrix for my for loop? — Hermi, Oct 17 '20 at 07:23

score 0 · Accepted Answer · answered Oct 17 '20 at 09:33

This is not very well tested.
There is no need for loops to sample from the distributions included in base R; those functions are vectorized over their arguments. Code along the following lines should be able to do what the question asks for.

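Ni <- 1000  # replicates per group
Nj <- 71    # number of groups, i.e. length(y)

# Assumes samp_A and samp_B are scalars or have length Nj.
# rep(..., each = Ni) keeps the Ni draws for one group together.
theta <- rbeta(Ni*Nj, rep(samp_A + y, each = Ni), rep(samp_B + n - y, each = Ni))
yrep <- rbinom(Ni*Nj, rep(n, each = Ni), theta)
test1 <- matrix(yrep, nrow = Ni)      # Ni x Nj: one column per group
mins1 <- matrixStats::colMins(test1)  # one minimum per column; rowMins() gives one per replicate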
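
For the explicit loop the question asks about, one option is to preallocate a 71 x 1000 matrix and fill one column per replicate, recording the maximum as you go. This is only a sketch: `a` and `b` are assumed placeholder values standing in for a single draw of `samp_A` and `samp_B`, and the names `n_groups`, `n_rep`, `col_max`, `col_min` and `col_mean` are illustrative, not from the original post.

```r
set.seed(42)  # fixed seed so the sketch is repeatable

# Data from the question (71 groups)
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,
       2,1,5,2,5,3,2,7,7,3,3,2,9,10,4,4,4,4,4,4,4,10,4,4,4,5,11,12,
       5,5,6,5,6,6,6,6,16,15,15,9,4)
n <- c(20,20,20,20,20,20,20,19,19,19,19,18,18,17,20,20,20,20,19,19,18,18,25,24,
       23,20,20,20,20,20,20,10,49,19,46,27,17,49,47,20,20,13,48,50,20,20,20,20,
       20,20,20,48,19,19,19,22,46,49,20,20,23,19,22,20,20,20,52,46,47,24,14)

# ASSUMPTION: placeholder scalars standing in for one draw of (samp_A, samp_B);
# these values are not taken from the original post.
a <- 2.4
b <- 14.3

n_groups <- length(y)  # 71
n_rep <- 1000

# Preallocate: rows are groups j, columns are replicates k
yrep <- matrix(NA_integer_, nrow = n_groups, ncol = n_rep)
col_max <- numeric(n_rep)

for (k in seq_len(n_rep)) {
  theta <- rbeta(n_groups, a + y, b + n - y)  # one theta per group
  yrep[, k] <- rbinom(n_groups, n, theta)     # one replicated data set
  col_max[k] <- max(yrep[, k])                # max of column k
}

# Loop-free column summaries over the finished matrix
col_min <- apply(yrep, 2, min)
col_mean <- colMeans(yrep)
```

Each of `col_max`, `col_min` and `col_mean` has length 1000, one entry per replicated data set. If `matrixStats` is installed, `matrixStats::colMaxs(yrep)` and `matrixStats::colMins(yrep)` should give the same maxima and minima without `apply`.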
