0

I would like to run a bootstrap of a weighted mean in a for loop (I don’t think I can use ‘apply’ because it concerns a weighted mean). I would only need to store the resulting standard errors in a dataframe. Another post provided the code for how to calculate the weighted mean in a bootstrap (bootstrap weighted mean in R), and works perfectly:

library(boot)
 
mtcarsdata = mtcars #dataframe for data 
mtcarsweights = rev(mtcars) #dataframe for weights

samplewmean <- function(d, i, j) {
  d <- d[i, ]
  w <- j[i, ]
  return(weighted.mean(d, w))   
}

results_qsec <- sd(boot(data= mtcarsdata[, 6, drop = FALSE], 
                     statistic = samplewmean, 
                     R=10000, 
                     j = mtcarsweights[, 6 , drop = FALSE])[[2]], na.rm=T)
results_qsec

To then run it in a loop, I tried:

outputboot = matrix(NA, nrow=11, ncol=1)
for (k in 1:11){
  outputboot[1,k] = sd(boot(data= mtcarsdata[, k, drop = FALSE], 
                       statistic = samplewmean, 
                       R=10000, 
                       j = mtcarsweights[, k, drop = FALSE])[[2]], na.rm=T)
}
outputboot

But this doesn't work. The first output isn’t even correct. I suspect the code can’t work with two iterators: one for looping over the columns and the other for the sampling with replacement.

I hope anyone could offer some help.

Anne
  • 19
  • 4
  • So you basically want to not only bootstrap one column, but all mtcars columns and get their weighted mean? – deschen Dec 10 '21 at 08:29
  • Whenever you can calculate sth. in one step for each column you can use `apply`, `map` and friends. This should be the case for the weighted mean – danlooo Dec 10 '21 at 08:29

1 Answers1

1

This will calculate the standard deviation of all bootstraps for each column of the table mtcarsdata weighted by mtcarsweights.

Since we can calculate the result in one step, we can use apply and friends (here: purrr:map_dbl)

library(boot)
library(purrr)

set.seed(1337)

mtcarsdata <- mtcars # dataframe for data
mtcarsweights <- rev(mtcars) # dataframe for weights

samplewmean <- function(d, i, j) {
  d <- d[i, ]
  w <- j[i, ]
  return(weighted.mean(d, w))
}

mtcarsdata %>%
  ncol() %>%
  seq() %>%
  map_dbl(~ {
    # .x is the number of the current column
    sd(boot(
      data = mtcarsdata[, .x, drop = FALSE],
      statistic = samplewmean,
      R = 10000,
      j = mtcarsweights[, .x, drop = FALSE]
    )[[2]], na.rm = T)
  })
#>  [1]  0.90394218  0.31495232 23.93790468  6.34068205  0.09460257  0.19103196
#>  [7]  0.33131814  0.07487754  0.07745781  0.13477355  0.27240347

Created on 2021-12-10 by the reprex package (v2.0.1)

danlooo
  • 10,067
  • 2
  • 8
  • 22