Background
Hi everyone! I am currently working on a project that requires me to estimate within-study variance by bootstrapping model residuals and then calculating the SEE for each bootstrap sample. This has to be done on a model-by-model basis.
I started by creating a list of dataframes, split on the factor variable model, using the following code:
list.meta<- split(new.meta, new.meta$model)
Each dataframe contains the data pertaining to a single model. I have supplied a reproducible example below, limited to 3 models; my full dataset contains 13. From there I have two user-defined functions: one for calculating the SEE, and another which generates 1000 bootstrapped samples and calculates the SEE for each using the previously defined SEE function. I have supplied both below for transparency.
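To make the splitting step concrete, here is a minimal sketch on a toy dataframe. The values and the lowercase column names are made up for illustration and are not the real data; only the split() call mirrors the one above.

```r
# Toy stand-in for new.meta (hypothetical values, not the real data)
new.meta <- data.frame(
  study     = c(1L, 1L, 1L, 1L, 1L, 1L),
  model     = factor(c(1, 1, 2, 2, 3, 3)),
  residuals = c(2.5, 1.0, 3.2, 0.4, 1.8, 0.9)
)

# One dataframe per level of the grouping variable
list.meta <- split(new.meta, new.meta$model)

names(list.meta)         # "1" "2" "3"
sapply(list.meta, nrow)  # number of rows held by each element
```

Each element of list.meta is an ordinary dataframe, so its columns can be reached with list.meta[[i]]$residuals or list.meta[["2"]]$residuals.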
User defined functions
# Define SEE function
SEE <- function(x) {
  sqrt((sum(x) / (length(x) - 2))^2)
}

# Generate 1000 bootstrap samples of x and calculate the SEE for each
Bootstrap <- function(x) {
  # Resample x with replacement, 1000 times
  int <- lapply(1:1000, function(i) sample(x, replace = TRUE))
  # Return a numeric vector of 1000 SEE values
  sapply(int, SEE)
}
where x is the Residuals column from a given dataframe i.
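As a quick illustration of what Bootstrap returns for a single vector, the sketch below calls it on a handful of residuals. The two functions are restated so the snippet runs on its own; the seed is arbitrary, and the input values are the first five Model 1 residuals rounded to two decimals, used only for the demonstration.

```r
SEE <- function(x) sqrt((sum(x) / (length(x) - 2))^2)

Bootstrap <- function(x) {
  int <- lapply(1:1000, function(i) sample(x, replace = TRUE))
  sapply(int, SEE)
}

set.seed(42)  # arbitrary seed, only so the run is repeatable

# Illustrative input: first five Model 1 residuals, rounded
res <- c(26.97, 24.35, 15.74, 15.71, 13.23)

boot.see <- Bootstrap(res)

length(boot.see)      # 1000, one SEE per bootstrap sample
is.numeric(boot.see)  # TRUE
summary(boot.see)     # spread of the bootstrapped SEE values
```

So one call on one Residuals vector yields one numeric vector of length 1000, which is the building block for a per-model column.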
Data
list(`1` = structure(list(Study = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Model = c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), Residuals = c(26.96774194, 24.35483871, 15.74193548, 15.70967742,
13.22580645, 12.87096774, 11.77419355, 10.67741935, 10.58064516,
8.548387097, 8, 5.548387097, 5.35483871, 5.322580645, 2.612903226,
1.483870968, 1.225806452, 0.258064516)), row.names = c(NA, 18L
), class = "data.frame"), `2` = structure(list(Study = c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), Model = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L), Residuals = c(20.19354839, 16.5483871, 15.74193548,
14.61290323, 7.064516129, 6.580645161, 5.64516129, 4.580645161,
4.612903226, 3.612903226, 3.35483871, 2.741935484, 2.419354839,
1.64516129, 1.35483871, 1.903225806, 0.516129032)), row.names = 19:35, class = "data.frame"),
`3` = structure(list(Study = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Model = c(3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), Residuals = c(23.80645161, 17.41935484, 15.58064516, 13.22580645,
11.32258065, 10.4516129, 6.709677419, 6.193548387, 5.741935484,
4.870967742, 4.322580645, 2.709677419, 2.677419355, 1.032258065,
1.129032258, 0.451612903, 1.064516129)), row.names = 36:52, class = "data.frame"))
Problem/question
So, here's my problem: I need to apply the Bootstrap function to the Residuals column of each model. The output should eventually be either a list of length 13 (where each element is a vector of 1000 SEE values) or a dataframe/matrix with 13 columns and 1000 rows. The second is preferable, as it will be used for further analyses and the package takes a dataframe as input.
I imagine one of the best ways to do this would be either through a for loop or through one of the functions from the apply family. However, as far as syntax goes, I have no idea how to actually apply the function to a specific column of each dataframe when these are nested in a list.
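One possible shape for the apply-family route is sketched below: sapply walks over the list, an anonymous function pulls the Residuals column out of each dataframe, and, because every call returns a vector of the same length, the results are simplified into a matrix. The three small dataframes are hypothetical stand-ins (five rounded residuals each), and referring to the column by name rather than by position is an assumption about how the real list is laid out.

```r
SEE <- function(x) sqrt((sum(x) / (length(x) - 2))^2)

Bootstrap <- function(x) {
  int <- lapply(1:1000, function(i) sample(x, replace = TRUE))
  sapply(int, SEE)
}

# Hypothetical stand-in for the list of per-model dataframes
list.meta <- list(
  `1` = data.frame(Residuals = c(26.97, 24.35, 15.74, 15.71, 13.23)),
  `2` = data.frame(Residuals = c(20.19, 16.55, 15.74, 14.61, 7.06)),
  `3` = data.frame(Residuals = c(23.81, 17.42, 15.58, 13.23, 11.32))
)

set.seed(123)  # arbitrary seed

# The function receives one whole dataframe at a time;
# the column is selected inside it
see.mat <- sapply(list.meta, function(df) Bootstrap(df$Residuals))

dim(see.mat)       # 1000 3
colnames(see.mat)  # "1" "2" "3"

# Convert if a dataframe is required downstream
see.df <- as.data.frame(see.mat)
```

Swapping sapply for lapply would give the list form instead (one vector of 1000 values per model). With 13 dataframes in the list, the same call would give 13 columns.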
What I have tried
- Attempt one was to use the lapply function.
dat<- lapply(na.omit(new.data[[i]][, 4]), Bootstrap)
The [[i]][, 4] is my attempt at telling R to use the fourth column of the ith element of the list. This partially worked but returned a list of length 18? Some of the list elements also made no sense.
- The second option I'm working on is to use a for loop:
for (i in 1:seq_along(new.data)){
result<- Bootstrap(new.data[[i]][,4])
return(result)
}
but this returns an error
In 1:seq_along(new.data) :
numerical expression has 13 elements: only the first used
I also have no idea how to actually save the results into list or matrix format, and my for loop skills could use more work... so there's that.
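For comparison, here is a hedged sketch of how the loop version could be arranged. Three things differ from the attempts above: seq_along() already returns 1, 2, ..., n, so it is not wrapped in 1:...; the result is written into a preallocated matrix instead of being passed to return(), which is only valid inside a function; and the whole column is handed to Bootstrap in one go. Passing a single column to lapply makes lapply iterate over its individual values, which would explain a list of length 18 for a dataframe with 18 rows. The toy list and the use of the column name Residuals are assumptions for illustration.

```r
SEE <- function(x) sqrt((sum(x) / (length(x) - 2))^2)

Bootstrap <- function(x) {
  int <- lapply(1:1000, function(i) sample(x, replace = TRUE))
  sapply(int, SEE)
}

# Hypothetical stand-in for the list of per-model dataframes
new.data <- list(
  `1` = data.frame(Residuals = c(26.97, 24.35, 15.74, 15.71, 13.23)),
  `2` = data.frame(Residuals = c(20.19, 16.55, 15.74, 14.61, 7.06)),
  `3` = data.frame(Residuals = c(23.81, 17.42, 15.58, 13.23, 11.32))
)

# Preallocate: one row per bootstrap sample, one column per model
result <- matrix(NA_real_, nrow = 1000, ncol = length(new.data),
                 dimnames = list(NULL, names(new.data)))

set.seed(1)  # arbitrary seed

for (i in seq_along(new.data)) {
  x <- na.omit(new.data[[i]]$Residuals)  # whole column as one vector
  result[, i] <- Bootstrap(x)            # fill column i
}

dim(result)  # 1000 3
```

Preallocating the matrix avoids growing an object inside the loop, and as.data.frame(result) converts it if a dataframe is needed.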
There's probably going to be a very simple answer to this, so thank you in advance for any and all suggestions. I really need to up my hours practicing coding :)