I would like to bootstrap a large data set which contains multiple column and row variables. The following is a simplified re-creation of my data set:
charDataDiff <- data.frame(c('A','B','C'), matrix(1:72, nrow=9))
colnames(charDataDiff) <- c("patchId","s380","s390","s400","s410","s420","s430","s440","s450")
Separate the data using patchId as the criterion. This creates a list of three data frames, one for each patch:
idColor <- c("A", "B", "C")
(patchSpectrum <- lapply(idColor, function(idColor) charDataDiff[charDataDiff$patchId==idColor,]))
I then created the function sampleBoot to sample patchSpectrum:
sampleBoot <- function(nbootstrap = 2, patch = 3) {
  # Return nbootstrap resamples (rows drawn with replacement) of one patch
  lapply(seq_len(nbootstrap), function(i) {
    patchSpectrum[[patch]][sample(nrow(patchSpectrum[[patch]]), replace = TRUE), ]
  })
}
Example:
sampleBoot(5,3)
Here is where I am stuck:
- I need to sample each patchId list along with each column variable (which the above sampleBoot easily accomplishes),
- take the median of each patchId sampling list iteration, and
- create a new population of the medians to calculate parametric parameters.

I can do it manually, but that would be silly.
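To make the three steps concrete, here is a minimal sketch of what I am after, using only base R. The helper name bootMedians, the object names medianPop, medianMean and medianSd, and the choice of 1000 resamples are placeholders I made up for illustration, and I am assuming that "median of each iteration" means the per-column median of the resampled rows.

```r
# Re-create the example data from the question.
charDataDiff <- data.frame(patchId = rep(c("A", "B", "C"), times = 3),
                           matrix(1:72, nrow = 9))
colnames(charDataDiff) <- c("patchId", paste0("s", seq(380, 450, by = 10)))

# Split into a named list with one data frame per patchId
# (the patchId column itself is dropped, leaving only the numeric columns).
patchSpectrum <- split(charDataDiff[, -1], charDataDiff$patchId)

# Hypothetical helper: for one patch, resample its rows with replacement
# `nbootstrap` times and take the median of every column each time.
# Returns a matrix with one row per resample and one column per variable.
bootMedians <- function(patch, nbootstrap = 1000) {
  t(replicate(nbootstrap, {
    idx <- sample(nrow(patch), replace = TRUE)
    apply(patch[idx, , drop = FALSE], 2, median)
  }))
}

set.seed(1)  # only so the sketch is reproducible
medianPop <- lapply(patchSpectrum, bootMedians, nbootstrap = 1000)

# Summaries of the population of medians, per patch (rows) and column.
medianMean <- t(sapply(medianPop, colMeans))
medianSd   <- t(sapply(medianPop, function(m) apply(m, 2, sd)))
```

Here medianPop$A would hold the population of medians for patch A, and medianMean and medianSd are 3 x 8 matrices, which may or may not be the parametric summaries needed.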