I am trying to block bootstrap a dataset in R. I have a data frame of firms in counties. I want to sample counties with replacement, build a dataset containing all firms in the sampled counties (so a county drawn twice contributes its firms twice), and run a regression on the new dataset. Then I sample again.

I have a for loop that works like so:

a=NULL
counties=unique(data$county_id)
for(j in 1:10000){
    # draw the counties for this replicate once, with replacement
    drawn=sample(counties, replace=T)
    y=NULL
    for(i in 1:length(drawn)){
        y=rbind(y, data[which(data$county_id==drawn[i]),])
    }
    a=rbind(a, lm(profit~employees, data=y)$coefficients)
}

Unfortunately, this sort of for loop in R is extremely slow and computationally expensive. Is it possible to implement this using a more efficient apply function?

DannyMatt

1 Answer

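Something like this could help. Here yvar and xvar stand for columns of df, and each column of positions is one resample of row indices, so this resamples individual rows and would need adapting to draw whole counties: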

positions <- replicate(1000, sample(1:nrow(df), nrow(df), replace = TRUE))

apply(positions, 2, function(i) lm(yvar[i] ~ xvar[i], df)$coef)
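
The snippet above resamples individual rows. To resample whole counties instead, a minimal sketch might look like the following. It assumes the column names from the question (county_id, profit, employees). The simulated data frame and the names rows_by_county, boot_once and n_reps are hypothetical and only there to make the example run. The idea is to compute each county's row indices once with split(), so each replicate is just an index lookup rather than a repeated rbind.

```r
# Simulated stand-in for the asker's data (hypothetical values;
# column names follow the question).
set.seed(1)
data <- data.frame(
  county_id = rep(1:20, each = 5),
  firm_id   = 1:100,
  employees = rpois(100, 30)
)
data$profit <- 2 + 0.5 * data$employees + rnorm(100)

# Row indices of every county, computed once up front.
rows_by_county <- split(seq_len(nrow(data)), data$county_id)

# One bootstrap replicate: draw counties with replacement,
# collect all rows of the drawn counties, and refit the model.
boot_once <- function() {
  drawn <- sample(names(rows_by_county), replace = TRUE)
  idx <- unlist(rows_by_county[drawn], use.names = FALSE)
  coef(lm(profit ~ employees, data = data[idx, ]))
}

n_reps <- 200  # hypothetical; the question uses 10000
a <- t(replicate(n_reps, boot_once()))  # one row of coefficients per replicate
```

With a built this way, apply(a, 2, sd) gives the standard deviation of each coefficient across the replicates.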
Davide Passaretti