I am trying to bootstrap in R, but I cannot figure out the right code. I have a set of 30 data points, each either "yes" or "no". I was able to create 1000 bootstrap samples, and I want to retrieve the proportion of "yes" in each sample. Put another way, if you convert the data to binary ("yes" = 1 and "no" = 0), I want the mean of each bootstrap sample. I have tried both character and numeric data. Each time, the bootstrap returns 1000 values that are either 0 or 1/30. Clearly, there is no way that every bootstrapped sample contains only 0 or 1 "yes". I also can't figure out how to change the seed, so I keep getting the same numbers each time.

[count_yes <- function(d, i) {
  d2 <- d[i,]
  return(sum(d2$Empathy == "yes") / 30)
}

#create bootstrap
bootobject <- boot(CompassionateRatsFull, count_yes, R=1000)

fc <- function (data, indices) {
  return(mean(data[indices]) / 30)
}

bootobject <- boot(ComRatsNum, fc, R=1000)][1]
alan ocallaghan
  • Could you please provide the sample data you are working with? Why are you wrapping everything in square brackets? You can set a specific seed with `set.seed`. You should also list the packages you load; I guess here it is `boot`. – kath Oct 01 '19 at 06:48
  • In the second function, `fc`, I would have taken the mean of `data[indices, "Empathy"]`. Do stop naming your parameters `data`. As stated by kath, we need a data object. See [How to make a great reproducible example in R] – IRTFM Oct 01 '19 at 06:58
  • As @kath mentioned, please name the package you are using. You can also check for yourself whether your bootstrapping function has to be set to binary. You can see the function's documentation with `?boot`. – Christoffer Sannes Oct 01 '19 at 06:59
  • @kath I am using a data set provided by Lock 5 Data. It can be found here http://www.lock5stat.com/datapage.html and is called Compassionate Rats. It is a set of 30 binomials (x,y) where x is the gender and y is whether the rat performed the function in question. For the purposes of my code, I removed the Gender column because it is not needed. Therefore, the only column is "Empathy" and it contains a list of 30 "yes" or "no" data points. The naming and bracket conventions I have used are what I have found in every example of bootstrapping online. – anonymous42069 Oct 01 '19 at 14:56
  • You should have a look at https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example?rq=1. Share your data (`CompassionateRatsFull` and `ComRatsNum`) using `dput` and include the packages you are using. Please edit your question with further details. At the moment, your code as written does not work (mainly because of the square brackets...) – kath Oct 01 '19 at 15:19
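
Pulling the comments above together, here is a minimal sketch of how the bootstrap might look. It does not use the asker's actual data: `rats` and its 20/10 split are made-up stand-ins for `CompassionateRatsFull`, and `prop_yes` is a hypothetical name. It assumes the `boot` package and a single character column called `Empathy`. It illustrates two points: `mean()` of a logical vector is already the proportion, so no further division by 30 is needed, and calling `set.seed()` before `boot()` controls which resamples are drawn.

```r
library(boot)

# Hypothetical stand-in for the real data: 30 yes/no values in one column.
# The 20/10 split is invented purely for illustration.
rats <- data.frame(
  Empathy = rep(c("yes", "no"), times = c(20, 10)),
  stringsAsFactors = FALSE
)

# Statistic passed to boot(): proportion of "yes" among the resampled rows.
# Selecting the column by name returns a plain vector of the resampled values.
prop_yes <- function(d, i) {
  mean(d[i, "Empathy"] == "yes")
}

set.seed(42)  # change this number for a different, still reproducible, run
boot_out <- boot(rats, prop_yes, R = 1000)

boot_out$t0       # proportion of "yes" in the original sample
head(boot_out$t)  # first few bootstrap proportions
```

Indexing with `d[i, "Empathy"]` follows IRTFM's suggestion. With a one-column data frame, `d[i, ]` drops to a plain vector, after which `d2$Empathy` no longer works; `d[i, , drop = FALSE]` would be another way around that.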

0 Answers