
In R, I have a data set with 30 categories (N = 30 clusters). The clusters contain unequal numbers of units (the i-th cluster can have 24, 25, 26, 27, or 28 units). I want to do two-stage sampling: first, select n clusters from the N; second, within each of the n selected clusters, randomly select 50% of the units.

example:

ls=list(4,c(1,1,1,1))
mstage(b,stage=list("cluster","cluster"),varnames=list("REGION","AREA"),
      size=ls, method=c("systematic","systematic"),pik=prob)

my case:

mstage(cs.bb2,stage=list("cluster",""), varnames=list("Team","Team"),
     size=list(12,c(?)),pik=list(rep(0.5,797)), method=list("srswor","srswor"))

My code above does not work. I do not know what to put as the second element of the "size=" argument. Any correction of my code, or an alternative solution for this two-stage sampling, is appreciated.
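
For reference, the design described above can be sketched in base R without mstage. This is only a minimal sketch under stated assumptions: `dat` is a made-up stand-in for cs.bb2, `two_stage_sample` is a hypothetical helper name, and "50%" is rounded up with `ceiling()` for clusters with an odd number of units (that rounding rule is my assumption, not something the design fixes).

```r
# Hypothetical data: 30 teams with 24 to 28 units each (stand-in for cs.bb2)
set.seed(1)
sizes <- sample(24:28, 30, replace = TRUE)
dat <- data.frame(Team = rep(seq_len(30), times = sizes),
                  unit = sequence(sizes))

# Hypothetical helper: two-stage sampling, SRSWOR at both stages
two_stage_sample <- function(data, cluster_var, n_clusters, frac = 0.5) {
  # Stage 1: simple random sample of clusters without replacement
  clusters <- unique(data[[cluster_var]])
  chosen <- clusters[sample.int(length(clusters), n_clusters)]
  # Stage 2: within each chosen cluster, sample ceiling(frac * size) rows
  rows <- unlist(lapply(chosen, function(cl) {
    idx <- which(data[[cluster_var]] == cl)
    idx[sample.int(length(idx), ceiling(frac * length(idx)))]
  }))
  data[rows, , drop = FALSE]
}

smp <- two_stage_sample(dat, "Team", n_clusters = 12)
table(smp$Team)  # units drawn per selected team
```

Because the second-stage sizes are computed only after the clusters are drawn, there is no need to know in advance which clusters will be selected, which is the part I could not express in "size=".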

example: https://www.rdocumentation.org/packages/sampling/versions/2.8/topics/mstage

Grace
  • Without knowing the structure of your data, or even which package `mstage` is from, it is difficult to provide meaningful advice. Maybe this can provide some guidance: in my answer here, I created a function to perform stratified sampling from a data frame. https://stackoverflow.com/questions/57924068/how-to-get-around-error-factor-has-new-levels-in-cross-validation-glm/57937180#57937180. This should be adaptable to help solve your problem. – Dave2e Nov 19 '19 at 13:58
  • Thanks, I updated your code to fit my case. – Grace Nov 19 '19 at 17:37

0 Answers